As we have seen, sinusoids efficiently model spectral peaks over time, and filtered noise efficiently models the spectral residual left over after pulling out everything we want to call a ``tonal component'' characterized by a spectral peak that evolves over time. However, neither is good for abrupt transients in a waveform. At transients, one may retain the original waveform or some compressed version of it (e.g., MPEG-2 AAC with a short window [149]). Alternatively, one may switch to a transient model during transients. Transient models have included wavelet expansion [6] and frequency-domain LPC (time-domain amplitude envelope) [290].
In either case, a reliable transient detector is needed. This can raise deep questions regarding what a transient really is; for example, not everyone will perceive every transient as a transient, and so perceptual modeling gets involved. Missing a transient, e.g., in a ride-cymbal analysis, can create highly audible artifacts when the processing relies heavily on transient decisions. For greater robustness, hybrid schemes can be devised in which a continuous measure of ``transientness'' is defined, say between 0 and 1, in place of a binary transient decision.
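As a rough illustration of a continuous transientness measure, the following Python sketch maps the ratio of each frame's energy to a smoothed average of preceding frame energies onto the interval from 0 to 1. The energy-ratio criterion, the function name `transientness`, and all parameter values (frame length, hop size, smoothing factor, ratio thresholds) are assumptions made for this example; they are not taken from the references cited above, and a perceptually motivated detector would likely be more elaborate.

```python
import math


def transientness(x, frame_len=256, hop=128, smoothing=0.9,
                  ratio_lo=1.0, ratio_hi=8.0, eps=1e-12):
    """Return one transientness value in [0, 1] per analysis frame.

    Hypothetical measure: the frame energy is compared with a smoothed
    average of previous frame energies. Ratios at or below ratio_lo map
    to 0, ratios at or above ratio_hi map to 1, and ratios in between
    are interpolated linearly on a log scale.
    """
    values = []
    background = None  # smoothed energy of preceding frames
    log_span = math.log(ratio_hi) - math.log(ratio_lo)
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        energy = sum(s * s for s in frame)
        if background is None:
            background = energy  # first frame: nothing to compare against
        ratio = (energy + eps) / (background + eps)
        t = (math.log(ratio) - math.log(ratio_lo)) / log_span
        values.append(min(1.0, max(0.0, t)))  # clamp to [0, 1]
        # Update the background estimate after measuring the current frame.
        background = smoothing * background + (1.0 - smoothing) * energy
    return values
```

A value near 0 would leave a frame to the sinusoids-plus-noise model, a value near 1 would hand it to the transient representation, and intermediate values could weight the two. Only energy increases are scored here, so decays are not flagged.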
Also in either case, the sinusoidal model needs phase matching when switching to or from a transient frame over time (or cross-fading can be used, or both). Given sufficiently many sinusoids, phase-matching at the switching time should be sufficient without cross-fading.
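The two switching strategies can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions: each sinusoid has constant amplitude and frequency near the switching time, the cross-fade is linear, and the function names `phase_matched_sinusoid` and `crossfade` are hypothetical rather than drawn from any particular implementation.

```python
import math


def phase_matched_sinusoid(amp, omega, phase_at_switch, switch, num_samples):
    """Synthesize amp*cos(...) so its phase equals phase_at_switch at index `switch`.

    omega is the frequency in radians per sample. The phase at the
    switching time would typically be measured from the signal on the
    other side of the switch.
    """
    return [amp * math.cos(omega * (n - switch) + phase_at_switch)
            for n in range(num_samples)]


def crossfade(a, b, start, length):
    """Switch from signal a to signal b with a linear cross-fade.

    The output equals a before index `start`, equals b from index
    start + length onward, and is a linearly weighted mix in between.
    """
    if len(a) != len(b):
        raise ValueError("signals must have equal length")
    if length < 0:
        raise ValueError("length must be nonnegative")
    out = []
    for n in range(len(a)):
        if n < start:
            g = 0.0
        elif n >= start + length:
            g = 1.0
        else:
            g = (n - start) / length  # ramps from 0 toward 1
        out.append((1.0 - g) * a[n] + g * b[n])
    return out
```

With `length` set to 0, `crossfade` reduces to a hard switch at `start`, which corresponds to relying on phase matching alone.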