When interpolating the STFT across time for TSM, it is straightforward to interpolate spectral magnitude, as we saw above. Interpolating spectral phase, on the other hand, is tricky, because there's no exact way to do it . There are two conflicting desiderata when deciding how to continue phase from one frame to the next: To satisfy condition (1), it is necessary to replace the original phase of each frame by the phase corresponding to smooth continuation across time from the previous frame (which is generally an interpolated frame) for each FFT bin. Altering the phase of a spectral frame changes its amplitude envelope in the time domain. Thus, it may no longer looks like a windowed signal segment. Using the WOLA framework (§8.6) helps because the post-window guarantees a smooth cross-fade from frame to frame. Random amplitude-modulation distortion is generally heard as reverberation, also called phasiness .
When condition (2) is violated, the signal frame suffers dispersion in the time domain. For steady-state signals (filtered noise and/or steady tones), temporal dispersion should not be audible, while frames containing distinct pulses will generally become more ``smeared out'' in time.
It is not possible in general to satisfy both conditions (1) and (2) simultaneously, but either can be satisfied at the expense of the other. Generally speaking, ``transient frames'' should emphasize condition (2), allowing the WOLA overlap-add cross-fade to take care of the phase discontinuity at the frame boundaries. For stationary segments, phase continuation, preserving condition (1), is more valuable.
It is often sufficient to preserve relative phase across FFT bins (i.e., satisfy condition (2)) only along spectral peaks and their immediate vicinity [142,143,141,138,215,238].