TSM by Resampling STFTs Across Time

In view of Chapter 8, a natural implementation of TSM based on the STFT is as follows:

- Perform a short-time Fourier transform (STFT) using hop size $R$. Denote the STFT at frame $m$ and bin $k$ by $X_m(\omega_k)$, and denote the result of TSM processing by $Y_m(\omega_k)$.
- To perform TSM by the factor $\alpha$, advance the ``frame pointer'' by $R/\alpha$ samples during resynthesis instead of the usual $R$ samples.
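
The frame-pointer schedule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frame_pointer_positions` and its parameter names are hypothetical, positions are expressed in units of analysis frames (multiply by the hop size to get samples), and the pointer is assumed to advance by `1/alpha` frames per output frame, where `alpha` is the TSM factor.

```python
def frame_pointer_positions(num_analysis_frames, alpha):
    """Return the fractional analysis-frame positions visited during resynthesis.

    Assumption (not from the text): output frame m reads the analysis data
    at position m / alpha, measured in analysis frames.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    positions = []
    m = 0
    # Stop once the pointer would pass the last available analysis frame.
    while m / alpha <= num_analysis_frames - 1:
        positions.append(m / alpha)
        m += 1
    return positions


# With alpha = 2 (slow-down), every other output frame falls half way
# between two analysis frames.
positions = frame_pointer_positions(3, 2.0)
```

With three analysis frames and `alpha = 2`, the pointer visits five positions, so the output has roughly twice as many frames as the input.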

For example, if $\alpha = 2$ ($2\times$ slow-down), the first STFT frame $X_0$ is processed normally, so that $Y_0 = X_0$. However, the second output frame $Y_1$ corresponds to a time $R/2$, half way between the first two frames. This output frame may be created by *interpolating* (across time) the STFT magnitude spectra of the first two frames. For example, using simple linear interpolation gives

$$|Y_1(\omega_k)| = \frac{1}{2}\,|X_0(\omega_k)| + \frac{1}{2}\,|X_1(\omega_k)| \tag{11.23}$$

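where the phase is chosen to preserve continuity and/or the amplitude envelope from frame to frame under the overlap-add (more on this below). Generalizing to arbitrary TSM factors $\alpha$, we obtain

$$|Y_m(\omega_k)| = (1-\eta_m)\,|X_{\lfloor m/\alpha \rfloor}(\omega_k)| + \eta_m\,|X_{\lfloor m/\alpha \rfloor + 1}(\omega_k)| \tag{11.24}$$

where $\eta_m = m/\alpha - \lfloor m/\alpha \rfloor$, and the analysis position $m/\alpha$ is advanced by $1/\alpha$ each frame-step.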
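
As a rough sketch of the magnitude interpolation described above, the following Python function linearly interpolates magnitude spectra between adjacent analysis frames. The name `interpolate_magnitudes`, the list-of-lists data layout, and the clamping at the final frame are assumptions made for illustration. It also assumes that output frame `m` sits at fractional analysis position `m / alpha`. Only magnitudes are computed; choosing the phase of each output frame is a separate problem that this sketch does not address.

```python
import math


def interpolate_magnitudes(mags, alpha):
    """Linearly interpolate STFT magnitude spectra across time.

    mags  : list of analysis frames, each a list of bin magnitudes
    alpha : TSM factor (alpha = 2 gives about twice as many output frames)

    Assumption (not from the text): the interpolation weight is the
    fractional part of m / alpha, and the last frame is repeated when
    the upper neighbor would fall past the end of the data.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    last = len(mags) - 1
    out = []
    m = 0
    while m / alpha <= last:
        pos = m / alpha               # fractional analysis-frame position
        lo = math.floor(pos)          # frame at or before the position
        eta = pos - lo                # weight given to the following frame
        hi = min(lo + 1, last)        # clamp at the final analysis frame
        out.append([(1 - eta) * a + eta * b
                    for a, b in zip(mags[lo], mags[hi])])
        m += 1
    return out


# Two analysis frames with two bins each, slowed down by a factor of 2.
stretched = interpolate_magnitudes([[1.0, 0.0], [3.0, 2.0]], 2.0)
```

For `alpha = 2`, the middle output frame is the average of the two input frames, matching the half-way case worked out in the text.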

In general, TSM methods based on STFT modification are classified as
``vocoder'' type methods (§G.5). Thus, the TSM
implementation outlined above may be termed a
*weighted overlap-add (WOLA) phase-vocoder*
method.

Copyright ©

Center for Computer Research in Music and Acoustics (CCRMA), Stanford University