The Fourier duality between the overlap-add and filter-bank-summation
interpretations of the short-time Fourier transform (STFT, discussed in
Chapter 9) appeared in the late 1970s [7,9]. This
unification of downsampled filter banks and FFT processors spawned
a considerable literature on STFT processing
[158,8,218,192,98,191].
The phase vocoder is normally regarded as a fixed bandpass
filter bank. The STFT, in contrast, is usually regarded as a
time-ordered sequence of overlapping FFTs (the ``overlap-add''
interpretation). Generally speaking, sound reconstruction by STFT
during this period was nonparametric. A relatively exotic example was
signal reconstruction from STFT magnitude data (*magnitude-only
reconstruction*)
[103,192,218,20].

In the speech-modeling world, parametric sinusoidal modeling of the STFT apparently originated in the context of the magnitude-only reconstruction problem [220].

Since the phase vocoder was in use for measuring amplitude and
frequency envelopes for additive synthesis no later than
1977,^{G.12} it is natural to expect that parametric ``inverse FFT synthesis'' from
sinusoidal parameters would have begun by that time. Instead,
however, traditional banks of sinusoidal (and more general wavetable)
oscillators remained in wide use for many more years. Inverse FFT
synthesis of sound was apparently first published in
1980 [35]. Thus, parametric reductions of STFT data (in
the form of instantaneous amplitude and frequency envelopes of vocoder
filter-channel data) were in use in the 1970s, but we were not yet
resynthesizing sound by STFT using spectral buffers synthesized from
parameters.

Copyright ©

Center for Computer Research in Music and Acoustics (CCRMA), Stanford University