Next  |  Prev  |  Up  |  Top  |  Index  |  JOS Index  |  JOS Pubs  |  JOS Home  |  Search


Multiresolution STFT

Figure 7.4 shows a multiresolution STFT for the same speech signal that was analyzed to produce Fig.7.2. The bandlimits in Hz for the five combined FFTs were $ [0,80,500,1250,2540,4050,(10000,22050)]$ , where the last two (in parentheses) were not used due to the signal sampling rate being only $ 8$ kHz. The corresponding window lengths in milliseconds were $ [64,32,16,8,4,(2,2)]$ , where, again, the last two are not needed for this example. Our hop size is chosen to be 1 ms, giving 75% overlap in the highest-frequency channel, and more overlap in lower-frequency channels. Thus, all frequency channels are oversampled along the time dimension. Since many frequency channels from each FFT will be combined via smoothing to form the ``excitation pattern'' (see next section), temporal oversampling is necessary in all channels to avoid uneven weighting of data in the time domain due to the hop size being too large for the shortened effective time-domain windows.

Figure 7.4: Multiresolution Short-Time Fourier Transform (MRSTFT). Compare to the fixed-resolution STFT in Fig.7.2.
\includegraphics[width=\twidth]{eps/lsg-mrstft}


Next  |  Prev  |  Up  |  Top  |  Index  |  JOS Index  |  JOS Pubs  |  JOS Home  |  Search

[How to cite this work]  [Order a printed hardcopy]  [Comment on this page via email]

``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA),   Stanford University
CCRMA