
STFT of COLA Decomposition

To represent practical FFT implementations, it is preferable to shift the $m$th frame back to the time origin:

$\displaystyle {\tilde x}_m(n) \isdef x_m(n+mR) \eqsp \hbox{\sc Shift}_{-mR,n}(x_m)$ (9.20)

This is summarized in Fig. 8.11. Zero-based frames are needed because the leftmost input sample is assigned to time zero by FFT algorithms. In other words, a hopping FFT effectively redefines time zero on each hop. Thus, a practical STFT is a sequence of FFTs of the zero-based frames ${\tilde x}_0, {\tilde x}_1, \ldots$. On the other hand, papers in the literature (such as [7,9]) work with the fixed time-origin case ($x_0, x_1, \ldots$). Since they differ only by a time shift, it is not hard to translate back and forth.


\begin{psfrags}
% latex2html id marker 22379
\psfrag{x}{$x$}
\psfrag{Zero-centered 3rd frame x_3: M = 64, R = M/2}{\normalsize Zero-centered 3rd frame $x_3$: $M = 64$, $R = M/2$}
\psfrag{x_3}{$x_3$} % doesn't work
\psfrag{xtilde_3}{${\tilde x}_3$}
\begin{figure}[htbp]
\includegraphics[width=\twidth]{eps/shiftwin}
\caption{Input signal $x$\ (top), third frame
$x_3$\ in its natural time location (middle), and the third frame
shifted to time 0, ${\tilde x}_3$\ (bottom).}
\end{figure}
\end{psfrags}
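
As a minimal sketch of the zero-based framing described above, the following NumPy code extracts the frames ${\tilde x}_m$ directly from the input signal. It assumes the frame definition $x_m(n) = x(n)\,w(n-mR)$ with a causal length-$M$ window $w$ (an assumption not stated on this page); the function name `zero_based_frames`, the Hann window, and the random test signal are purely illustrative.

```python
import numpy as np

def zero_based_frames(x, w, R):
    """Return the windowed frames of x, each shifted back to time zero.

    Row m holds xtilde_m(n) = x(n + m*R) * w(n) for n = 0, ..., M-1,
    assuming a causal length-M window w (nonzero only on 0..M-1).
    """
    M = len(w)
    num_frames = 1 + (len(x) - M) // R   # keep only complete frames
    return np.stack([x[m * R : m * R + M] * w for m in range(num_frames)])

M, R = 64, 32                            # frame length and hop size (R = M/2)
x = np.random.default_rng(0).standard_normal(4 * M)   # illustrative input
w = np.hanning(M)                        # illustrative window choice
frames = zero_based_frames(x, w, R)      # one zero-based frame per row

# A practical STFT is then one FFT per zero-based frame.
stft = np.fft.fft(frames, axis=1)
```

Each row of `frames` starts at index 0 regardless of where the frame sat in the original signal, which is exactly the time origin an FFT routine assumes.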

Note that we may sample the DTFT of both $ x_m$ and $ {\tilde x}_m$ , because both are time-limited to $ M$ nonzero samples. The minimum information-preserving sampling interval along the unit circle in both cases is $ \Omega_M \isdeftext 2\pi/M$ . In practice, we often oversample to some extent, using $ \Omega_N$ with $ N>M$ instead. For $ {\tilde x}_m$ , we get

\begin{eqnarray*}
{\tilde X}_m(\omega_k) &\isdef & \hbox{\sc Sample}_{\Omega_N,k}\left(\hbox{\sc DTFT}\left({\tilde x}_m\right)\right)\\
&=& \hbox{\sc DFT}_{N,k}({\tilde x}_m), \qquad \hbox{(9.21)}
\end{eqnarray*}

where $ \omega_k \isdef 2\pi k/N = k\Omega_N$ . For $ x_m$ we have

\begin{eqnarray*}
X_m(\omega_k) &\isdef & \hbox{\sc Sample}_{\Omega_N,k}\left(\hbox{\sc DTFT}(x_m)\right)\\
&\longleftrightarrow& \hbox{\sc Alias}_N(x_m)\\
&\neq& {\tilde x}_m \; \hbox{(in general).}
\end{eqnarray*}
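
The two cases can be checked numerically. The sketch below evaluates the DTFT by direct summation at the frequencies $\omega_k$, then confirms that (a) for a zero-based frame the samples equal the zero-padded length-$N$ DFT, and (b) for a frame left in its natural location the inverse DFT of the samples is the frame time-aliased modulo $N$. The sizes $M=8$, $N=16$, $R=4$, $m=3$, the random frame, and the helper `dtft` are all illustrative assumptions.

```python
import numpy as np

M, N, R, m = 8, 16, 4, 3          # illustrative sizes, with N > M
rng = np.random.default_rng(0)
frame = rng.standard_normal(M)    # the M nonzero samples of the frame

def dtft(sig, start, omega):
    """DTFT of a finite signal whose first sample sits at time index `start`."""
    n = start + np.arange(len(sig))
    return np.exp(-1j * np.outer(omega, n)) @ sig

omega_k = 2 * np.pi * np.arange(N) / N

# (a) Zero-based frame: the sampled DTFT equals the zero-padded length-N DFT.
Xt = dtft(frame, 0, omega_k)
assert np.allclose(Xt, np.fft.fft(frame, N))

# (b) Frame in its natural location (first sample at n = m*R): the inverse DFT
# of the sampled DTFT is the frame wrapped modulo N, not the zero-based frame.
X = dtft(frame, m * R, omega_k)
aliased = np.zeros(N)
np.add.at(aliased, (m * R + np.arange(M)) % N, frame)
assert np.allclose(np.fft.ifft(X), aliased)
```

With these sizes the frame occupies times 12 through 19, so its last four samples wrap around to indices 0 through 3 of the length-16 buffer.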

Since $ {\tilde x}_m = \hbox{\sc Shift}_{-mR}(x_m)$ , their transforms are related by the shift theorem:

\begin{eqnarray*}
{\tilde X}_m(\omega_k) &=& e^{jmR\omega_k} X_m(\omega_k) \\
\longleftrightarrow\quad
{\tilde x}_m(n) &=& \hbox{\sc Alias}_{N,n+mR}(x_m)\\
&=& x_m(n+mR)_N
\end{eqnarray*}

where $ (n+mR)_N$ denotes modulo $ N$ indexing (appropriate since the DTFTs have been sampled at intervals of $ \Omega_N = 2\pi/N$ ).
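
The shift-theorem relation and its modulo-$N$ time-domain counterpart can be verified with a short NumPy sketch. As before, the sizes and the random frame are illustrative assumptions, and the sampled DTFT of $x_m$ is computed by direct summation rather than by any particular library routine.

```python
import numpy as np

M, N, R, m = 8, 16, 4, 3          # illustrative sizes, with N > M
rng = np.random.default_rng(1)
frame = rng.standard_normal(M)    # the M nonzero samples of the frame
omega_k = 2 * np.pi * np.arange(N) / N

# DFT of the zero-based frame xtilde_m (zero-padded to length N).
Xt = np.fft.fft(frame, N)

# Sampled DTFT of x_m, whose nonzero samples start at time n = m*R.
n_nat = m * R + np.arange(M)
X = np.exp(-1j * np.outer(omega_k, n_nat)) @ frame

# Shift theorem: Xtilde_m(w_k) = exp(j*m*R*w_k) * X_m(w_k).
shift_ok = np.allclose(Xt, np.exp(1j * m * R * omega_k) * X)

# Time domain: read the aliased x_m with modulo-N indexing at n + m*R.
x_alias = np.fft.ifft(X).real
xt = x_alias[(np.arange(N) + m * R) % N]
```

Here `xt[:M]` recovers the original frame samples and `xt[M:]` is zero, so the zero-based frame is obtained from the aliased natural-location frame by a circular shift of $mR$ samples.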



``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA),   Stanford University