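The Short-Time Fourier Transform (STFT) is defined as a time-ordered sequence of DTFTs, and is implemented in practice as a sequence of FFTs (see §7.1). Thus, the signal basis functions are naturally defined as the DFT sinusoids multiplied by time-shifted windows, suitably normalized for unit $L_2$ norm:

$$\varphi_{mk}(n) = \frac{w(n-mR)\,e^{j\omega_k n}}{\sqrt{\sum_n |w(n)|^2}}, \qquad \omega_k = \frac{2\pi k}{N}, \quad k = 0, 1, \ldots, N-1,$$

where $w$ is the window, $m$ is the time-frame index, $R$ is the hop size, and $N$ is the DFT length.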
When successive windows overlap (i.e., the hop size $R$ is less than the window length $M$), the basis functions are not orthogonal. In this case, we may say that the basis set is overcomplete.
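The loss of orthogonality under overlap can be checked numerically. The following is a minimal NumPy sketch, not a definitive implementation: it assumes a finite signal length `L`, a rectangular window, and the hypothetical helper names `stft_basis` and `gram_error`, and it writes `R` for the hop size, `M` for the window length, and `N` for the DFT length. It builds the unit-norm windowed sinusoids as rows of a matrix and measures how far their Gram matrix is from the identity.

```python
import numpy as np

def stft_basis(w, R, N, L):
    """Rows are unit-norm signals w(n - mR) * exp(j*2*pi*k*n/N) on 0 <= n < L."""
    M = len(w)
    n = np.arange(L)
    rows = []
    for start in range(0, L - M + 1, R):       # frame m starts at n = m*R
        wm = np.zeros(L)
        wm[start:start + M] = w                # time-shifted window
        for k in range(N):                     # DFT bin number
            phi = wm * np.exp(2j * np.pi * k * n / N)
            rows.append(phi / np.linalg.norm(phi))
    return np.array(rows)

def gram_error(B):
    """Largest deviation of the Gram matrix B B^H from the identity."""
    G = B @ B.conj().T
    return np.max(np.abs(G - np.eye(len(B))))

L, M = 16, 8
rect = np.ones(M)                              # rectangular window (assumption)

B_orth = stft_basis(rect, R=8, N=8, L=L)       # R = M = N: no overlap
B_over = stft_basis(rect, R=4, N=8, L=L)       # R < M: 50% overlap

print(B_orth.shape, gram_error(B_orth))        # square set, error near machine precision
print(B_over.shape, gram_error(B_over))        # more vectors than dimensions, large error
print(np.linalg.matrix_rank(B_over))           # the overlapped set still spans all L dimensions
```

With overlap, the set contains more vectors (24) than the signal space has dimensions (16) while still spanning it, which is what overcomplete means here.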
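The basis signals are orthonormal when $R = M = N$ (where $R$, $M$, and $N$ denote the hop size, window length, and DFT length, respectively) and the rectangular window is used. That is, two rectangularly windowed DFT sinusoids are orthogonal when either the frequency bin numbers or the time frame numbers differ, provided that the window length $M$ equals the number of DFT frequencies $N$ (no zero padding). In other words, we obtain an orthogonal basis set in the STFT when the hop size, window length, and DFT length are all equal (in which case the rectangular window must be used to retain the perfect-reconstruction property). In this case, we can write

$$x = \sum_{m} \sum_{k=0}^{N-1} \langle x, \varphi_{mk} \rangle\, \varphi_{mk},$$

where $\varphi_{mk}$ denotes the unit-norm basis signal for time frame $m$ and frequency bin $k$.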
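As a numerical check of the orthonormal case, the sketch below assumes a real test signal whose length is a whole number of frames and uses NumPy's unitary FFT scaling (`norm="ortho"`). With hop size, window length, and DFT length all equal to `N` and a rectangular window, the inner product of the signal with each basis signal reduces to a length-`N` DFT of one non-overlapping frame divided by the square root of `N`. Summing the coefficients times the basis signals is then the matching inverse DFT. The variable names are illustrative only.

```python
import numpy as np

N = 8                                    # hop size = window length = DFT length
rng = np.random.default_rng(0)
x = rng.standard_normal(4 * N)           # test signal: four non-overlapping frames

# Rectangular window with hop N: frame m is simply x[m*N:(m+1)*N].
frames = x.reshape(-1, N)

# Expansion coefficients, one row per frame m, one column per bin k.
coeffs = np.fft.fft(frames, axis=1, norm="ortho")

# Synthesis: sum of coefficients times basis signals, frame by frame.
x_hat = np.fft.ifft(coeffs, axis=1, norm="ortho").reshape(-1)

recon_error = np.max(np.abs(x_hat - x))
energy_gap = abs(np.sum(np.abs(coeffs) ** 2) - np.sum(x ** 2))
print(recon_error, energy_gap)           # both should be near machine precision
```

Exact reconstruction together with energy preservation (Parseval) is what one expects from an expansion in an orthonormal basis.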
In the overcomplete case, we get a special case of weighted