The Short-Time Fourier Transform (STFT) and Time-Frequency Displays


Often we simply want to display sound as a spectrum that evolves through time. We know that this is what the brain ``sees'' when we hear sound. The classic spectrogram, developed at Bell Telephone Laboratories during World War II, has been used for decades to display the short-time spectrum of sound. There are even people who can ``read'' a spectrogram of speech. In Chapter 7, the classic spectrogram is reviewed, and the development of more refined ``loudness spectrograms'', based on psychoacoustic research in loudness perception, is discussed. These more refined spectrograms come closer to the goal of ``what you see is what you hear''.

Since the proliferation of digital computers, spectrograms have been computed using the Short-Time Fourier Transform (STFT), which is simply a sequence of FFTs over time, each taken over a short windowed segment (frame) of the signal. In Chapter 7, the STFT is introduced.
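
As a rough illustration of the ``sequence of FFTs'' idea, the following Python sketch computes an STFT with NumPy and converts its magnitude to decibels, which is the quantity a basic spectrogram displays. The function name `stft`, the Hann window, the frame length of 256 samples, the hop size of 128 samples, and the 1 kHz test tone are all assumptions chosen for this example; they are not taken from the text.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Sketch of an STFT: one FFT per windowed frame of x.

    Returns a complex array of shape (num_frames, frame_len // 2 + 1).
    The Hann window and the default sizes are illustrative assumptions.
    """
    window = np.hanning(frame_len)
    # Number of full frames that fit in the signal (no zero-padding at the end).
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([
        x[m * hop : m * hop + frame_len] * window
        for m in range(num_frames)
    ])
    # Real-input FFT of each frame (rows = time, columns = frequency bins).
    return np.fft.rfft(frames, axis=1)

# Hypothetical test signal: one second of a 1 kHz sinusoid sampled at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)

X = stft(x)
# Magnitude in dB; the small offset avoids taking the log of zero.
spectrogram_db = 20 * np.log10(np.abs(X) + 1e-12)

# Frequency of the strongest bin in the first frame.
peak_bin = int(np.argmax(np.abs(X[0])))
peak_hz = peak_bin * fs / 256
```

Plotting `spectrogram_db` as an image, with time along one axis and frequency along the other, would give a basic spectrogram of the test tone. The frame length trades time resolution against frequency resolution, so other choices may suit other signals better.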


``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University