FFT versus Direct Convolution

Using the Matlab test program in
[#!MDFT!#],^{9.1} FFT convolution was found to be faster than direct convolution
starting at length
(looking only at powers of 2 for the
length
).^{9.2} FFT convolution was also never
significantly slower at shorter lengths, for which "calling overhead"
dominates.

Running the same test program in 2011,^{9.3} FFT convolution using the
`fft` function was found to be faster than `conv` for
*all* (power-of-2) lengths. The speed of FFT convolution divided
by that of direct convolution started out at 14 for
, fell to a
minimum of
at
, above which it started to climb as
expected, reaching
at
. Note that this
comparison is unfair because the Octave `fft` function is a
dynamically linked, separately compiled module, while `conv` is
written in the Matlab language and thus suffers more overhead from the
Matlab interpreter.
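
As a rough illustration of the two methods compared above, the following Python sketch implements direct convolution and FFT convolution and checks that they give the same result. It is not the Matlab test program cited above, and it is not meant for timing: the function names `direct_conv`, `fft_radix2`, and `fft_conv` are hypothetical, and the plain recursive radix-2 FFT is used only to keep the example self-contained.

```python
import cmath

def direct_conv(x, h):
    """Direct (time-domain) linear convolution: len(x)*len(h) multiplies."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

def fft_radix2(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of 2."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_radix2(a[0::2], inverse)
    odd = fft_radix2(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_conv(x, h):
    """Linear convolution of real sequences via zero-padded FFTs."""
    ny = len(x) + len(h) - 1
    nfft = 1
    while nfft < ny:          # next power of 2 that holds the full result
        nfft *= 2
    X = fft_radix2(list(x) + [0.0] * (nfft - len(x)))
    H = fft_radix2(list(h) + [0.0] * (nfft - len(h)))
    Y = [Xk * Hk for Xk, Hk in zip(X, H)]   # multiply spectra pointwise
    y = fft_radix2(Y, inverse=True)
    return [v.real / nfft for v in y[:ny]]  # normalize the inverse FFT

# Arbitrary example data (assumed, not from the cited test program)
x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, -1.0, 0.5, 0.25]
y_direct = direct_conv(x, h)
y_fft = fft_conv(x, h)
max_err = max(abs(a - b) for a, b in zip(y_direct, y_fft))
```

The zero-padding to at least `len(x) + len(h) - 1` samples is what turns the cyclic convolution computed by the FFT into the linear convolution computed by the direct method.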

An analysis reported in Strum and Kirk [#!StrumAndKirk!#, p. 521],
based on the number of real multiplies, predicts that the `fft`
is faster starting at length
, and that direct convolution is
significantly faster for very short convolutions (*e.g.*, 16 operations
for a direct length-4 convolution, versus 176 for the `fft`
function).
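
The multiply counts quoted above can be reproduced with a simple counting model, sketched below in Python. The model is an assumption, not taken from the cited text: direct convolution of two length-N sequences is charged N² real multiplies, while FFT convolution is charged three radix-2 FFTs of length 2N (two forward, one inverse) at (2N/2)·log2(2N) complex multiplies each, plus 2N complex multiplies for the spectral product, with four real multiplies per complex multiply. The function names are hypothetical.

```python
def direct_real_multiplies(n):
    """Assumed model: n*n real multiplies for direct length-n convolution."""
    return n * n

def fft_real_multiplies(n):
    """Assumed model: three length-2n radix-2 FFTs plus a 2n-point
    spectral product, at 4 real multiplies per complex multiply."""
    nfft = 2 * n
    log2_nfft = nfft.bit_length() - 1      # exact when nfft is a power of 2
    complex_mults = 3 * (nfft // 2) * log2_nfft + nfft
    return 4 * complex_mults

def crossover_length(max_log2=20):
    """Smallest power-of-2 length at which the FFT count is lower."""
    for p in range(max_log2 + 1):
        n = 2 ** p
        if fft_real_multiplies(n) < direct_real_multiplies(n):
            return n
    return None

counts_at_4 = (direct_real_multiplies(4), fft_real_multiplies(4))
n_cross = crossover_length()
```

Under these assumptions `counts_at_4` comes out to 16 and 176, matching the figures quoted above. The crossover length returned by `crossover_length()` depends on the counting assumptions, so it should be read as a property of this model rather than as the published result.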

See [#!MDFT!#]^{9.4} for further discussion of FFT algorithms and their applications.

In digital audio, FIR filters are often hundreds of taps long. For such filters, FFT convolution is much faster than direct time-domain convolution on a single CPU. On GPUs, FFT convolution is faster than direct convolution only for much longer FIR filters (thousands of taps [#!LauriEtAlAES11!#]), because massively parallel hardware can perform an $O(N^2)$ algorithm (direct convolution) faster than a single CPU can perform an $O(N \log N)$ algorithm (FFT convolution).


Copyright ©

Center for Computer Research in Music and Acoustics (CCRMA), Stanford University