Restricting Aliasing to Stop-Bands

To eliminate the relatively heavy transition-band aliasing (when
critically sampling the channel signals), we can define
*overlapping bands* such that each band encompasses the
transition bands on either side. However, unless a full
oversampling is provided for each band (which is one easy solution),
the bandwidth (in bins) is no longer a power of two, thereby thwarting
use of radix-2 inverse FFTs to compute the time-domain band signals.

To keep the channel bandwidths at powers of two while restricting
aliasing to stop-band energy, the IFFT bands can be widened to
*include* transition bands on either side. That is, the desired
pass-band *plus* the two transition bands span a power-of-two
bins. This results in overlapping channel IFFTs.
Figure 10.38 shows how the example of Fig. 10.34 is modified by
this strategy.

The basic principle of filter-bank band allocation is to enclose each
filter band plus its transition bands within a wider band that is a
power-of-two bins wide.^{11.23} The band should roll off to reach its
stop-band at the edge of the wider encompassing band. It is fine to
have extra space in the wider band, and this may be filled with a
continuation of the enclosed band's stop-band response (or some
tapering of it; since we assume stop-band energy is negligible, the
difference should be inconsequential). The desired bands may overlap
each other by any amount, and may have any desired shape. The
encompassing bands then overlap further to reach the next power of two
(in width) for each overlapping extended band. (See the gammatone and
gammachirp filter banks for examples of heavily overlapping bands in
an audio filter bank [111].)
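
As a rough illustration of this allocation rule, the following Python sketch encloses a pass-band and its two transition bands in a band that is a power-of-two samples wide. The function names `next_pow2` and `enclose_band` are hypothetical. The placement convention is also an assumption made here for illustration: the lower transition band sits flush with the lower edge of the encompassing band, and the band is slid downward if it would run past the top of the spectrum. Any placement that contains the pass-band and both transitions would serve equally well.

```python
def next_pow2(n):
    """Return the smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def enclose_band(lo, hi, trans, nfft):
    """Enclose pass-band [lo, hi] plus `trans` transition samples per side
    in a power-of-two-wide band.

    Indices are 1-based spectral samples in 1..nfft (an assumed convention).
    Returns (elo, ehi), the edges of the encompassing band.
    """
    if lo - trans < 1 or hi + trans > nfft:
        raise ValueError("pass-band plus transitions must fit in 1..nfft")
    needed = (hi - lo + 1) + 2 * trans      # pass-band + two transition bands
    width = next_pow2(needed)               # widen to a power of two
    if width > nfft:
        raise ValueError("encompassing band exceeds the FFT size")
    # Assumed placement: lower transition flush with the lower edge,
    # slid down only as far as needed to stay inside the spectrum.
    elo = min(lo - trans, nfft - width + 1)
    return elo, elo + width - 1


# A 32-sample pass-band with 7-sample transitions in a 256-sample spectrum:
print(enclose_band(32, 63, 7, 256))   # (25, 88), a 64-sample band
```

The extra space beyond the upper transition band (samples 71 to 88 in this call) is where the stop-band response is simply continued.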

In this approach, pass-bands of arbitrary width are embedded in overlapping IFFT bands that are a power of two wide. As a result of this flexibility, the frequency-rotation trick of §10.7.7 is no longer needed for real filter banks. Instead, we simply allocate any desired bands between dc and half the sampling rate, and then conjugate symmetry dictates the rest. In addition to a left-over "dc-Nyquist" band, there is a similar residual "Nyquist-limit" band (a typically negligible band about half the sampling rate). In other words, since the pass-bands may be any width and the encompassing IFFT bands may overlap by any amount, they do not have to "pack" conveniently as power-of-two blocks.

The minimum channel bandwidth is defined as two transition bands plus
one bin (*i.e.*, the minimum pass-band width is zero, corresponding to one
bin, or one spectral sample). For the Dolph-Chebyshev window, the
transition bandwidth is known in closed form [155]. In our
examples, we have a length 127 window with 80 dB stop-band attenuation
in the lowpass prototype [`chebwin(127,80)`], corresponding to
a transition width of 6.35 bins in a length 256 FFT, which was rounded
up to 7 bins in software for simplicity of band allocation.
Therefore, our minimum channel bandwidth is 15 bins (two transition
bands plus one sample for the band center). The next higher power of two
is 16, so that is our minimum encompassing IFFT length for any band.
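
The arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name `min_encompassing_length` is hypothetical, and the transition width of 6.35 bins is taken from the text rather than computed from the window.

```python
import math


def min_encompassing_length(transition_bins):
    """Return (rounded transition width, minimum channel bandwidth,
    minimum power-of-two IFFT length), all in bins."""
    t = math.ceil(transition_bins)               # round up to whole bins
    min_bw = 2 * t + 1                           # two transitions + one center bin
    ifft_len = 1 << (min_bw - 1).bit_length()    # smallest power of two >= min_bw
    return t, min_bw, ifft_len


print(min_encompassing_length(6.35))   # (7, 15, 16)
```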

The dc and Nyquist channels are combined into a single channel
containing the left-over residual filter-bank response consisting of a
low transition down from dc and a high-frequency transition up to the
sampling rate (in the complex-signal case). When the FFT size `N` is
sufficiently large so that these bands contain no audible energy, they
may be discarded. We include them in all examples here so as to
preserve the (near) perfect reconstruction property of the filter
bank. Thus, the 7-bin dc channel is combined with the 7-bin Nyquist
channel to form a single 16-bin encompassing residual band that may be
discarded in many audio applications (when the initial FFT size is
sufficiently large for the sampling rate used).

In the example of Fig. 10.38, the
initial FFT size is 256, and the channel bandwidths (pass-bands only,
excluding transitions), from top to
bottom, are 121, 64, 32, 16, and 8 bins. The top band is reduced by 7
bins to leave a transition band to the sampling rate. Similarly, the
lowest band lies above a transition band consisting of bins 0-6. The
encompassing IFFTs (containing transitions) have lengths 256, 128, 64, 32, and 32 for the interior
bands, and a length 32 IFFT handles the dc and Nyquist bands (which
are combined into a single 14-bin band about dc that occupies 28
bins when the transition bands are appended). Letting [lo,hi] denote a
band by its lower and upper bin limits, the non-overlapping adjacent
pass-band edges in "spectral samples"^{11.24} of the interior bands are [8, 15],
[16, 31], [32, 63], [64, 127], and [128, 248]; the overlapping
encompassing IFFT band edges are then [1, 32], [9, 40], [25, 88], [57,
184], [1, 256], *i.e.*, they each contain a pass-band and two transition
bands, and have a power-of-2 length. The downsampling factor for each
channel can be computed as the initial FFT size (256) divided by the
IFFT size (256, 128, 64, 32, or 32, from top to bottom) for the channel.
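
The band edges quoted above can be checked mechanically. The following Python sketch (variable names are hypothetical; the numbers are those given in the text) derives each IFFT length and downsampling factor from the listed edges, and confirms that every encompassing band is a power of two wide and contains its pass-band plus a 7-sample transition on each side.

```python
NFFT = 256    # initial FFT size
TRANS = 7     # transition width in bins, rounded up from 6.35

# [lo, hi] edges in spectral samples, lowest band first, as listed in the text
pass_bands = [(8, 15), (16, 31), (32, 63), (64, 127), (128, 248)]
ifft_bands = [(1, 32), (9, 40), (25, 88), (57, 184), (1, 256)]

ifft_lengths = []
factors = []
for (plo, phi), (elo, ehi) in zip(pass_bands, ifft_bands):
    n = ehi - elo + 1                     # encompassing IFFT length
    assert n & (n - 1) == 0               # must be a power of two
    assert elo <= plo - TRANS             # room for the lower transition band
    assert phi + TRANS <= ehi             # room for the upper transition band
    ifft_lengths.append(n)
    factors.append(NFFT // n)             # downsampling factor for the channel

print(ifft_lengths)   # [32, 32, 64, 128, 256]
print(factors)        # [8, 8, 4, 2, 1]
```

The widest (top) band is not downsampled at all, while the two narrowest interior bands are each downsampled by 8.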

Figure 10.39 shows the counterpart of Fig. 10.35 for this example. In this case, the aliased signal energy comes only from channel-filter stop-bands. For narrow bands, the aliasing is suppressed by at least 80 dB (the side-lobe level of the chosen Dolph-Chebyshev window transform). For bands significantly wider than one bin (the minimum bandwidth in this example is the dc-Nyquist band at 14 bins), the stop-band consists of a sum of shifts of the window-transform side lobes, and these are found to be more than 80 dB down due to cancellation (more than 90 dB down in most bands of this example).

Copyright ©

Center for Computer Research in Music and Acoustics (CCRMA), Stanford University