
Equivalent Rectangular Bandwidth

Moore and Glasberg [19] have revised Zwicker's loudness model to better explain (1) how equal-loudness contours change as a function of level, (2) why loudness remains constant as the bandwidth of a fixed-intensity sound increases up to the critical bandwidth, and (3) the loudness of partially masked sounds. The modification that is relevant here is the replacement of the Bark scale by the equivalent rectangular bandwidth (ERB) scale. The ERB of the auditory filter is assumed to be closely related to the critical bandwidth, but it is measured using the notched-noise method [27,28,31,22,5] rather than by classical masking experiments involving a narrowband masker and probe tone [41,42,39]. As a result, the ERB is said not to be affected by the detection of beats or intermodulation products between the signal and masker. Since the ERB scale is defined analytically, it is also more smoothly behaved than the Bark-scale data.

Figure 11: Bark critical bandwidth and equivalent rectangular bandwidth as a function of frequency. Also plotted are the classical rule of thumb that a critical band is 100 Hz wide for center frequencies below 500 Hz, and 20% of the center frequency above 500 Hz, and the empirically determined formula, CB bandwidth in Hz $\approx 94+71f^{3/2}$, with $f$ in kHz [37]. The ERBs are computed from Eq. (28), and the Bark CB bandwidths were computed by differencing the band-edge frequencies listed in Section 3, plotting each difference over its corresponding band center (also listed in Section 3).
\includegraphics[scale=0.8]{eps/erbbark}

At moderate sound levels, the ERB in Hz is defined by [19]

\begin{displaymath}
\mbox{ERB}(f) = 0.108 f + 24.7
\end{displaymath} (28)

where $f$ is the center frequency in Hz, normally in the range 100 Hz to 10 kHz. The ERB is generally narrower than the classical critical bandwidth (CB), being about $11$% of center frequency at high frequencies, and leveling off to about $25$ Hz at low frequencies. The classical CB, on the other hand, is approximately $20$% of center frequency, leveling off to $100$ Hz below $500$ Hz. An overlay of ERB and CB bandwidths is shown in Fig. 11. Also shown is the approximate classical CB bandwidth, as well as a more accurate analytical expression for Bark bandwidth vs. Hz [1]. Finally, note that the frequency interval [$400$ Hz, $6.5$ kHz] corresponds to good agreement between the psychophysical ERB and the directly physical audio filter bandwidths defined in terms of place along the basilar membrane [6, p. 2601].
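As a rough numerical illustration of the bandwidth comparison above, the following Python sketch evaluates Eq. (28) alongside the classical rule of thumb and the empirical formula quoted in the Fig. 11 caption. The function names are hypothetical and chosen only for this example. The Bark bandwidths obtained by differencing the band edges of Section 3 are not reproduced here.

```python
def erb_hz(f_hz):
    """ERB in Hz at center frequency f_hz (in Hz), following Eq. (28)."""
    return 0.108 * f_hz + 24.7


def classical_cb_hz(f_hz):
    """Rule of thumb: 100 Hz wide below 500 Hz, 20% of center frequency above."""
    return 100.0 if f_hz < 500.0 else 0.2 * f_hz


def empirical_cb_hz(f_hz):
    """Empirical fit from the Fig. 11 caption: 94 + 71 f^(3/2), with f in kHz."""
    f_khz = f_hz / 1000.0
    return 94.0 + 71.0 * f_khz ** 1.5


if __name__ == "__main__":
    # Tabulate the three bandwidth estimates at a few center frequencies.
    print("f (Hz)   ERB    rule-of-thumb CB   empirical CB")
    for f in (100.0, 500.0, 1000.0, 4000.0, 10000.0):
        print(f"{f:7.0f} {erb_hz(f):7.1f} {classical_cb_hz(f):12.1f} "
              f"{empirical_cb_hz(f):16.1f}")
```

In this sketch the ERB stays below both critical-bandwidth estimates at every tabulated frequency, consistent with the statement that the ERB is generally narrower than the classical CB.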

Figure 12: Bark and ERB frequency warpings for a sampling rate of $31$ kHz. a) Linear input frequency scale. b) Log input frequency scale. Note that sampling is uniform across the vertical axis (corresponding to the desired audio frequency scale). As a result, the plotted samples align horizontally rather than vertically.
\includegraphics[scale=0.8]{eps/erbbarkm}

The ERB scale is defined as the number of ERBs below each frequency [19]:

\begin{displaymath}
\mbox{ERBS}(f) = 21.4 \log_{10}(0.00437 f + 1)
\end{displaymath} (29)

for $f$ in Hz. An overlay of the normalized Bark and ERB frequency warpings is shown in Fig. 12. The ERB warping is determined by scaling the inverse of Eq. (29), evaluated along a uniform frequency grid from zero to the number of ERBs at half the sampling rate, so that dc maps to zero and half the sampling rate maps to $\pi$.
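The construction of the ERB warping described above can be sketched in Python as follows. This is a minimal illustration using NumPy, not the authors' code. The function names, the grid size, and the normalization of the ERB axis to the interval $[0,\pi]$ are assumptions made for this example; the inverse of Eq. (29) is obtained by solving it algebraically for $f$.

```python
import numpy as np


def erbs(f_hz):
    """Number of ERBs below frequency f_hz (in Hz), following Eq. (29)."""
    return 21.4 * np.log10(0.00437 * f_hz + 1.0)


def erbs_inverse(e):
    """Frequency in Hz corresponding to e ERBs (Eq. (29) solved for f)."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def erb_warping(fs_hz, n_points=64):
    """Return (omega, nu): input and ERB-warped radian frequencies.

    The grid is uniform in ERBs from 0 to ERBS(fs/2). Both axes are scaled
    so that dc maps to 0 and half the sampling rate maps to pi.
    """
    e_max = erbs(fs_hz / 2.0)
    e_grid = np.linspace(0.0, e_max, n_points)
    omega = np.pi * erbs_inverse(e_grid) / (fs_hz / 2.0)  # nonuniform in Hz
    nu = np.pi * e_grid / e_max  # uniform on the ERB axis (assumed scaling)
    return omega, nu


if __name__ == "__main__":
    omega, nu = erb_warping(31000.0, n_points=9)
    for w, v in zip(omega, nu):
        print(f"omega = {w:6.4f} rad  ->  nu = {v:6.4f} rad")
```

Because the grid is uniform in ERBs, the returned samples are evenly spaced in `nu` and bunch together at low values of `omega`, which matches the remark in the Fig. 12 caption that the plotted samples align horizontally.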

Proceeding in the same manner as for the Bark-scale case, allpass coefficients giving a best approximation to the ERB-scale warping were computed for sampling rates near twice the Bark band-edge frequencies (chosen to facilitate comparison between the ERB and Bark cases). The resulting optimal map coefficients are shown in Fig. 13. The allpass parameter increases with increasing sampling rate, as in the Bark-scale case, but it covers a significantly narrower range, as a comparison with Fig. 4 shows. Also, the Chebyshev solution is now systematically larger than the least-squares solutions, and the least-squares and weighted equation-error cases are no longer essentially identical. The fact that the arctangent formula is optimized for the Chebyshev case is much more evident in the error plot of Fig. 13b than it was in Fig. 4b for the Bark warping parameter.
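To make the fitting step concrete, here is a hedged Python sketch of a brute-force least-squares fit of a single allpass coefficient to the ERB warping. It assumes the standard frequency map of a first-order allpass (bilinear) transformation, $a_\rho(\omega) = \omega + 2\arctan[\rho\sin\omega/(1-\rho\cos\omega)]$, which is not stated in this section. The grid search, the uniform-in-ERB sampling, the function names, and the error units (radians of normalized warped frequency rather than Barks) are likewise assumptions for illustration. It is not the Chebyshev, least-squares, or weighted equation-error procedure used to produce Fig. 13.

```python
import numpy as np


def erbs(f_hz):
    """Number of ERBs below f_hz (in Hz), following Eq. (29)."""
    return 21.4 * np.log10(0.00437 * f_hz + 1.0)


def erbs_inverse(e):
    """Frequency in Hz corresponding to e ERBs (Eq. (29) solved for f)."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437


def allpass_map(omega, rho):
    """Assumed first-order allpass frequency map; 0 -> 0 and pi -> pi."""
    return omega + 2.0 * np.arctan2(rho * np.sin(omega),
                                    1.0 - rho * np.cos(omega))


def fit_rho(fs_hz, n_points=256):
    """Grid-search the rho in [0, 0.99] that minimizes the mean-square error.

    Returns (rho, rms_error, peak_error), with errors in radians.
    """
    e_max = erbs(fs_hz / 2.0)
    e_grid = np.linspace(0.0, e_max, n_points)            # uniform in ERBs
    omega = np.pi * erbs_inverse(e_grid) / (fs_hz / 2.0)  # input frequency
    nu = np.pi * e_grid / e_max                           # target warping
    candidates = np.linspace(0.0, 0.99, 991)
    mse = [np.mean((allpass_map(omega, r) - nu) ** 2) for r in candidates]
    rho = float(candidates[int(np.argmin(mse))])
    err = allpass_map(omega, rho) - nu
    return rho, float(np.sqrt(np.mean(err ** 2))), float(np.max(np.abs(err)))


if __name__ == "__main__":
    for fs in (8000.0, 16000.0, 31000.0, 44100.0):
        rho, rms, peak = fit_rho(fs)
        print(f"fs = {fs:7.0f} Hz: rho = {rho:.3f}, "
              f"rms = {rms:.4f} rad, peak = {peak:.4f} rad")
```

A one-dimensional grid search is used because it is easy to verify by inspection; any scalar minimizer could be substituted. The values printed by this sketch should not be expected to match the coefficients in Fig. 13, since the error criterion and weighting differ.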

Figure 13: a) Optimal allpass coefficients $\rho ^*$ for the ERB case, plotted as a function of sampling rate $f_s$. Also shown is the arctangent approximation. b) Same as a) with the arctangent formula subtracted out.
\includegraphics[scale=0.8]{eps/pfserb}

Figure 14: Root-mean-square and peak frequency-mapping errors (conformal map minus ERB) versus sampling rate for Chebyshev, least squares, weighted equation-error, and arctangent optimal maps. The rms errors are nearly coincident along the lower line, while the peak errors form an upper group well above the rms errors.
\includegraphics[scale=0.8]{eps/rmspkerrerb}

The peak and rms mapping errors are plotted versus sampling rate in Fig. 14. Compare these results for the ERB scale with those for the Bark scale in Fig. 5. The ERB map errors are plotted in Barks to facilitate comparison. The rms error of the conformal map fit to the ERB scale increases nearly linearly with log-sampling-rate. The ERB-scale error increases very smoothly with frequency while the Bark-scale error is non-monotonic (see Fig. 5). The smoother behavior of the ERB errors appears to be due in part to the fact that the ERB scale is defined analytically while the Bark scale is defined more directly in terms of experimental data: the Bark-scale fit is so good as to be within experimental deviation, while the ERB-scale fit has a much larger systematic error component.

The peak error in Fig. 14 also grows close to linearly on a log-frequency scale and is similarly two to three times the Bark-scale errors of Fig. 5.

Figure 15: ERB frequency mapping errors versus frequency for the sampling rate $31$ kHz.
\includegraphics[scale=0.8]{eps/fmeerb}

The frequency mapping errors are plotted versus frequency in Fig. 15 for a sampling rate of $31$ kHz. Unlike the Bark-scale case in Fig. 6, there is now a visible difference between the weighted equation-error and optimal least-squares mappings for the ERB scale. The figure also shows that the peak error when warping to an ERB scale is about three times larger than the peak error when warping to the Bark scale, growing from 0.64 Barks to 1.9 Barks. The peak errors also occur at lower frequencies (moving from 1.3 and 8.8 kHz in the Bark-scale case to 0.7 and 8.2 kHz in the ERB-scale case).




``The Bark and ERB Bilinear Transforms'', by Julius O. Smith III and Jonathan S. Abel, preprint of version accepted for publication in the IEEE Transactions on Speech and Audio Processing, December, 1999.
Copyright © 2007-05-10 by Julius O. Smith III and Jonathan S. Abel
Center for Computer Research in Music and Acoustics (CCRMA),   Stanford University