The purpose of a loudness spectrogram is to display some
psychoacoustic model of loudness versus time and frequency.
Instead of specifying FFT window length and type, one specifies
conditions of presentation, such as physical amplitude level
in dB SPL, angle of arrival at the ears, etc. By default, it can be
assumed that the signal is presented to both ears equally, and the
listening level can be normalized to a ``comfortable'' value such as
70 dB SPL.
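As a minimal sketch of the level normalization just described, the following Python fragment scales a signal so that its RMS value, read as sound pressure in pascals, corresponds to 70 dB SPL. The function names and the interpretation of sample values as pascals are assumptions made for illustration; in practice a calibrated playback chain is needed to relate sample values to pressure at the ears.

```python
import numpy as np

P_REF = 20e-6  # reference pressure for dB SPL, in pascals

def rms_db_spl(x):
    """RMS level of x in dB SPL, assuming the samples are pressures in pascals."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(rms / P_REF)

def scale_to_db_spl(x, target_db_spl=70.0):
    """Return x rescaled so that its RMS level equals target_db_spl."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(np.square(x)))
    if rms == 0.0:
        raise ValueError("cannot normalize a silent signal")
    target_rms = P_REF * 10.0 ** (target_db_spl / 20.0)
    return x * (target_rms / rms)

# Example: a 1 kHz tone normalized to the default "comfortable" level.
fs = 16000
t = np.arange(fs) / fs
tone = scale_to_db_spl(np.sin(2.0 * np.pi * 1000.0 * t))
level = rms_db_spl(tone)
```

Normalizing by RMS is only one possible convention; a peak-based or loudness-based normalization would give a different scale factor for the same signal.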
A time-varying model of loudness perception has been developed by
Moore and Glasberg et al. [87,182,88].
A loudness spectrogram based on this work may consist of the following
processing steps:
- Compute a multiresolution STFT (MRSTFT) that
approximates the frequency-dependent time and frequency resolution of
the ear. Several FFTs of different lengths may be combined in such a
way that time resolution is higher at high frequencies, and frequency
resolution is higher at low frequencies, as in the ear. In each
FFT, the frequency resolution must be greater than or equal to that of
the ear in the frequency band it covers. (Even ``much greater'' is ok,
since the resolution will be reduced to what it should be by
the smoothing in the next step.)
- Form the excitation pattern from the MRSTFT by
resampling the FFTs of the previous step using interpolation kernels
shaped like auditory filters. The new spectral sampling intervals
should be proportional to the width of a critical band of
hearing at each frequency. The shape of each interpolation kernel
(auditory filter) should change with amplitude level as well as center
frequency [87]. This step effectively converts
the uniform filter bank of the FFT to an auditory filter
bank.
- Compute the specific loudness from the excitation
pattern for each frame. This step implements a compressive
nonlinearity which depends on the frequency and level of the
excitation pattern
[182].
The specific loudness can be interpreted as loudness per ERB.
- If desired, the instantaneous loudness can be computed
as the sum of the specific loudness over all frequency samples at
a fixed time. Similarly, short- and long-term time-varying loudness
estimates can be computed as lowpass-filterings of the instantaneous
loudness over time [88].
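The steps above can be sketched for a single frame as follows. This is a simplified illustration and not the published model: the band plan, the Glasberg-Moore ERB formulas, the level-independent roex(p) kernel, and the compressive law with its constants are all assumptions introduced here. In particular, the text calls for kernels that change with level and a nonlinearity that depends on frequency, and both are omitted for brevity.

```python
import numpy as np

def erb_width(f_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore formula, assumed)."""
    return 24.7 * (4.37e-3 * f_hz + 1.0)

def erb_number(f_hz):
    """Frequency in Hz -> position on the ERB-number scale."""
    return 21.4 * np.log10(4.37e-3 * f_hz + 1.0)

def erb_number_to_hz(e):
    """Inverse of erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3

# Hypothetical band plan: (low edge Hz, high edge Hz, even FFT length).
BANDS = ((0.0, 500.0, 4096), (500.0, 2000.0, 2048), (2000.0, np.inf, 1024))

def mr_frame(x, center, fs, bands=BANDS):
    """Step 1: one multiresolution frame centered on sample index `center`.
    Long FFTs cover low bands, short FFTs cover high bands.
    Returns bin frequencies and per-bin power (bins sum to signal power)."""
    freqs, power = [], []
    for f_lo, f_hi, n in bands:
        w = np.hanning(n)
        seg = x[center - n // 2 : center + n // 2] * w
        p = 2.0 * np.abs(np.fft.rfft(seg)) ** 2 / (n * np.sum(w ** 2))
        p[0] *= 0.5   # DC bin has no mirrored negative-frequency partner
        p[-1] *= 0.5  # neither does the Nyquist bin (n is even)
        f = np.fft.rfftfreq(n, 1.0 / fs)
        keep = (f >= f_lo) & (f < f_hi)
        freqs.append(f[keep])
        power.append(p[keep])
    return np.concatenate(freqs), np.concatenate(power)

def excitation_pattern(freqs, power, centers):
    """Step 2: resample the spectrum with roex(p) auditory-filter kernels."""
    fc = centers[:, None]
    g = np.abs(freqs[None, :] - fc) / fc      # normalized deviation from fc
    p = 4.0 * fc / erb_width(fc)              # slope that yields the ERB above
    weights = (1.0 + p * g) * np.exp(-p * g)
    return weights @ power

def specific_loudness(exc, c=1.0, alpha=0.2, a=1.0, e_ref=(20e-6) ** 2):
    """Step 3: placeholder compressive law; all constants are illustrative."""
    return c * ((exc / e_ref + a) ** alpha - a ** alpha)

def instantaneous_loudness(spec_loud, spacing_erb):
    """Step 4: sum loudness-per-ERB samples spaced spacing_erb ERBs apart."""
    return spacing_erb * np.sum(spec_loud)

fs = 16000
t = np.arange(fs) / fs
x = np.sin(2.0 * np.pi * 1000.0 * t)          # 1 kHz test tone
spacing = 0.25                                 # sample spacing in ERBs
centers = erb_number_to_hz(np.arange(erb_number(50.0), erb_number(7000.0), spacing))
freqs, power = mr_frame(x, center=fs // 2, fs=fs)
exc = excitation_pattern(freqs, power, centers)
spec_loud = specific_loudness(exc)
loud = instantaneous_loudness(spec_loud, spacing)
```

Normalizing each FFT so that its bins sum to the signal power keeps the excitation consistent across bands of different FFT length. Repeating the computation at successive frame centers and stacking the `spec_loud` vectors as columns would give one form of loudness spectrogram.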
The specific loudness gives a useful definition of the
``loudness spectrogram.'' However, one might well prefer to lowpass
filter it along the time dimension, in the same manner that
instantaneous loudness is filtered, to produce short- and long-term
loudness estimates versus both time and frequency.
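One way to picture this time-domain filtering is a one-pole lowpass applied independently to each frequency channel, sketched below. Using separate attack and release coefficients is an assumption made here for illustration, and the coefficient values are placeholders rather than values taken from [88]. The function name `smooth_over_time` is likewise hypothetical.

```python
import numpy as np

def smooth_over_time(frames, attack, release):
    """One-pole lowpass along axis 0 (time).
    `frames` has shape (num_frames,) or (num_frames, num_channels).
    `attack` applies where the input exceeds the current state,
    `release` applies otherwise; both lie in (0, 1]."""
    frames = np.asarray(frames, dtype=float)
    out = np.empty_like(frames)
    state = np.zeros(frames.shape[1:])
    for n, frame in enumerate(frames):
        coeff = np.where(frame > state, attack, release)
        state = state + coeff * (frame - state)
        out[n] = state
    return out

# Toy specific loudness: 3 channels that switch on at frame 50, off at 150.
spec = np.zeros((300, 3))
spec[50:150] = [1.0, 2.0, 0.5]

short_term = smooth_over_time(spec, attack=0.05, release=0.02)
long_term = smooth_over_time(spec, attack=0.01, release=0.005)
```

The same function accepts a one-dimensional array, so it can also smooth the instantaneous loudness over time.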