
Quantization using the Masking Threshold

The main reason for using a psychoacoustic model in audio compression is that, given a masking threshold $M_t(f)$, the amplitude at frequency $f$ may be quantized with a step size proportional to $M_t(f)$. The quantization can be viewed as the introduction of noise whose level scales with the step size, and hence with $M_t(f)$:

$$\mathcal{F}_Q(f) = \mathcal{F}(f) + \mathrm{noise}(f)$$

The quantization error can then easily be adjusted to be lower than the masking threshold, and thus become inaudible.
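
As a worked illustration, assume the standard uniform-quantizer model for a real-valued coefficient (an assumption; the text does not specify the noise model). Write the step size as $\Delta(f) = c\,M_t(f)$, where $c$ is a hypothetical proportionality constant and $M_t(f)$ is read as an amplitude threshold. Rounding to the nearest quantization level then gives

$$|\mathrm{noise}(f)| \le \frac{\Delta(f)}{2} = \frac{c\,M_t(f)}{2}, \qquad E\!\left[\mathrm{noise}(f)^2\right] \approx \frac{\Delta(f)^2}{12} = \frac{c^2\,M_t(f)^2}{12},$$

where the second expression holds when the error is roughly uniformly distributed over one step. Under these assumptions, any $c \le 2$ keeps the error magnitude at or below $M_t(f)$, and $c = 1$ keeps it at or below half the threshold.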

In the implementation of the coder, the psychoacoustic model is applied using only a quantizer with step size $M_t(f)$ on every transform coefficient. This keeps the coding separate from the psychoacoustic model. Thus, when I start to design the coder, I can be certain of getting perceptually perfect data regardless of the coding method.
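
The per-coefficient quantizer described above might be sketched as follows. This is a minimal illustration, not the coder's actual implementation: it assumes real-valued transform coefficients, and the names `quantize`, `dequantize`, `coeffs`, and `thresholds`, as well as the numeric values, are hypothetical.

```python
def quantize(coeffs, thresholds):
    """Map each coefficient to an integer index, using its own
    masking threshold M_t(f) as the quantizer step size."""
    return [round(c / m) for c, m in zip(coeffs, thresholds)]


def dequantize(indices, thresholds):
    """Reconstruct coefficient values from the integer indices."""
    return [q * m for q, m in zip(indices, thresholds)]


# Hypothetical example values (not taken from the paper).
coeffs = [0.82, -0.31, 0.054, 0.0071]      # transform coefficients
thresholds = [0.1, 0.05, 0.02, 0.01]       # M_t(f) for each coefficient

indices = quantize(coeffs, thresholds)     # integers handed to the coder
recon = dequantize(indices, thresholds)    # what the decoder reconstructs

# The reconstruction error per coefficient is at most half a step.
errors = [abs(c - r) for c, r in zip(coeffs, recon)]
```

Only the integer `indices` would need to be passed on to the coding stage, so in this sketch the entropy coder never sees the psychoacoustic model directly, which mirrors the separation described in the text.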



"An Experimental High Fidelity Perceptual Audio Coder", by Bosse Lincoln (Final Project, Music 420, Winter '97-'98).
Copyright © 2006-01-03 by Bosse Lincoln
Center for Computer Research in Music and Acoustics (CCRMA),   Stanford University