
Transforming

Two different types of linear transforms are used in the coder. The first is the Modified Discrete Cosine Transform (MDCT), which represents the audio blocks in the frequency domain so that the masking threshold can be applied directly during quantization. The second is the Karhunen-Loève Transform (KLT), which is used to efficiently encode blocks of MDCT coefficients, where the coefficient blocks correspond closely to critical bands. The KLT is used because it is optimal in the sense of energy compaction.
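As a rough illustration of the first transform, the sketch below computes a textbook MDCT directly from its defining sum and checks that windowed overlap-add recovers the signal. The block size `N = 8`, the sine window, the `2/N` inverse scaling, and all function names are assumptions made for this example; nothing here is taken from the coder itself, and a practical implementation would use an FFT-based algorithm rather than this $O(N^2)$ loop.

```python
import math

def sine_window(N):
    # Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block):
    # 2N time samples -> N frequency coefficients (direct evaluation).
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs):
    # N coefficients -> 2N time-aliased samples, scaled by 2/N so that
    # windowing at both analysis and synthesis sums to unity gain.
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [2.0 / N * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                          for k in range(N))
            for n in range(2 * N)]

def analyze(signal, N):
    # Windowed MDCT of 50%-overlapping blocks (hop size N).
    w = sine_window(N)
    return [mdct([w[n] * signal[s + n] for n in range(2 * N)])
            for s in range(0, len(signal) - 2 * N + 1, N)]

def synthesize(frames, N):
    # Window each inverse block again and overlap-add; the aliasing
    # terms of neighbouring blocks cancel.
    w = sine_window(N)
    out = [0.0] * (N * (len(frames) + 1))
    for i, X in enumerate(frames):
        y = imdct(X)
        for n in range(2 * N):
            out[i * N + n] += w[n] * y[n]
    return out

# Hypothetical test signal: two sinusoids, 48 samples.
N = 8
signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(48)]
frames = analyze(signal, N)
rebuilt = synthesize(frames, N)
# The first and last N samples are covered by only one block and stay aliased.
max_err = max(abs(rebuilt[t] - signal[t]) for t in range(N, len(signal) - N))
```

Each entry of `frames` holds `N` coefficients for one block, which is the frequency-domain representation a masking threshold could be compared against. Only the interior samples are checked in `max_err`, because the edge blocks have no overlapping neighbour to cancel their aliasing.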

One might argue that a linear transform of a linear transform is just another linear transform, and thus I am just wasting time. In this case, though, the frequency basis is needed to quantize the data according to the masking threshold, so the KLT cannot be applied directly without the MDCT step.

A linear transform can be thought of in many different ways. The MDCT here is best viewed as a subband filter bank, while the KLT is viewed as a change of basis in an $n$-dimensional space, where $n$ is the length of the transform block.
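To make the change-of-basis view concrete, here is a minimal sketch of a KLT for the smallest interesting case, $n = 2$, where the eigenvectors of the $2 \times 2$ second-moment matrix have a closed form (a rotation by a single angle). The toy data, the zero-mean assumption, and the function names are all hypothetical and chosen only for illustration; the coder's blocks are longer and would need a general eigenvector routine.

```python
import math

def klt_basis_2d(pairs):
    # Second-moment matrix [[rxx, rxy], [rxy, ryy]] of the data,
    # assuming the data is zero mean (an assumption of this sketch).
    m = len(pairs)
    rxx = sum(x * x for x, _ in pairs) / m
    ryy = sum(y * y for _, y in pairs) / m
    rxy = sum(x * y for x, y in pairs) / m
    # Its orthonormal eigenvectors are a rotation by theta; the first
    # row points along the direction of largest energy.
    theta = 0.5 * math.atan2(2.0 * rxy, rxx - ryy)
    c, s = math.cos(theta), math.sin(theta)
    return [(c, s), (-s, c)]

def change_basis(pairs, basis):
    # Project every 2-vector onto the rows of the basis matrix.
    (a, b), (c, d) = basis
    return [(a * x + b * y, c * x + d * y) for x, y in pairs]

def energies(pairs):
    # Energy carried by each of the two coordinates.
    return (sum(x * x for x, _ in pairs), sum(y * y for _, y in pairs))

# Hypothetical, strongly correlated toy data.
pairs = [(math.sin(0.7 * t), 0.8 * math.sin(0.7 * t) + 0.1 * math.cos(2.3 * t))
         for t in range(200)]
basis = klt_basis_2d(pairs)
coeffs = change_basis(pairs, basis)
before = energies(pairs)
after = energies(coeffs)
```

Because the basis is orthonormal, `sum(after)` equals `sum(before)`, but `after[0]` is at least as large as either entry of `before`: the same total energy is packed into fewer coordinates, which is the energy compaction property that motivates the KLT. The two output coordinates are also uncorrelated over the data set.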





``An Experimental High Fidelity Perceptual Audio Coder'', by Bosse Lincoln <bosse@ccrma.stanford.edu> (Final Project, Music 420, Winter '97-'98).
Copyright © 2006-01-03 by Bosse Lincoln <bosse@ccrma.stanford.edu>
Center for Computer Research in Music and Acoustics (CCRMA),   Stanford University