Introduction

High-quality audio, as found in CD or DAT players, requires large amounts of data. A CD stream consists of 44100 16-bit samples per channel per second, which corresponds to about 1.4 Mbit/s in stereo. Audio at that bitrate contains a lot of redundancy, which a lossless coder can exploit to bring the bitrate down to about half of that. The human auditory system, though, has many limitations, so a lossy coder that exploits those limitations can be made much more efficient: 10 to 12 times fewer bits can often be used without perceptible loss. The audio coding community uses this extensively and has come much further in this field than, for example, the video coding community.
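The bitrate figures above follow from simple arithmetic. The sketch below is only an illustration of that arithmetic; the function names `cd_bitrate` and `reduced_bitrate` are made up for this example and are not part of the coder described in the report. The factors 2, 10 and 12 are the rough reduction ratios quoted in the text, not measured results.

```python
# Illustrative arithmetic only: raw CD bitrate and the rough targets
# quoted in the text. All names here are hypothetical helpers.

SAMPLE_RATE_HZ = 44_100   # samples per channel per second (CD)
BITS_PER_SAMPLE = 16      # bits per sample
CHANNELS = 2              # stereo


def cd_bitrate(sample_rate_hz=SAMPLE_RATE_HZ,
               bits_per_sample=BITS_PER_SAMPLE,
               channels=CHANNELS):
    """Return the uncompressed PCM bitrate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def reduced_bitrate(raw_bps, factor):
    """Return the bitrate after dividing the raw bitrate by `factor`."""
    return raw_bps / factor


raw = cd_bitrate()                    # 1 411 200 bit/s, about 1.4 Mbit/s
lossless = reduced_bitrate(raw, 2)    # "about half" for a lossless coder
lossy_hi = reduced_bitrate(raw, 10)   # 10 times fewer bits
lossy_lo = reduced_bitrate(raw, 12)   # 12 times fewer bits

print(f"raw:      {raw / 1e6:.2f} Mbit/s")
print(f"lossless: {lossless / 1e3:.0f} kbit/s (approx.)")
print(f"lossy:    {lossy_lo / 1e3:.0f} to {lossy_hi / 1e3:.0f} kbit/s (approx.)")
```

Under these assumptions, a perceptual coder would be aiming at roughly 118 to 141 kbit/s for stereo CD material.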

In this project, I have read papers on current perceptual audio coding and, using concepts from those (and some new ideas), have implemented an experimental audio transform coder. The coder is not intended to be state-of-the-art, but rather a tool for me to learn about the difficulties that arise in a coder of this kind. The report is structured as follows.

2 Current Implementations describes some of the well-known implementations and standards that exist, and some ideas from those.
3 Human Audio Perception: Masking describes the masking properties of the human auditory system, and the implemented model of this in the coder.
4 Audio Coding goes through and motivates the quantization and bit coding used in the coder.
5 Results and Conclusions shows bitrates and "quality" of some encoded audio clips. The audio clips are presented on the web.

"An Experimental High Fidelity Perceptual Audio Coder", by Bosse Lincoln <bosse@ccrma.stanford.edu> (Final Project, Music 420, Winter '97-'98).
Copyright © 2006-01-03 by Bosse Lincoln <bosse@ccrma.stanford.edu>
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University