High-quality audio, as found in CD or DAT players, requires large amounts of data. A CD stream consists of 44100 16-bit samples per channel per second, which corresponds to about 1.4 Mbit/s in stereo. Audio at that bitrate contains a lot of redundancy, which a lossless coder can exploit to bring the bitrate down to about half of that. The human auditory system, though, has many limitations, and a lossy coder that exploits those properties can be made much more efficient: 10 to 12 times fewer bits can often be used without perceptual loss. The audio coding community uses this extensively and has come much further in this field than, for example, the video coding community.
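As a rough check of these figures, the arithmetic can be sketched in a few lines of Python. The function and variable names are illustrative assumptions, and the lossless and lossy numbers simply apply the approximate ratios quoted above; they are not measured results.

```python
# Back-of-the-envelope bitrate arithmetic for CD audio.
# All names here are hypothetical and chosen for illustration only.

SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16      # linear PCM sample width
CHANNELS = 2              # stereo


def pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the raw PCM bitrate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_bitrate = pcm_bitrate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)

# Assumption: lossless coding reaches roughly half the raw bitrate.
lossless_estimate = cd_bitrate / 2

# Assumption: perceptual (lossy) coding uses 10 to 12 times fewer bits.
lossy_range = (cd_bitrate / 12, cd_bitrate / 10)

print(f"Raw CD stereo:  {cd_bitrate / 1000:.1f} kbit/s")
print(f"Lossless (~2x): {lossless_estimate / 1000:.1f} kbit/s")
print(f"Lossy (10-12x): {lossy_range[0] / 1000:.1f} to {lossy_range[1] / 1000:.1f} kbit/s")
```

Under these assumptions, the raw stream is 1411.2 kbit/s, and a 10 to 12 times reduction lands at roughly 118 to 141 kbit/s.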
In this project, I have read papers on current perceptual audio coding, and using concepts from those (and some new ideas), I have implemented an experimental audio transform coder. The coder is not intended to be state-of-the-art, but rather a tool for me to learn about the difficulties that arise in a coder of this kind. The report is structured in the following way.