All of the following samples are available in WAV format on the web, at http://ccrma.stanford.edu/~bosse/. Note that no rate-control module was developed, so bitrates vary considerably with the type of signal. From the values below, one can deduce that reasonably perceptually lossless coding can be achieved at about 128 kbit/s for stereo data and 75 kbit/s for mono.
|Audio Stream||Bitrate||M/S||Apparent artifacts|
|music.wav||106 kb/s||S||Especially triangle and castanets|
|castanets.wav||103 kb/s||S||Very audible pre-echo|
|oasis.wav||118 kb/s||S||The "s" in "sunday"|
|oasis.wav||89 kb/s||S||Easy to detect|
|jacob.wav||83 kb/s||S||Easy to detect|
M/S means mono/stereo. The low bitrate signals are coded with a masking threshold multiplied by a factor and respectively.
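As a rough illustration of why scaling the masking threshold lowers the bitrate, the sketch below uses the standard high-resolution approximation that a coefficient with variance σ² needs about 0.5·log2(σ²/T) bits when the quantization noise is allowed to reach the threshold T. The function names and the example factor of 4 are hypothetical, chosen only for illustration; they are not the values or the bit-allocation rule used by the coder in this report.

```python
import math


def bits_per_coefficient(signal_variance, masking_threshold):
    """Approximate bits needed so quantization noise sits at the threshold.

    High-resolution rule: R = 0.5 * log2(variance / noise), floored at 0.
    """
    if masking_threshold <= 0:
        raise ValueError("masking_threshold must be positive")
    return max(0.0, 0.5 * math.log2(signal_variance / masking_threshold))


def bits_saved_by_scaling(signal_variance, masking_threshold, factor):
    """Bits saved per coefficient when the threshold is multiplied by `factor`."""
    base = bits_per_coefficient(signal_variance, masking_threshold)
    scaled = bits_per_coefficient(signal_variance, masking_threshold * factor)
    return base - scaled


if __name__ == "__main__":
    # Hypothetical numbers: variance 1024, threshold 1, scaling factor 4.
    print(bits_saved_by_scaling(1024.0, 1.0, 4.0))
```

Under this approximation, every multiplication of the threshold by 4 saves about one bit per coefficient, at the cost of quantization noise that exceeds the estimated masking level and may therefore become audible.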
The encoder described in this report is evidently rather undeveloped. To improve the coder, I would like to add some kind of transient coding, for example using wavelets. Transform-wavelet hybrid coders (see e.g. ) have become more popular and show good results. Also, adaptive prediction in either the time or the transform domain would decrease the bitrate for stationary signals (although some of this redundancy is already exploited by the KLT).
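To make the adaptive-prediction idea concrete, here is a minimal sketch of a time-domain forward predictor adapted with the normalized LMS algorithm. This is one common choice of adaptive predictor, picked for illustration; it is not part of the coder described in this report, and the function names, predictor order, and step size are assumptions. On a stationary test tone, the residual carries far less energy than the signal itself, which is what would allow a lower bitrate.

```python
import math


def nlms_residual(signal, order=4, mu=0.5, eps=1e-9):
    """Return the prediction residual of a normalized LMS forward predictor."""
    weights = [0.0] * order
    residual = []
    for n, x in enumerate(signal):
        # Past samples, most recent first; zeros before the signal starts.
        past = [signal[n - k] if n - k >= 0 else 0.0
                for k in range(1, order + 1)]
        prediction = sum(w * p for w, p in zip(weights, past))
        error = x - prediction
        residual.append(error)
        # Normalized LMS update: step scaled by the energy of the input vector.
        norm = eps + sum(p * p for p in past)
        weights = [w + mu * error * p / norm for w, p in zip(weights, past)]
    return residual


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def prediction_gain_db(signal, **kwargs):
    """Ratio of signal energy to residual energy, in dB."""
    return 10.0 * math.log10(energy(signal) / energy(nlms_residual(signal, **kwargs)))


if __name__ == "__main__":
    # Stationary test signal: a pure tone at 0.05 cycles per sample.
    tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(2000)]
    print(f"prediction gain: {prediction_gain_db(tone):.1f} dB")
```

Because the predictor adapts only from past samples, a decoder running the same update on the reconstructed signal could track the weights without side information, although quantization of the residual would then have to be taken into account.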
This project did not result in a coder with many new features, but it gave me some experience and knowledge in the field of high-fidelity perceptual audio coders.