As part of Music 220C at Stanford’s CCRMA, I intend to explore the creative potential of frequency-domain quantization/dequantization. This follows on from, and was inspired by, the work I did this past winter in Marina Bosi’s Perceptual Audio Coding course.
However, whereas the goal of that class was to teach students how to develop coders that achieve perceptual transparency at relatively low data rates, I approach the same or similar algorithms from the opposite perspective. I’m not concerned with data-rate compression at all; instead, I’m interested in generating as much distortion as possible, with the end goal of crafting an electroacoustic piece that showcases the various phenomena that arise.
At this point, I have code which takes 16-bit PCM files (i.e., ordinary .wav files) and encodes the MDCT-derived frequency components via uniform quantization. The window length for the time-to-frequency transform is adjustable, as is the bit depth of the quantization. (I have so far found that 8- or 12-bit uniform quantization yields significant and interesting spectral distortion, depending on the input audio. More on that later.)
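A minimal sketch of this pipeline in Python with NumPy might look like the following. It is an illustration under stated assumptions rather than the project's actual code: the function names (`mdct_basis`, `quantize`, `dequantize`, `mdct_distort`) are hypothetical, a sine window with 50% overlap is assumed, the quantizer is a midtread design scaled to the largest coefficient in the file, and the input is taken to be a float array rather than a .wav file.

```python
import numpy as np

def mdct_basis(half):
    """Cosine basis mapping a 2*half-sample frame to half MDCT coefficients."""
    n = np.arange(2 * half)
    k = np.arange(half)
    return np.cos(np.pi / half * np.outer(k + 0.5, n + 0.5 + half / 2))

def quantize(coeffs, bits, peak):
    """Uniform midtread quantizer (assumes bits >= 2); returns integer codes."""
    levels = 2 ** (bits - 1) - 1
    return np.round(np.clip(coeffs / peak, -1.0, 1.0) * levels).astype(int)

def dequantize(codes, bits, peak):
    """Map integer codes back to coefficient values."""
    levels = 2 ** (bits - 1) - 1
    return codes / levels * peak

def mdct_distort(signal, half=512, bits=8):
    """MDCT -> uniform quantize -> dequantize -> inverse MDCT with overlap-add."""
    signal = np.asarray(signal, dtype=float)
    # Sine window of length 2*half (assumed; it satisfies the overlap-add condition).
    window = np.sin(np.pi * (np.arange(2 * half) + 0.5) / (2 * half))
    basis = mdct_basis(half)
    n_blocks = -(-len(signal) // half)  # ceiling division
    # Zero-pad so every input sample is covered by two overlapping frames.
    padded = np.zeros((n_blocks + 2) * half)
    padded[half:half + len(signal)] = signal
    frames = np.stack([padded[i * half:(i + 2) * half]
                       for i in range(n_blocks + 1)])
    coeffs = (frames * window) @ basis.T
    # Global scale for the quantizer (an assumption; per-frame scaling also works).
    peak = np.max(np.abs(coeffs)) or 1.0
    restored = dequantize(quantize(coeffs, bits, peak), bits, peak)
    # Inverse transform, synthesis window, then overlap-add.
    synth = (2.0 / half) * (restored @ basis) * window
    out = np.zeros_like(padded)
    for i, frame in enumerate(synth):
        out[i * half:(i + 2) * half] += frame
    return out[half:half + len(signal)]
```

Here `half` is half the window length, so `mdct_distort(x, half=512, bits=8)` corresponds to a 1024-sample window with 8-bit quantization. The dense matrix transform is slow for long files but keeps the sketch short; an FFT-based MDCT would be the usual choice in practice.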
Additionally, I have some simple modifications which allow me to (a) linearly ramp the window size, and (b) smoothly cycle between window sizes.
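One way to picture these two modifications is as a per-frame schedule of window sizes. The sketch below is only a guess at the general idea: `ramp_sizes` and `cycle_sizes` are hypothetical helpers, the raised-cosine shape for the cycle is an assumption, and nothing here addresses how the transitions between different window sizes are handled in the transform itself.

```python
import numpy as np

def ramp_sizes(start, stop, n_frames):
    """Linearly ramp the window size from start to stop, rounded to even lengths."""
    sizes = np.linspace(start, stop, n_frames)
    return (2 * np.round(sizes / 2)).astype(int)

def cycle_sizes(low, high, n_frames, period):
    """Smoothly cycle between low and high using a raised cosine (assumed shape)."""
    phase = 2 * np.pi * np.arange(n_frames) / period
    sizes = low + (high - low) * 0.5 * (1 - np.cos(phase))
    return (2 * np.round(sizes / 2)).astype(int)
```

Sizes are rounded to even values because the MDCT produces half as many coefficients as there are samples in a window.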
I’d like to implement one other feature. I’ve noticed that the quantization tends to emphasize certain frequencies in the input audio. I believe I could modify the uniform quantization by scaling the input to the quantizer: that is, I would take the frequency values which are about to be quantized and scale them by some function. After dequantization, I would reverse the scaling operation. Thus, I’d turn the uniform quantization into something quite adjustable in just two extra steps. I suspect that I’d be able to “tune” the spectral response to achieve musical results. (I should note that this scaling process is also an approach that Marina Bosi mentions in her book as a way to improve uniform quantization.)
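The scale, quantize, dequantize, unscale idea could be sketched as follows. The choice of a power law as the scaling function is an assumption made for illustration (any invertible function would do), and the names `compress`, `expand`, and `companded_quantize` are hypothetical.

```python
import numpy as np

def compress(values, power, peak):
    """Scale normalized values by a power law before quantization (assumed function)."""
    normalized = np.clip(values / peak, -1.0, 1.0)
    return np.sign(normalized) * np.abs(normalized) ** power

def expand(values, power, peak):
    """Invert the power-law scaling after dequantization."""
    return np.sign(values) * np.abs(values) ** (1.0 / power) * peak

def companded_quantize(values, bits, power):
    """Uniform quantization wrapped in the two extra scaling steps (assumes bits >= 2)."""
    values = np.asarray(values, dtype=float)
    peak = np.max(np.abs(values)) or 1.0
    levels = 2 ** (bits - 1) - 1
    codes = np.round(compress(values, power, peak) * levels)
    return expand(codes / levels, power, peak)
```

With `power=1.0` this reduces to plain uniform quantization. Values of `power` below 1 spend more quantizer levels on small coefficients, while values above 1 favor large ones, so `power` acts as a single tuning knob for the character of the distortion.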
I have included some initial sound files demonstrating this technique. Note that the streaming audio might work only in Chrome.
Speech (and applause) - single window length:
Speech (and applause) - ramping window length: