Adam @ CCRMA - 256a hw3 (SoundPrism)

Downloads

SoundPrism-0.1.1.tar.gz - tarball containing code, makefile, readme, and executable
SoundPrism.app-0.1.1.tar.gz - executable only

SoundPrism

Summary

SoundPrism is a GLUT-based audio visualizer. It offers four views: spectrum, waterfall, scrolling waveform, and oscilloscope. Any combination of these views can be shown or hidden using the number keys (1-4). One slider alters the logarithmic spacing of spectral components (to emphasize lower frequencies), and another controls the length of audio visualized in the scrolling waveform. There is also a mode called "deluge" which expands the waterfall display to full-frame width. Two textual displays show the result of pitch detection by auto-correlation (using a Marsyas algorithm) and the frequency of the spectral bin with the highest magnitude.
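To make the second textual display concrete, here is a minimal sketch of how the frequency of the highest-magnitude bin can be computed. It assumes a magnitude spectrum holding bins 0 through N/2 of an N-point FFT; the function names peakBin and binFrequency are hypothetical and not taken from the SoundPrism source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the SoundPrism source): returns the index of
// the largest value in a magnitude spectrum. Ties resolve to the lowest bin.
std::size_t peakBin(const std::vector<float>& magnitudes)
{
    assert(!magnitudes.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < magnitudes.size(); ++k) {
        if (magnitudes[k] > magnitudes[best]) {
            best = k;
        }
    }
    return best;
}

// Center frequency in Hz of bin k for an fftSize-point transform:
// f = k * sampleRate / fftSize.
double binFrequency(std::size_t k, double sampleRate, std::size_t fftSize)
{
    assert(fftSize > 0);
    return static_cast<double>(k) * sampleRate / static_cast<double>(fftSize);
}
```

For example, with a 1024-point FFT at 44100 Hz, bin 2 sits at about 86.1 Hz. The resolution of this readout is limited to the bin spacing (sampleRate / fftSize), which is one reason it can disagree with the auto-correlation pitch estimate.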

Strategy

SoundPrism leverages the core parts of the signal processing framework created for assignments 1 and 2 in order to get audio i/o via RtAudio. Other than that (and one other minor addition, described below), the design strategy was to code all required/desired features by "brute-force". The result is a fairly functional program with an extremely ugly code base. The choice to design in this manner is easily justified: (1) I have never before worked with OpenGL at this level, and (2) any later changes/augmentations would likely alter the code fundamentally anyway.

Challenges

The main technical hurdle to overcome in this assignment was aligning the desired geometries and orientations with what I could actually realize with OpenGL. The development of this program was a kind of mutation--most of the core visualization features were implemented within the first few hours of starting the project, but the remaining development time (approx. 30 hours?) was spent going through bouts of trial and error in an attempt to get it to look good. I can't say I'm displeased with the results :). There is a sort of ghetto/lo-fi charm that just kind of falls out of OpenGL tinkerings.

The other technical challenge involved buffering the incoming audio to be displayed as a scrolling waveform. This was implemented as an AudioClient in my signal processing framework from assignment 1. The new entity, SampleAccumulator, acts as a sort of 'holding tank' for incoming sample data. When the graphics system needs to render a frame, the samples are taken from the accumulator and drawn. One subtlety of this sub-system is that in most cases there will be more sample points to be displayed than screen pixels. Therefore, the accumulator needs to know how many samples are going to be considered for each pixel, and to find the minimum and maximum values for those samples. The graphics system acquires these values as peak-pairs, and renders them as vertical lines.
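The min/max reduction described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual SampleAccumulator code: the free function peakPairs, its signature, and the way samples are divided among pixel columns are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the accumulator's reduction step (name and
// signature are illustrative). Returns one (min, max) pair per pixel column.
// If there are fewer samples than pixels, one pair per sample is returned.
std::vector<std::pair<float, float>>
peakPairs(const std::vector<float>& samples, std::size_t pixelCount)
{
    assert(pixelCount > 0);
    std::vector<std::pair<float, float>> pairs;
    if (samples.empty()) {
        return pairs;
    }

    // Never use more columns than samples, so every column is non-empty.
    const std::size_t columns = std::min(pixelCount, samples.size());
    pairs.reserve(columns);

    for (std::size_t px = 0; px < columns; ++px) {
        // Integer arithmetic spreads any remainder across the columns.
        const std::size_t begin = px * samples.size() / columns;
        const std::size_t end = (px + 1) * samples.size() / columns;

        const auto first = samples.begin() + static_cast<std::ptrdiff_t>(begin);
        const auto last = samples.begin() + static_cast<std::ptrdiff_t>(end);
        const auto mm = std::minmax_element(first, last);
        pairs.emplace_back(*mm.first, *mm.second);
    }
    return pairs;
}
```

Each returned pair would then map to one vertical line, drawn from (x, min) to (x, max) for pixel column x. Keeping both extremes, rather than simply taking every n-th sample, preserves short transients that would otherwise vanish from the display.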