From CCRMA Wiki
Revision as of 18:31, 10 November 2009 by Kmontag (Talk | contribs) (Milestones)



Kevin Montag's Music 256A Final Project Proposal, Fall 2009


The vision for the project is to make an interface that pulls sonic "qualities" from a collection of sounds and applies them to new sounds. A user might, for example, like the shimmer of a particular album, or the darkness of a particular genre, and wish to apply these qualities to a piece of their own.

I see the finished product more as an instrument than as an audio plugin. The user should be able to do anything from making subtle changes to a recorded piece to completely distorting a sound sample, in a way that is sonically intuitive and does not demand much of a learning curve.


The user will specify collections of sound files to be used as "seeds" for the audio transformations. Each collection will show up as an icon in the main window of the interface, and the user can click on the icon to edit the collection (add and remove sounds from it), or click somewhere else to add a new collection. These collections can be saved and loaded.

The program will (hopefully) be JACK-aware; for each instance of the program, the user will choose a single input to which the transformation will be applied, and a single output to which it will be sent.

The main window will consist of one section containing the available collections, and another section containing the "active" collections. The user drags collection icons in and out of the active space, and then clicks on an icon in the active space to specify how that collection should affect the sound. When the user clicks on an active collection, I'm envisioning a set of sliders, one per audio parameter (shimmer, etc.), each setting how much that parameter should be "influenced" by the collection.
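To make the slider idea concrete, here is a minimal sketch of how several active collections might be combined into a single target value for one quality. Everything in it is an assumption made for illustration: the `ActiveCollection` class, the `blended_target` function, the slider range of 0 to 1, and the blending rule itself are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class ActiveCollection:
    """Hypothetical record for one collection in the active space."""
    name: str
    # Quality name -> value measured from the collection's sounds (assumed).
    targets: dict
    # Quality name -> slider position in [0, 1] (assumed range).
    influence: dict = field(default_factory=dict)


def blended_target(quality, current, collections):
    """Return the value the input should be pulled toward for one quality.

    Assumed rule: take the slider-weighted average of the collections'
    targets, then move from the input's current value toward that average
    by the summed slider weight, capped at 1.
    """
    weighted = [
        (c.influence.get(quality, 0.0), c.targets[quality])
        for c in collections
        if quality in c.targets
    ]
    total = sum(w for w, _ in weighted)
    if total <= 0:
        return current  # no slider is raised, so leave the sound alone
    goal = sum(w * t for w, t in weighted) / total
    return current + min(total, 1.0) * (goal - current)
```

With one collection whose brightness target is 10.0 and whose brightness slider is at 0.5, an input measured at 4.0 would be pulled halfway, to 7.0. Capping the summed weight at 1 keeps several raised sliders from overshooting the collections' own values.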


Sound qualities will be applied by taking short-time FFTs of the incoming signal, and applying transformations that make each FFT more closely "match" the specified collection with respect to some particular quality of the sound. The matching will be performed using an algorithm that I'll be designing as part of my CS229 final project.
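As a rough illustration of the per-frame idea, the sketch below treats "brightness" as the spectral centroid of one frame and tilts the frame's spectrum so that its centroid moves toward a target value. This is a stand-in only: the matching algorithm is still to be designed, and the centroid measure, the tilt rule, and the names `dft_half`, `centroid`, and `nudge_brightness` are all assumptions. The naive DFT is there to keep the example self-contained; a real implementation would use an FFT library, windowing, and overlap-add resynthesis.

```python
import cmath
import math


def dft_half(frame):
    """Naive DFT of a real frame, returning bins 0..N/2 (FFT stand-in)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def centroid(spectrum):
    """Magnitude-weighted mean bin index, a crude 'brightness' measure."""
    mags = [abs(c) for c in spectrum]
    total = sum(mags)
    if total <= 0:
        return 0.0
    return sum(k * m for k, m in enumerate(mags)) / total


def nudge_brightness(spectrum, target_centroid, amount):
    """Tilt one frame's spectrum toward a target centroid (assumed rule).

    amount is in [0, 1]; 0 leaves the frame unchanged. Each bin k gets the
    gain (k / current_centroid) ** alpha, so bins above the current centroid
    are boosted when the target is brighter and cut when it is darker.
    Phases are left untouched.
    """
    current = centroid(spectrum)
    if current <= 0 or target_centroid <= 0:
        return list(spectrum)
    alpha = amount * math.log(target_centroid / current)
    # Bin 0 (DC) reuses the gain of bin 1 to avoid raising 0 to a negative power.
    return [c * (max(k, 1) / current) ** alpha for k, c in enumerate(spectrum)]
```

The tilt only moves the centroid in the right direction; it does not guarantee an exact match in one step, which is one reason a learned matching procedure would be preferable to this fixed rule.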


The first milestone is to decide which sound qualities should be available to be "extracted" from collections of existing sounds. Brightness/darkness will be one, "punchiness" of drum hits will be another, and....

The second milestone is to get a user interface up and running that feels intuitive to use.

The third milestone is to make the program JACK-aware.

The fourth milestone is to link the user interface with the machine-learning work I'll be doing in CS229, and to allow transformations to be applied in real time.