From CCRMA Wiki



Sound Doodle


The main inspiration for my project came from the observation that both light and sound are made up of waves. Just as the wavelength of a sound wave affects the pitch that we hear, the wavelength of light affects the color that we see. My goal was to create a kind of “translation system” that would convert visual signals into audio. Specifically, because we hold a large amount of visual information on our computers as digital photos, Sound Doodle is a system for visually exploring the musical elements embedded in digital images.

The key takeaway of the Sound Doodle experience is that every part of the composition is represented visually. In Sound Doodle, each digital image is like an instrument, where each pixel is a different note. The color of each pixel is mapped to a note on a major scale. The hue (color, wavelength) of the pixel represents the note letter, the value (lightness vs. darkness) of the pixel corresponds to the octave, and the saturation (vividness vs. paleness) of the pixel corresponds to the volume. Therefore, each image has a unique spatial distribution of notes. To play, the user draws on the image, leaving a visual record of the entire song. At the end of the creation process, the final image displays both the score that was composed and the visual instrument upon which it was created.
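The color-to-note mapping described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the choice of C major, the seven equal hue bins, the octave range, and the rule that lighter pixels map to higher octaves are all assumptions made for the example, and `pixel_to_note` is a hypothetical name.

```python
import colorsys

# Assumed scale and range; the text only says "a major scale".
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]
MIN_OCTAVE, MAX_OCTAVE = 2, 6  # hypothetical octave range


def pixel_to_note(r, g, b):
    """Map an 8-bit RGB pixel to (note letter, octave, volume)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Hue in [0, 1) selects one of the seven scale degrees (equal-width bins).
    degree = min(int(h * len(C_MAJOR)), len(C_MAJOR) - 1)
    # Value in [0, 1] selects an octave; lighter means higher (an assumption).
    octaves = MAX_OCTAVE - MIN_OCTAVE + 1
    octave = MIN_OCTAVE + min(int(v * octaves), octaves - 1)
    # Saturation in [0, 1] is used directly as the volume.
    return C_MAJOR[degree], octave, s
```

Under these assumptions, a pure red pixel becomes a loud C in the top octave, while a black pixel becomes a silent C in the bottom octave.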


  • The first step in using Sound Doodle is to load an image (File->Load Image or Ctrl+O)
  • Notes are added by drawing lines on the screen. The longer the line, the longer the note is sustained. The length of a single “unit” is shown in the bottom left corner. The length of each line is always rounded to the nearest whole unit.
  • The duration of a single “unit” can be selected from the “Interval” dropdown on the right-hand-side. Units are shown as fractions of a whole beat.
  • This allows you to create shorter notes without having to draw ridiculously short lines.
    • For instance, a note drawn with interval “1/1” and length of 1 unit will be sustained for exactly 1 beat.
    • However, a line of the same length with the interval “1/2” will be sustained for half a beat.
    • The size of the unit is reflected in the width of the line.
  • Multiple notes can be played simultaneously by drawing notes in different “tracks” by using different pen colors, selected from the “Tracks” dropdown on the right.
  • After each note is played, regions in the image are highlighted to show notes that belong to the same chord as, but are different from, the previously drawn note. These suggestions can help guide your composition.
  • Finally, the composition can be played using the “Play” button. Notes on each track are played back-to-back in the order they were drawn, with all tracks played over each other simultaneously.
  • The hue (note) and value (octave) windows on the left can help guide your creative process.
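The relationship between line length, the “Interval” setting, and note duration can be sketched as below. The on-screen unit length `UNIT_PIXELS`, the function name `stroke_beats`, and the rule that a drawn line always counts as at least one unit are assumptions for illustration only.

```python
from fractions import Fraction

UNIT_PIXELS = 40  # hypothetical on-screen length of one unit, in pixels


def stroke_beats(length_px, interval):
    """Return how long a stroke is sustained, in beats.

    `interval` is the dropdown value as a fraction of a beat,
    for example Fraction(1, 2) for "1/2".
    """
    # Round to the nearest whole unit; assume a stroke is never zero units.
    units = max(1, int(length_px / UNIT_PIXELS + 0.5))
    return units * interval
```

With these values, a 40-pixel line lasts one beat at interval “1/1” and half a beat at interval “1/2”, matching the example in the list above.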

System Design

The main program flow is as follows. First, the user loads the image, and image analysis is performed to convert the image into HSV space. Then the user interacts with the GUI to draw on the image. After each stroke, the NoteGenerator gives suggestions for other Notes and creates an image mask to display suggested locations. After all the drawing is done, the notes are generated and sent to the Sequencer to be played.

Image Analysis

OpenCV is used to perform the color-space conversions from RGB to HSV. This converted image is used by the NoteGenerator to map pixel values to notes. This conversion also allows the generation of the hue and value preview images.
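One plausible way to derive the hue and value previews from the converted image is sketched here, one pixel at a time. It uses Python's standard `colorsys` module in place of OpenCV so that the example is self-contained; `hue_preview` and `value_preview` are hypothetical names, and rendering the hue at full saturation and value is an assumption about how the preview looks.

```python
import colorsys


def hue_preview(r, g, b):
    """Return the pixel's hue at full saturation and value, as 8-bit RGB."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    pr, pg, pb = colorsys.hsv_to_rgb(h, 1.0, 1.0)
    return round(pr * 255), round(pg * 255), round(pb * 255)


def value_preview(r, g, b):
    """Return the pixel's value channel as a gray 8-bit RGB triple."""
    _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    gray = round(v * 255)
    return gray, gray, gray
```

Applied to every pixel, the first function yields an image showing only which note each region plays, and the second shows only which octave.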


The Qt GUI library is used to display everything and handle user input. When the user draws a stroke, the main widget keeps track of the total length of the stroke, the starting color of the stroke, as well as the selected track and interval settings. When “Play” is pressed, this information is used to create Note objects, which store their start time, duration, and pitch. These Note objects are sent to the Sequencer, which is responsible for placing the Notes in order and generating the final samples.
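The step from recorded strokes to Note objects might look something like the sketch below. The `Stroke` record and the `strokes_to_notes` function are hypothetical; only the Note fields (start time, duration, pitch) and the back-to-back ordering within each track come from the description above. Start times are assigned here for simplicity, although in the described system the Sequencer is responsible for ordering.

```python
from dataclasses import dataclass


@dataclass
class Stroke:           # hypothetical record of one drawn line
    track: int
    beats: float        # duration in beats, after interval scaling
    pitch: str          # e.g. "C4", taken from the stroke's starting color


@dataclass
class Note:
    start: float        # beats since the beginning of playback
    duration: float     # beats
    pitch: str
    track: int


def strokes_to_notes(strokes):
    """Lay out each track's strokes back-to-back in drawing order."""
    next_start = {}     # track -> start time of that track's next note
    notes = []
    for s in strokes:
        start = next_start.get(s.track, 0.0)
        notes.append(Note(start, s.beats, s.pitch, s.track))
        next_start[s.track] = start + s.beats
    return notes
```

Because each track keeps its own running start time, notes on different tracks overlap in time while notes on the same track never do.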

Note Suggestion

The NoteGenerator holds information about scales and the frequencies of each note. When a user completes a stroke, the NoteGenerator takes the previous note and creates a set of suggested notes to follow up, based on the set of chords the previous note may have been a part of.
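As a rough illustration of chord-based suggestion, the sketch below assumes the chords are the seven diatonic triads of C major; the actual chord set used by the NoteGenerator is not specified here, and the function names are hypothetical.

```python
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]  # assumed scale


def diatonic_triads(scale):
    """Build one triad on every scale degree by stacking thirds."""
    n = len(scale)
    return [(scale[i], scale[(i + 2) % n], scale[(i + 4) % n]) for i in range(n)]


def suggest_notes(previous, scale=C_MAJOR):
    """Return the notes that share a triad with `previous`, excluding it."""
    suggestions = set()
    for chord in diatonic_triads(scale):
        if previous in chord:
            suggestions.update(chord)
    suggestions.discard(previous)
    # Report suggestions in scale order for a stable result.
    return sorted(suggestions, key=scale.index)
```

For example, `suggest_notes("C")` gathers the other members of the C, F, and A triads. Pixels whose hue maps to one of the returned letters would then be included in the highlight mask.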


When notes are generated, they are sent to the Sequencer to be placed in order. The Sequencer also keeps track of the time elapsed since the beginning of playback, and it generates audio samples in response to the RtAudio audio callbacks.
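The Sequencer's role can be sketched as a class that tracks elapsed frames and fills each requested buffer by mixing whichever notes are active. This is only an outline under stated assumptions: the `fill` method stands in for the audio callback, the sample rate is assumed, notes are plain tuples, and a sine tone replaces the per-note synthesizer.

```python
import math

SAMPLE_RATE = 44100  # assumed output rate


class Sequencer:
    def __init__(self, notes):
        # notes: list of (start_sec, duration_sec, frequency_hz) tuples
        self.notes = notes
        self.frame = 0  # frames elapsed since playback began

    def fill(self, n_frames):
        """Return the next n_frames mono samples, as a callback would request."""
        out = []
        for _ in range(n_frames):
            t = self.frame / SAMPLE_RATE
            sample = 0.0
            for start, dur, freq in self.notes:
                if start <= t < start + dur:
                    # Placeholder tone; stands in for the note's own synthesizer.
                    sample += math.sin(2 * math.pi * freq * (t - start))
            out.append(sample)
            self.frame += 1
        return out
```

Keeping the frame counter inside the Sequencer means successive calls continue where the previous buffer ended, which is what a callback-driven audio API requires.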


Each Note object contains its own synthesizer, which incrementally generates samples for the Note. Synthesis is done using the STK library, and the synthesizer is essentially a wrapper around an STK Instrument.
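Incremental, per-note sample generation can be illustrated with the simplified sketch below. `SineSynth` is a stand-in for the wrapped STK instrument and does not reproduce STK's API; `SynthNote`, its fields, and the assumed sample rate are likewise hypothetical.

```python
import math


class SineSynth:
    """Stand-in for a wrapped instrument: one sample per tick() call."""

    def __init__(self, frequency, amplitude=1.0, sample_rate=44100):
        self.amplitude = amplitude
        self.phase = 0.0
        # Phase advance per sample for the requested frequency.
        self.step = 2 * math.pi * frequency / sample_rate

    def tick(self):
        sample = self.amplitude * math.sin(self.phase)
        self.phase = (self.phase + self.step) % (2 * math.pi)
        return sample


class SynthNote:
    def __init__(self, frequency, duration_frames, volume=1.0):
        self.remaining = duration_frames  # frames left to generate
        self.synth = SineSynth(frequency, volume)

    def next_samples(self, n):
        """Generate up to n more samples; returns fewer once the note ends."""
        count = min(n, self.remaining)
        self.remaining -= count
        return [self.synth.tick() for _ in range(count)]
```

Because the synthesizer keeps its own phase between calls, a note can be rendered across several audio buffers without clicks at the buffer boundaries.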
