From CCRMA Wiki


This is the wiki page for Ravi Parikh and Keegan Poppen's Music 256A final project, Fall 09-10.


We wish to extend Assignment 2 into MIDI-controlled vocoder, harmonizer, and pitch-correction software. The user will be able to play MIDI notes and sing into a microphone simultaneously; depending on the mode, the output audio will be either pitch-corrected or vocoded to the MIDI notes being played. A GUI will control the parameters.


Neither of us is a very good singer, and in raw form our voices are one instrument that we can't use in compositions. Software that vocodes and auto-tunes voices already exists, but we want a deeper understanding of how it works at the lowest level. That way, we'll have as much control as possible over how our voices are processed. Our goal is not to create an Antares clone; rather, we want to cultivate our own sound and use it in future musical creations.

Software Architecture

Our project will be built in C++, leveraging the RtAudio and RtMidi libraries to make real-time audio and MIDI (significantly more) pain-free. There will be a basic class hierarchy that abstracts the various means of pitch detection and pitch modification we will be experimenting with over the course of the project. In terms of development, this structure lets us work iteratively on different parts of the system (once a first version is in place), improving and adding features to the final application. It also makes the interface easier to implement, since we will only need to change what is necessary to create the different interface elements required for the final project.

The interface for interacting with the application (to allow real-time control of the various parameters in the application) will be implemented in OpenGL.

The control flow will essentially be:

The application owns one MIDI/audio track object.

The user selects the track object type (vocoder, harmonizer, etc.).

Audio and MIDI are handled by a callback function in the track class.

Instance variables in the track class are set via the OpenGL interface.
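The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual design: the class names, the makeTrack factory, and the "mix" parameter are all hypothetical, and the DSP inside each process method is a stub. In the real application, process would presumably be called from the audio library's callback and noteOn from the MIDI callback; neither library is used here so that the sketch stays self-contained.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical base class for the single track object the application owns.
class Track {
public:
    virtual ~Track() = default;
    // Audio callback entry point: reads nFrames mono samples from `in`,
    // writes nFrames samples to `out`.
    virtual void process(const float* in, float* out, std::size_t nFrames) = 0;
    // MIDI entry points (one held note at a time, for simplicity).
    void noteOn(int midiNote) { heldNote_.store(midiNote); }
    void noteOff() { heldNote_.store(-1); }
    // Parameter set from the GUI thread; atomic so the audio thread can read it safely.
    void setMix(float mix) { mix_.store(mix); }

protected:
    std::atomic<int> heldNote_{-1};   // -1 means no note is held
    std::atomic<float> mix_{1.0f};
};

// Stub "vocoder": silent unless a MIDI note is held. Real vocoding is not implemented.
class VocoderTrack : public Track {
public:
    void process(const float* in, float* out, std::size_t nFrames) override {
        const float gain = (heldNote_.load() >= 0) ? mix_.load() : 0.0f;
        for (std::size_t i = 0; i < nFrames; ++i) out[i] = gain * in[i];
    }
};

// Stub "pitch correction": passes the input through, scaled by the mix parameter.
class PitchCorrectTrack : public Track {
public:
    void process(const float* in, float* out, std::size_t nFrames) override {
        const float gain = mix_.load();
        for (std::size_t i = 0; i < nFrames; ++i) out[i] = gain * in[i];
    }
};

// Hypothetical factory used when the user selects a track type.
// Returns nullptr for an unknown type name.
std::unique_ptr<Track> makeTrack(const std::string& type) {
    if (type == "vocoder") return std::unique_ptr<Track>(new VocoderTrack());
    if (type == "pitchcorrect") return std::unique_ptr<Track>(new PitchCorrectTrack());
    return nullptr;
}
```

One design point this glosses over: replacing the track object while the audio callback is running needs its own synchronization, which the sketch does not attempt.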

User Experience

Fundamentally, the user interface (outside of the use of a microphone and MIDI controller) allows the user to switch among several modes of the application, each with its own specific settings, in addition to some more general parameters. Since we are not constrained by the design considerations that exist when creating hardware, we will explore means of control beyond the usual sliders, knobs, dials, etc. One potential example would be a 2-D coordinate system in which the user can set two interrelated parameters at once. The interface will also give the user some conception of what effect the software (and potentially the individual parameters) is having on the audio signal by visualizing the changes in both the time and frequency domains.
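To make the 2-D control idea concrete, here is one way a mouse position inside a rectangular pad could be mapped to two parameters at once. The names (ParamRange, XYPad, fromPixel) and the choice of linear versus logarithmic scaling are assumptions made for illustration; the proposal does not specify them. The sketch also assumes the windowing layer reports y measured from the top of the pad, so it flips y so that "up" means a larger value.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical description of one controllable parameter.
struct ParamRange {
    float min;
    float max;
    bool logarithmic;  // true for frequency-like parameters
};

// Maps a normalized position t in [0, 1] onto the parameter range.
inline float mapToRange(float t, const ParamRange& r) {
    t = std::max(0.0f, std::min(1.0f, t));  // clamp out-of-pad positions
    if (r.logarithmic && r.min > 0.0f && r.max > 0.0f) {
        return r.min * std::pow(r.max / r.min, t);
    }
    return r.min + t * (r.max - r.min);
}

// Hypothetical 2-D pad: x controls one parameter, y controls another.
struct XYPad {
    ParamRange xRange;
    ParamRange yRange;

    struct Values {
        float x;
        float y;
    };

    // px, py: pixel position inside the pad, with py measured from the top.
    // width, height: pad size in pixels.
    Values fromPixel(int px, int py, int width, int height) const {
        const float nx = (width > 1)
            ? static_cast<float>(px) / static_cast<float>(width - 1) : 0.0f;
        const float ny = (height > 1)
            ? 1.0f - static_cast<float>(py) / static_cast<float>(height - 1) : 0.0f;
        return Values{mapToRange(nx, xRange), mapToRange(ny, yRange)};
    }
};
```

For example, a pad declared as XYPad pad{{0.0f, 1.0f, false}, {20.0f, 20000.0f, true}} would give a linear 0 to 1 control on the horizontal axis and a logarithmic 20 Hz to 20 kHz control on the vertical axis.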


Milestone 1: Closed loop

For milestone 1, we will have a fully functioning, closed-loop system that implements very basic versions (or at least a hello world) of each of the different aspects of our project: microphone input, MIDI input, pitch detection, pitch shifting, GUI, and real-time output.
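A "hello world" pitch detector for this milestone might look something like the sketch below, which uses plain time-domain autocorrelation. This is an assumption about what a first version could be, not the method the project commits to. The function names, the 50 to 1000 Hz default search range, and the 0.9 threshold are all illustrative choices; makeSine is only a helper for trying the detector out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Estimates the fundamental frequency of a mono buffer by autocorrelation.
// Returns 0 if the buffer is too short, silent, or has no clear peak.
float detectPitchAutocorr(const std::vector<float>& x, float sampleRate,
                          float minHz = 50.0f, float maxHz = 1000.0f) {
    const std::size_t n = x.size();
    if (n < 4 || sampleRate <= 0.0f || minHz <= 0.0f || maxHz <= minHz) return 0.0f;

    // Convert the frequency search range into a lag range (in samples).
    std::size_t minLag = static_cast<std::size_t>(sampleRate / maxHz);
    std::size_t maxLag = static_cast<std::size_t>(sampleRate / minHz);
    if (minLag < 1) minLag = 1;
    if (maxLag > n / 2) maxLag = n / 2;
    if (maxLag < minLag + 2) return 0.0f;

    // r[k] is the autocorrelation at lag (minLag + k), averaged over the overlap.
    std::vector<double> r(maxLag - minLag + 1, 0.0);
    double peak = 0.0;
    for (std::size_t lag = minLag; lag <= maxLag; ++lag) {
        double sum = 0.0;
        for (std::size_t i = 0; i + lag < n; ++i) {
            sum += static_cast<double>(x[i]) * static_cast<double>(x[i + lag]);
        }
        r[lag - minLag] = sum / static_cast<double>(n - lag);
        if (r[lag - minLag] > peak) peak = r[lag - minLag];
    }
    if (peak <= 1e-9) return 0.0f;  // silence or no positive correlation

    // Take the first local maximum close to the global peak. Preferring the
    // smallest such lag helps avoid reporting a pitch one octave too low.
    for (std::size_t k = 1; k + 1 < r.size(); ++k) {
        if (r[k] >= 0.9 * peak && r[k] > r[k - 1] && r[k] >= r[k + 1]) {
            return sampleRate / static_cast<float>(minLag + k);
        }
    }
    return 0.0f;
}

// Helper for experiments: n samples of a unit-amplitude sine at `hz`.
std::vector<float> makeSine(float hz, float sampleRate, std::size_t n) {
    const double twoPi = 6.283185307179586;
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<float>(std::sin(twoPi * hz * static_cast<double>(i) / sampleRate));
    }
    return out;
}
```

Calling detectPitchAutocorr(makeSine(100.0f, 8000.0f, 1024), 8000.0f) should return a value close to 100 Hz. The resolution is limited to whole-sample lags, so a real implementation would likely interpolate around the peak.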

Milestone 2: Incremental improvements / Algorithmic development

After we have a fully functioning system, we will begin to look into different algorithms for each of these pieces. With regard to pitch correction (the most complicated of the pieces), we will look into both time- and frequency-domain algorithms (including phase vocoding and PSOLA). We will also investigate different potential interfaces for giving the user real-time control of the different parameters.
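Whichever algorithm ends up doing the shifting, it will presumably be driven by the ratio between the target pitch (from the MIDI note) and the detected pitch of the voice. The sketch below shows that arithmetic only. It assumes twelve-tone equal temperament with A4 (MIDI note 69) at 440 Hz, and the function names are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Frequency in Hz of a MIDI note, assuming equal temperament and A4 = 440 Hz.
inline double midiToHz(int note) {
    return 440.0 * std::pow(2.0, (note - 69) / 12.0);
}

// Of the currently held MIDI notes, returns the one closest to the detected
// pitch, measured in semitones. Returns -1 if nothing usable is available.
inline int nearestHeldNote(double detectedHz, const std::vector<int>& heldNotes) {
    if (detectedHz <= 0.0 || heldNotes.empty()) return -1;
    const double detectedNote = 69.0 + 12.0 * std::log2(detectedHz / 440.0);
    int best = heldNotes.front();
    for (int n : heldNotes) {
        if (std::fabs(n - detectedNote) < std::fabs(best - detectedNote)) best = n;
    }
    return best;
}

// Ratio by which the voice must be shifted to land on targetNote.
// A ratio of 2.0 means one octave up; 1.0 means no change.
inline double shiftRatio(double detectedHz, int targetNote) {
    if (detectedHz <= 0.0 || targetNote < 0) return 1.0;
    return midiToHz(targetNote) / detectedHz;
}
```

For instance, a voice detected at 220 Hz while the player holds MIDI note 69 gives a ratio of 2.0, which a pitch shifter would then have to apply without changing the duration of the signal.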

Final Product

Between milestone 2 and the final product, we will finalize the user interface of the application, in addition to engaging in the general hackery that comes with finishing a final project.