Music 220C (Spring 2006)
Research Seminar in Computer-Generated Music
James Lin


Project: IVACS - Interactive Video/Audio Control System

Description:
The idea behind this project is to create an interactive interface that can be used to easily control audio and video functions. The initial concept involves using real-time video tracking to pick up certain signals provided by an individual; these signals in turn trigger some function in the video or audio track being output. Once this is accomplished, the goal is to explore the various additional control and function possibilities of the project.

Original Project Proposal: Reactive Music Generation for Video Images
Description:
My original idea involved generating music that responds and reacts to video images. The hope was to create a process that could analyze a video feed, obtain data about the video such as RGB color values, and use those parameters to output music as MIDI. An additional goal was for this to happen in near real time, so that the music would be generated as a viewer watched the video being analyzed.
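This version was never built, but the pipeline it describes is easy to sketch. The snippet below uses OpenCV and mido, which are my own illustrative choices (the proposal never named specific tools), and the red-to-pitch / green-to-velocity mapping is just one arbitrary possibility.

```python
# Sketch of the original (unbuilt) proposal: sample a video feed, reduce each
# frame to average RGB values, and map those values to MIDI notes.
# OpenCV and mido are illustrative choices, not tools named in the proposal.
import time
import cv2
import mido

cap = cv2.VideoCapture(0)          # any video source would do
port = mido.open_output()          # default MIDI output port

try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        b, g, r = frame.mean(axis=(0, 1))        # average blue/green/red
        note = 36 + int(r / 255 * 48)            # red level -> pitch (C2..C6)
        velocity = int(g / 255 * 127)            # green level -> loudness
        port.send(mido.Message('note_on', note=note, velocity=velocity))
        time.sleep(0.2)                          # roughly five notes per second
        port.send(mido.Message('note_off', note=note))
finally:
    cap.release()
    port.close()
```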

---------------------------------

One of the inspirations behind this project is a program created by Lauri Gröhn called Synestesia Software Music (website: http://www.synestesia.com/). The program generates music from pictures (Gröhn claims the process takes only 5 seconds). A brief explanation of how it works can be found on his website, but basically the software uses a complex picture-filtering algorithm to create enough parameters for music generation. For example, a source picture is converted into a filtered image, and that converted image is the "score" that the Synestesia software uses to generate music.
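Gröhn has not published the filtering algorithm, so the following is only a toy illustration of the general "image as score" idea, not his method: scan the converted picture column by column and let the brightest pixel in each column pick a pitch. The file name and mappings are made up for the example.

```python
# Toy "image as score" reader -- NOT Groehn's actual algorithm, just an
# illustration of turning pixel positions into notes, read left to right.
import time
import cv2
import mido

img = cv2.imread('score.png', cv2.IMREAD_GRAYSCALE)   # hypothetical image file
port = mido.open_output()
height, width = img.shape

for x in range(0, width, 4):             # every 4th pixel column is one beat
    column = img[:, x]
    y = int(column.argmax())             # row of the brightest pixel
    if column[y] > 200:                  # dark columns are treated as rests
        pitch = 96 - int(y / height * 48)    # higher in the image = higher pitch
        port.send(mido.Message('note_on', note=pitch, velocity=80))
        time.sleep(0.1)
        port.send(mido.Message('note_off', note=pitch))
    else:
        time.sleep(0.1)                  # rest
port.close()
```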

Examples
There are many examples of music generated by the Synestesia software on the author's website. I've highlighted a couple of them below, including both an MP3 example and a MIDI example.

Music Example [mp3 link]

MIDI Example [midi link]

For my project, I decided to use Max/MSP, Jitter, and a Max object called Cyclops (all available at http://www.cycling74.com). Cyclops handles live video input; in my case, I used it to receive and analyze video data coming from a basic webcam. The input video was split into a number of zones (defined by the user), and each zone was then set up to be analyzed by Cyclops in a variety of ways. For my project, I mapped out nine zones, with each zone set up to provide a grey value (from 0 to 255) of the current live video image in that zone. This was done so that when a particular zone reached a grey value of zero (meaning the entire zone was black), it would trigger some function.
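Cyclops does this zone analysis inside the Max patch; the sketch below is a rough Python/OpenCV equivalent, included only to make the logic concrete. The 3x3 grid and the "fire when a zone goes black" rule mirror my setup, but the threshold value and function names are assumptions.

```python
# Rough Python/OpenCV stand-in for the Cyclops zone analysis in the Max patch:
# split the live frame into a 3x3 grid, average each zone to a grey value
# (0-255), and fire a trigger number (1-9) when a zone goes essentially black.
import cv2

GRID = 3            # 3 x 3 = nine zones, as in my patch
THRESHOLD = 10      # "near zero" grey value counts as a fully covered zone

def zone_triggers(frame):
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    h, w = grey.shape
    triggers = []
    for row in range(GRID):
        for col in range(GRID):
            zone = grey[row * h // GRID:(row + 1) * h // GRID,
                        col * w // GRID:(col + 1) * w // GRID]
            if zone.mean() < THRESHOLD:                # zone is (almost) all black
                triggers.append(row * GRID + col + 1)  # zones numbered 1-9
    return triggers

cap = cv2.VideoCapture(0)                              # webcam, as in the demo
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for zone in zone_triggers(frame):
        print('trigger zone', zone)                    # would route to Jitter here
    cv2.imshow('live', frame)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break
cap.release()
```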

This is where Jitter comes in: I used Jitter for most of my video/audio processing and manipulation. I set it up to play the actual video track that would be controlled by my Cyclops triggers. So, to clarify, there were two separate video windows: the first was the live video window handled by Cyclops, and the second was the video track window handled by Jitter. In my project, I experimented with various music videos to test out my program.

My project involved two separate demos. The first dealt with manipulating a single music video. The nine zones set up in Cyclops provided the following functions (a small dispatch sketch follows the list):
1) pause
2) start
3) muted volume
4) half volume
5) full volume
6) fast forward
7) rewind
8) skip back 5 seconds
9) skip ahead 5 seconds.
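In the patch, this routing is just Max connections from the Cyclops triggers to Jitter messages; the table below restates the same mapping in Python, with a hypothetical player object standing in for the Jitter movie playback. None of these method names come from the real patch.

```python
# Hypothetical zone-to-action table for the first demo. "player" stands in for
# the Jitter movie playback; the method names are invented for illustration.
ZONE_ACTIONS = {
    1: lambda player: player.pause(),
    2: lambda player: player.start(),
    3: lambda player: player.set_volume(0.0),       # muted volume
    4: lambda player: player.set_volume(0.5),       # half volume
    5: lambda player: player.set_volume(1.0),       # full volume
    6: lambda player: player.set_rate(2.0),         # fast forward
    7: lambda player: player.set_rate(-1.0),        # rewind
    8: lambda player: player.seek_relative(-5.0),   # skip back 5 seconds
    9: lambda player: player.seek_relative(+5.0),   # skip ahead 5 seconds
}

def on_trigger(zone, player):
    """Called whenever a Cyclops zone goes black."""
    action = ZONE_ACTIONS.get(zone)
    if action is not None:
        action(player)
```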

The second demo dealt with being able to mix two music videos together like a DJ would. The functions in this program were:
1) pause
2) start
3) fade from video 1 to video 2
4) fade from video 2 to video 1
5) set fade time to 0 seconds
6) set fade time to 2 seconds
7) activate effect 1
8) activate effect 2
9) remove all video effects.

Since Jitter has extensive video manipulation capabilities, the possibilities for the two effect slots are numerous. For my demo, I used Jitter's scissors patch (which splits the video into four smaller images) and the color table patch (which affects the colorization of the video). Jitter also provided the functionality to fade from one video to another. After a lot of tweaking, I was able to get my Cyclops functions and my Jitter functions to work together pretty well. Essentially, I set it up so that triggering a zone in Cyclops would route to one of my Jitter functions, thus affecting the music video in some fashion.
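As a rough illustration of the second demo's fade, the snippet below does the equivalent weighted blend in Python/OpenCV: a fade value of 0.0 shows only video 1, 1.0 shows only video 2, and the fade time controls how fast the value ramps (0 seconds being an instant cut). This is only an analogy for what the Jitter patch does, not code taken from the patch.

```python
# Crossfade between two movie frames, analogous to the fade used in demo two.
# A fade of 0.0 is all video 1, 1.0 is all video 2.
import cv2

def crossfade(frame1, frame2, fade):
    """Blend two equally sized frames with the given fade amount (0.0-1.0)."""
    return cv2.addWeighted(frame1, 1.0 - fade, frame2, fade, 0.0)

def step_fade(fade, target, fade_time_s, frame_dt):
    """Advance fade toward target; a fade time of 0 seconds is an instant cut."""
    if fade_time_s <= 0:
        return target
    step = frame_dt / fade_time_s
    if target > fade:
        return min(target, fade + step)
    return max(target, fade - step)
```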

To demo my program, I used black gloves to trigger the zones being analyzed by Cyclops. Once the webcam was started up and the music video began playing, I would wave my hands around, covering specific zones to activate specific functions in the music video. If someone were to get good at it, the movements made to trigger certain zones could become a performance in themselves.