Music in Motion is an interactive sound-art installation that uses the motion of balloons through a performance space to modulate and move the 3D placement of synthesized sounds in real time. MiM uses a webcam and a Max patch built on the cv.jit library to determine the location and color of balloons thrown by participants. The position data is sent from Max to Ableton Live, where it drives first-order ambisonic panning of the synthesizers (powered by Envelop for Live's Max and Max for Live patches) and modulates synth pitches, filter sweeps, and other effects in real time. Each balloon color is tied to a different note and timbre. In an ideal installation, MiM is tuned so that participants are clearly aware of how the motion of their balloons changes the sound in the space: a participant could throw a balloon and hear the perceived sound source of their balloon's instrument move away from them, in motion with the balloon.
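The spatial mapping at the heart of this pipeline can be sketched in a few lines. The function below is my own illustrative sketch (not Envelop for Live's actual API): it encodes a source direction into first-order ambisonic (B-format) gains, the same representation a first-order panner works with internally.

```python
import math

def encode_foa(azimuth_deg, elevation_deg):
    """Encode a source direction into first-order ambisonic
    (B-format) gains W, X, Y, Z, using the traditional 1/sqrt(2)
    weighting on the omnidirectional W channel."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = 1.0 / math.sqrt(2.0)          # omnidirectional component
    x = math.cos(az) * math.cos(el)   # front-back figure-eight
    y = math.sin(az) * math.cos(el)   # left-right figure-eight
    z = math.sin(el)                  # up-down figure-eight

    return (w, x, y, z)

# A balloon straight ahead at ear height:
w, x, y, z = encode_foa(0.0, 0.0)
```

A balloon's tracked (x, y, depth) position would first be converted to an azimuth and elevation relative to the listening area before encoding.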
With Music in Motion, I wanted to explore how users could physically interact with objects in a three-dimensional space to influence the spatial placement of sound in that space. Air-filled balloons seemed like an ideal interaction tool: they are brightly colored and easy to recognize with a webcam, they move slowly enough when thrown for users to hear the correlation between motion and changing sound, and they rise and fall smoothly, producing gradual changes in height that could be mapped to parameters such as filter sweeps or pitch (de)tuning.
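The height-to-parameter mappings described above could look something like the following sketch. The exponential cutoff curve and linear detune range are my own illustrative choices, not the installation's actual tuning:

```python
def height_to_cutoff(h, lo_hz=200.0, hi_hz=8000.0):
    """Map a normalized balloon height h in [0, 1] to a filter
    cutoff in Hz. An exponential curve makes the sweep sound
    perceptually even as the balloon floats up or down."""
    h = min(max(h, 0.0), 1.0)
    return lo_hz * (hi_hz / lo_hz) ** h

def height_to_detune(h, max_cents=50.0):
    """Map the same normalized height to a pitch offset in cents,
    centered (no detune) at h = 0.5."""
    h = min(max(h, 0.0), 1.0)
    return (h - 0.5) * 2.0 * max_cents
```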
Over the course of the project, my plans for its scope changed slightly. Initially, I thought I would use two identical webcams in conjunction to detect depth, but this would have been both a programming and a tuning nightmare, since small changes to their relative positioning could easily throw off depth calculations. Given my relatively limited experience with computer vision techniques, I decided instead to use the apparent size of each balloon on the video feed to roughly estimate its depth (distance from the camera). I also originally planned to write a Python script using the SimpleCV library for the computer vision component, but I could not get SimpleCV to install on my macOS laptop after several hours of troubleshooting. Additionally, I could not find a good way to transmit MIDI CC data from Python, and I did not want to send the data to Max/Ableton through serial messages. Instead, I opted for the cv.jit Jitter/Max library, as I already had experience with it and knew a technique for sending data between Max and Ableton via MIDI CC. I wasn't sure at first which ambisonic panning software or plugin to use, but I settled on the Envelop for Live beta (Max for Live and Max patches), since it integrated easily with my existing Ableton + Max setup.
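The size-based depth guess and the MIDI CC scaling can both be sketched simply. This is a hypothetical illustration of the idea, not the actual Max patch logic; the balloon diameter and focal length values are made-up examples:

```python
def estimate_depth(pixel_diameter, real_diameter_m=0.28,
                   focal_length_px=600.0):
    """Rough pinhole-camera depth estimate: an object of known real
    size that appears smaller on the video feed is farther away.
    Returns an approximate distance from the camera in meters."""
    return real_diameter_m * focal_length_px / pixel_diameter

def to_midi_cc(value, lo, hi):
    """Scale a tracked value in the range [lo, hi] to a 0-127
    MIDI CC value, clamping out-of-range input."""
    t = (value - lo) / (hi - lo)
    return max(0, min(127, round(t * 127)))
```

A single calibration measurement (balloon at a known distance, noting its on-screen diameter) would be enough to fix the effective focal length for a given webcam.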
- MacBook Pro (running previously listed software)
- 4 output audio interface (Komplete Audio 6)
- Logitech wide-angle webcam (mounted 6.5-8ft high using stand)
- 4 loudspeakers (and 4 stands positioned in a ~12’x12’ square configuration)
- 5 balloons (green, blue, purple, red, and yellow)
Installation at Bing
On the first day of the installation at Bing, I ran into several issues with my chroma-key color filtering, the Envelop for Live software, and Jack. My previous testing of Music in Motion's webcam motion tracking took place in the consistently lit rooms of CCRMA's studios, and I did not anticipate having to adjust for the constantly shifting light of the setting sun filtering through the trees outside Bing's windows. The shadows cast across the performance space forced me to retune each individual color filter every few minutes; otherwise my webcam Max patch would not recognize the balloons as being the correct size, or would not recognize them at all. I also ran into a lot of headaches patching between Ableton, the Envelop server Max patch, and my audio interface using Jack. This was my first time working with Jack on a non-Linux system, and I did not anticipate running into as many crashes and performance hitches on installation day. At one point only two speakers were outputting audio, even though all four were connected to my interface and all the Jack connections were correct; I later learned this was because I needed to switch Envelop's decoder setting from binaural to quad. I had done all of my testing on headphones up to that point and had assumed Envelop would change decoder modes contextually based on the number of outputs it was connected to in Jack.
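The binaural-versus-quad distinction comes down to the decoder stage: the same B-format signal can be rendered to two headphone channels or to four speaker feeds. The sketch below is a deliberately simple horizontal-only quad decode of my own (real decoders, including Envelop's, are more sophisticated), just to show how four speaker gains fall out of the W/X/Y components:

```python
import math

# Speaker azimuths for a square quad layout, in degrees
# (front-left, front-right, rear-left, rear-right),
# with positive azimuth to the listener's left.
QUAD_AZIMUTHS = (45.0, -45.0, 135.0, -135.0)

def decode_quad(w, x, y):
    """Basic 'sampling' decode of horizontal first-order B-format
    (w, x, y) to four speaker gains: each speaker gets the signal
    the soundfield carries in its direction."""
    gains = []
    for az_deg in QUAD_AZIMUTHS:
        az = math.radians(az_deg)
        gains.append(0.5 * (math.sqrt(2.0) * w
                            + x * math.cos(az)
                            + y * math.sin(az)))
    return gains
```

For a source straight ahead, this assigns equal, larger gains to the two front speakers and equal, smaller gains to the two rear ones, which is the behavior a quad decoder should restore once it is no longer stuck in binaural mode.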
By the second day of the installation, I had fixed the Envelop decoder issue and had my Jack connections running smoothly. Since balloon recognition was still spotty given the setting sun and tree shadows, I had to raise the chroma-key threshold values for each color and could only run two balloons at once (red and blue) without interference between the instruments. The spatialization and motion tracking of the balloons, however, worked better than I expected: I could clearly hear the sound source of each balloon track its motion as I tossed it away from me. I regret not being able to run the installation with more balloon colors at once, but I was still satisfied with the two-balloon setup I presented (unfortunately, I forgot to record video of the installation on this day).
Binaural Demonstration Video (watch with headphones!)