Experiment 1 (Final) | Drum Kit
POV: you're at your friend's house, they just got a drum kit, and they're letting you try it out. I made a silly little drum kit thingy that lets you click on different parts of a drum kit image and play the appropriate sound. I used a library of drum kit sounds I found online and matched each kind of drum (hi-hat, tom, snare, kick, etc.) to its region of the image by training a Wekinator classifier that takes the mouse position on click as input and classifies it into one of 8 drum types. I trained each drum on ~20 samples. Given more time, I think it would have been cool to route the OSC input through JavaScript onClick events, and maybe run this on a tablet with a touch screen so I could do real-time drumming with finger taps, since it's pretty difficult to click the drum you want at a fast pace. But overall I had fun with it!
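For reference, here's roughly the shape of the ChucK side: a minimal sketch rather than the exact script I used, with placeholder sample filenames, assuming Wekinator's default output port (12000) and address (/wek/outputs) and drum classes numbered 1 through 8.

```chuck
// Drum kit player: listen for Wekinator's classifier output over OSC
// and retrigger the corresponding drum sample.
// Assumptions: default Wekinator output port/address, classes 1-8,
// and placeholder sample filenames.

// one sample per drum class (placeholder filenames)
["hihat.wav", "snare.wav", "kick.wav", "tom_hi.wav",
 "tom_mid.wav", "tom_low.wav", "crash.wav", "ride.wav"] @=> string files[];
files.size() => int N;
SndBuf bufs[N];
for (0 => int i; i < N; i++) {
    bufs[i] => dac;
    me.dir() + files[i] => bufs[i].read;
    bufs[i].samples() => bufs[i].pos;   // park at the end so nothing plays on load
}

// OSC receiver for Wekinator's output
OscIn oin;
OscMsg msg;
12000 => oin.port;
oin.addAddress("/wek/outputs, f");

while (true) {
    oin => now;                           // block until a message arrives
    while (oin.recv(msg)) {
        msg.getFloat(0) $ int => int cls; // classifier label, assumed 1..8
        if (cls >= 1 && cls <= N) {
            0 => bufs[cls - 1].pos;       // rewind => trigger the sample
        }
    }
}
```

The Processing side then only needs to send the click's (x, y) to Wekinator's input whenever the mouse is pressed (by default Wekinator listens on port 6448 at /wek/inputs).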
Here's a demo of the current system:
Experiment 2 (Final) | Guitar Strumming
Since the other two experiments (the drum kit above and the milestone DJ below) were largely based on classifying discrete input events, I wanted to try a more continuous input space. I designed a little guitar tool that lets you click and drag to simulate "strumming" a string on a guitar and play the appropriate sound. For the sounds, I recorded my actual guitar: each string played individually, plus strums of all the strings front-to-back and back-to-front. The result was a fun little program that reminded me of the first time I picked up a guitar and just aimlessly plucked all the strings to see what they sounded like. I was surprised at how well the DTW worked (with just 10 sample gestures per string), but I definitely think it would have been way cooler to make this more performative (e.g. have it play real chords, with the strings as individual notes of those chords) so I could play some real melodies.
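The ChucK side follows the same receiver pattern as the drum kit, except the incoming value is the matched DTW gesture rather than a classifier label. This is a rough sketch under some assumptions: the filenames are placeholders, I'm assuming gestures 1-6 map to the individual strings and 7-8 to the two strum directions, and the port/address shown are just Wekinator's classifier defaults, so you'd adjust them to whatever the DTW build of Wekinator actually sends.

```chuck
// Guitar player: same OSC receiver pattern, but the incoming value is
// the matched DTW gesture index rather than a classifier label.
// Assumed mapping (placeholder filenames): gestures 1-6 = individual strings
// (low E to high e), 7 = downstroke strum, 8 = upstroke strum.
["string_E.wav", "string_A.wav", "string_D.wav", "string_G.wav",
 "string_B.wav", "string_e.wav", "strum_down.wav", "strum_up.wav"] @=> string files[];
files.size() => int N;
SndBuf bufs[N];
for (0 => int i; i < N; i++) {
    bufs[i] => dac;
    me.dir() + files[i] => bufs[i].read;
    bufs[i].samples() => bufs[i].pos;     // silent until triggered
}

// NOTE: port/address below are Wekinator's classifier defaults; the DTW build
// may report matches differently, so adjust to what it actually sends.
OscIn oin;
OscMsg msg;
12000 => oin.port;
oin.addAddress("/wek/outputs, f");

while (true) {
    oin => now;
    while (oin.recv(msg)) {
        msg.getFloat(0) $ int => int gesture; // matched gesture, assumed 1-based
        if (gesture >= 1 && gesture <= N) {
            0 => bufs[gesture - 1].pos;       // retrigger the matched recording
        }
    }
}
```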
Here's a demo of the current system:
Experiment 3 (Milestone) | Personalized Bad Bunny DJ
I made a (very bad) DJ that plays Bad Bunny songs depending on my mood. To implement this, I used a facial emotion recognition library I found for Processing. In Wekinator, I set the input to be the index of the emotion (from ["happiness", "sadness", "surprise", "neutral"]) as classified from webcam images of my face. I made the output a single variable as well, representing one of 4 songs I selected ("Enséñame a Bailar", "Moscow Mule", "Neverita", "Tití Me Preguntó") that I thought would pair well with the different moods. In ChucK, I wrote a script that plays the appropriate audio file based on the detected emotion. After training the system on 100 paired examples per emotion/song, I deployed it. Overall, it was a fun process, and I was surprised at how well it worked for being so simple. I definitely want to explore more complex input/output spaces in the other experiments.
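The ChucK script for this is mostly bookkeeping: keep the four songs loaded, and when the detected emotion changes, mute whatever is playing and start the new track. Here's a minimal sketch along those lines, with placeholder filenames and assuming the emotion class arrives as a single float (1-4) on Wekinator's default port 12000 at /wek/outputs.

```chuck
// Bad Bunny DJ: switch songs when the detected emotion changes.
// Placeholder filenames; assumes the class arrives as one float (1-4)
// on Wekinator's default output port/address.
["ensename_a_bailar.wav", "moscow_mule.wav",
 "neverita.wav", "titi_me_pregunto.wav"] @=> string songs[];
songs.size() => int N;
SndBuf players[N];
for (0 => int i; i < N; i++) {
    players[i] => dac;
    me.dir() + songs[i] => players[i].read;
    0 => players[i].gain;                      // muted until selected
    players[i].samples() => players[i].pos;    // and parked at the end
}

OscIn oin;
OscMsg msg;
12000 => oin.port;
oin.addAddress("/wek/outputs, f");

-1 => int current;
while (true) {
    oin => now;
    while (oin.recv(msg)) {
        (msg.getFloat(0) $ int) - 1 => int cls;  // classes 1..4 -> indices 0..3
        // the classifier reports continuously, so only act when the emotion changes
        if (cls != current && cls >= 0 && cls < N) {
            if (current >= 0) { 0 => players[current].gain; }  // mute the old track
            0 => players[cls].pos;                             // restart the new one
            1 => players[cls].gain;
            cls => current;
        }
    }
}
```

The change check is what keeps the song from restarting on every webcam frame, since the emotion classifier streams a prediction continuously while Wekinator is running.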
Here's a demo of the current system:
Reflection
I really enjoyed this assignment! I feel like up until now most of our projects have focused on using features of ChucK to build systems, but Wekinator opened up a lot of new possibilities for making these systems much more interactive, which was cool. I was pleasantly surprised by the Wekinator interface and how easy and intuitive it was to train on whatever inputs/outputs you have. I particularly liked how flexible the interface for input features and outputs was, and the most fun aspect was getting to play around with Dynamic Time Warping, which gave a lot of fine-grained control over the input space. The fact that it was so modular made it easy to experiment with both inputs and outputs and to compose them together.

That modularity was actually the most interesting aspect of the assignment to me, since I'm used to frameworks like PyTorch/TensorFlow for any kind of learning-based application, and I've always felt that their training interfaces aren't great for interactive settings where you just want to prototype things on the fly and quickly see how they would work. It would have been a pain to collect an offline dataset of input OSC data and output classification or regression labels, wait an hour or two for a neural network to train, and then try it out, only to find a serious bug :). In that sense, I really appreciated how quickly we were able to iterate on proofs of concept with Wekinator.

Given more time, I would have liked to dive into how Wekinator works behind the scenes and see what kinds of models it fits to the data. We got a glimpse of this while playing around with things like the match threshold for dynamic time warping, but I'm curious what algorithms are being used and how easy they are to fine-tune. I also feel that my projects were fun and turned out close to how I envisioned them, but I would've liked to make them more performative in some way. Especially for the guitar and drum kit, the results felt a little tacky, and I would've liked those applications to feel more like a performance than a functional demo.
Code / Acknowledgements
The code for running experiments 1-3 is here.
I used ChatGPT to figure out how to load images in Processing and handle some of the mouse click events. I used this library for Experiment 3, and this website to download the drum kit sounds.