For this project, I experiment with audio features for genre classification and use ChucK to build an audio mosaic tool. The tool uses real-time audio feature extraction to map the features of an input sound to the most similar features extracted from a library of sounds. This is based on the CS 470 Programming Project 2 assignment.
I experimented with the audio features available in ChucK and evaluated each configuration with 5-fold cross-validation. Random classification performance was 0.1. Of the features that produce a single value (ZeroX, Kurtosis, Flux, Centroid, RMS, RollOff), ZeroX alone performed worst, with an average score of 0.1013 that was barely above random. RollOff performed best, with an average score of 0.1925, followed by RMS (0.1757). Centroid (0.1615), Flux (0.1574), and Kurtosis (0.1435) scored lower.
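The evaluation procedure above can be sketched roughly as follows. This is an illustrative Python sketch, not the ChucK code used for the project: the k-nearest-neighbor classifier, the function names, and the way examples are assigned to folds are all assumptions made for the example.

```python
import random
from collections import Counter


def chance_level(labels):
    """Expected score of random guessing with equally likely classes."""
    return 1 / len(set(labels))


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (feature_vector, label) pairs. The classifier
    choice is an assumption for illustration.
    """
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_validate(data, folds=5, k=3, seed=0):
    """Mean score over `folds` splits, each held out once for testing."""
    data = list(data)
    random.Random(seed).shuffle(data)
    scores = []
    for i in range(folds):
        # Every `folds`-th example (starting at i) forms the test split.
        test = data[i::folds]
        train = [d for j, d in enumerate(data) if j % folds != i]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / folds
```

With ten equally likely classes, `chance_level` gives 0.1, which matches the random baseline quoted above. A single-value feature such as RollOff would be passed in as a one-element feature vector.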
I also experimented with multi-dimensional features (Chroma, SFM, MFCCs) and with combinations of single-value and multi-dimensional features. MFCCs alone led to high classification performance: after tuning the MFCC parameters, I achieved a score of 0.3827 using 13 MFCC coefficients and 15 mel filters. Neither Chroma nor SFM alone performed as well. After testing various feature combinations, I achieved the highest classification performance, 0.4147, using 16 feature dimensions (Centroid, Flux, RMS, and 13 MFCC coefficients).
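To make the 16-dimensional layout concrete, here is a small Python sketch of how the per-frame values could be combined into one vector. The function names are hypothetical, and averaging frames into a single vector per clip is an assumption for illustration, not necessarily how the ChucK code aggregates frames.

```python
NUM_MFCC = 13  # number of MFCC coefficients in the best-performing setup


def build_feature_vector(centroid, flux, rms, mfccs):
    """Concatenate three single-value features with the MFCC coefficients."""
    if len(mfccs) != NUM_MFCC:
        raise ValueError(f"expected {NUM_MFCC} MFCC coefficients, got {len(mfccs)}")
    return [float(centroid), float(flux), float(rms)] + [float(c) for c in mfccs]


def average_frames(frames):
    """Average per-frame feature vectors into one vector (assumed aggregation)."""
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]
```

Each frame yields 3 + 13 = 16 values, so a clip summarized this way is also a 16-dimensional vector.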
For my audio mosaic tool, users can play preset drum sounds (kick, snare, hi-hat, percussion) and other sound effects. The tool searches a library of jazz sounds (trumpet, saxophone, piano, and bass) for the most similar sound and plays it along with the drum hit. The goal is a fun jazz improv tool.
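The similarity search at the heart of the tool can be sketched as a nearest-neighbor lookup. This is a minimal Python illustration rather than the project's ChucK code: the library entry layout (`file`, `start`, `features`), the example file names, and the use of Euclidean distance are assumptions.

```python
import math


def most_similar(query, library):
    """Return the library entry whose feature vector is closest to `query`.

    Each entry is a dict with hypothetical keys: "file" (source recording),
    "start" (offset in seconds), and "features" (a feature vector with the
    same length as `query`).
    """
    if not library:
        raise ValueError("library is empty")
    return min(library, key=lambda entry: math.dist(query, entry["features"]))


# Toy library with 2-dimensional features, for illustration only.
library = [
    {"file": "trumpet.wav", "start": 0.0, "features": [0.9, 0.1]},
    {"file": "piano.wav", "start": 1.5, "features": [0.2, 0.8]},
    {"file": "bass.wav", "start": 3.0, "features": [0.1, 0.1]},
]

match = most_similar([0.25, 0.7], library)
print(match["file"], match["start"])
```

In the tool, the query would be the feature vector of the drum hit being played, and the matched window would be played back alongside it.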
For this musical mosaic, I used audio sources from Analog Supplies for the kick, snare, and hi-hat sounds. I also used the percussion and sound effects from Venus Theory's Arcturus. I used samples from YouTube for the jazz instrument sources. For the code, I based my implementation on the audio mosaic starter code that we were provided.
Creating the musical mosaic was highly enjoyable, and it was interesting to work around the limits of a similarity search system. When I started building the tool, I knew I wanted it to be interactive, but I didn’t want to be limited to microphone audio as input. As a result, I experimented with different ways to take in audio files and input sources. I realized that I could map different keyboard keys to different sounds and play the mosaic like an instrument. I felt that this gave me greater control over the sounds that I wanted to produce. I tested various sounds, from simple one-shots to more complex melodic loops. In the end, I decided to keep it simple and curated a selection of kicks, snares, hi-hats, percussion instruments, and a few synth sound effects.
I also spent quite some time trying to figure out how I wanted the system to sound. I tried various sound sources, from lyrical songs to electronic music to classical music. One limitation I found was that I had difficulty controlling the exact output sound. I tried varying the feature parameters, but none of these sound sources sounded particularly satisfying to me. They didn’t really sound musical or make much sense. I tested speech as a sound source, and this produced interesting and fun results that were entertaining to play with. Finally, I tested the tool using a single piano instrumental as output and found much more success with this. It sounded a bit jazzy, so from there, I added trumpet, saxophone, and bass tracks, which led to the final product.
For the look of the tool, I wanted to focus on the audio experience but also make it somewhat visually interesting, so I created a display that shows the characters of the keys being pressed, letting users have some fun writing words or creating patterns with the characters. Overall, I had a blast experimenting with and tuning this system.
For my audio mosaic tool, users can play preset audio sounds (drums, synth, sound effects, etc.), and the tool searches a library of sound features for the most similar sound and plays it along with the output sound. For this milestone, I tested various types of input, such as the mic and playback from a sound file. I also tested various songs for the library of sounds and found that music by Amon Tobin was suitable for creating the effects I was looking for. I experimented with different combinations of input sounds and song databases. In the end, I wanted to create an interactive music player with sounds that users can control from the keyboard.
Usage: Press keys - to play various drum kit instruments. Press keys [a]-[z] to play various sound samples. Press [Enter] to play a song and [Space] to stop the song.
For this milestone, I used audio sources from Noiiz for sound samples. I used the songs "Surge", "Bedtime Stories", and "Lost and Found" by Amon Tobin. For the code, I based my implementation on the audio mosaic starter code that we were provided.