From CCRMA Wiki
Everyone in this world (or almost everyone :P) has a sense of the rhythm in the music they listen to. This is the rhythm they dance to; this is the rhythm they sing out while humming the song. Our idea is to let users express this rhythm through beat boxing.
People have made a lot of fun of us (on multiple occasions) while we were completely immersed in beat boxing to a favorite song that came on the radio. The mockery was justified, since our skills were nowhere near those of professional beat boxers. So we decided, why not make a new game that helps novices like us learn to beat box better! And that gave birth to BeatBox Hero!
Based on our premise and motivation, we wanted to build software that helps users beat box better and, more than that, enjoy the whole experience of beat boxing to a song (and not be mocked!). Drawing inspiration from products such as Tap Tap Revenge and Guitar Hero, we decided to make this a game that people would love playing and that would also improve their beat boxing skills. We experimented with a few ideas and finally settled on this one.
Once a song is selected the game begins. As shown below, the user sees a stream of balls coming down that symbolize beat sounds. Whenever a ball enters the end zone (the grey shaded region), the user is expected to make the sound corresponding to it. We keep a percentage count of the number of beats hit.
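The hit-percentage bookkeeping described in this paragraph could be sketched roughly as follows. This is a minimal illustration, not the game's actual code: the function name `hit_percentage`, the `(time, label)` tuples and the `window` tolerance (standing in for the time a ball spends in the end zone) are all assumptions made for the example.

```python
def hit_percentage(expected, detected, window=0.15):
    """Percentage of scheduled beats that the player hit.

    expected, detected: lists of (time_in_seconds, label) tuples.
    window: hypothetical tolerance in seconds, standing in for the
    time a ball spends inside the end zone.
    """
    used = set()   # detected beats already matched to a scheduled beat
    hits = 0
    for t_exp, label in expected:
        for i, (t_det, det_label) in enumerate(detected):
            # A hit needs the right sound at (roughly) the right time.
            if i not in used and det_label == label and abs(t_det - t_exp) <= window:
                used.add(i)
                hits += 1
                break
    return 100.0 * hits / len(expected) if expected else 0.0


song = [(1.0, "bass"), (1.5, "snare"), (2.0, "hihat"), (2.5, "bass")]
played = [(1.05, "bass"), (1.52, "hihat"), (2.1, "hihat"), (3.0, "bass")]
print(hit_percentage(song, played))
```

Each detected sound is matched to at most one scheduled beat, so a single sound cannot score twice.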
The system has 2 major components to it - the beat detection engine and the graphics visualization.
Beat Detection Engine
The purpose of the engine is to detect which beat has been played when a person beat boxes to a song. We dealt with 2 major problems here:
- Onset detection: The first problem is to figure out when a beat occurs in a stream of audio input. We solved this by tracking the RMS energy of each buffer. When the energy crossed a threshold, we started recording the buffers as a potential beat. A major challenge was to distinguish between two beats made in such quick succession that their sounds overlap. To detect the onset of the second beat, we used a second threshold slightly higher than the first. Furthermore, since we wanted the product to run in real time, we store only a small fraction of each beat (1024 samples) and work with that, so that we can play the beat back in real time.
- Beat pattern matching: Given an audio sample of a beat, the second problem is to detect which beat it is. There are 4 possible kinds of beat: bass, midtom, snare and hi-hat. We used audio features to classify a beat as one of the 4. Since the software runs in real time, very little audio is stored per beat, so complex features could not be used to distinguish between beat sounds. We also wanted feature detection algorithms that are cheap in terms of time. We therefore used simple features, namely Zero Crossings, Spectral Centroid and Pitch. First, Zero Crossings separates bass or midtom (low) from snare or hi-hat (high). If the zero-crossing count is low, Pitch distinguishes bass (low) from midtom (high). If it is high, Spectral Centroid distinguishes snare (low) from hi-hat (high). Since we have so few samples, the spectral centroid isn't very reliable.
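The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the engine's actual code. All function names and threshold values are hypothetical. The re-trigger rule (energy must first decay, then rise back across the higher threshold) is one possible reading of the two-threshold scheme. The text does not say how pitch was computed, so autocorrelation is used here as a common, cheap choice.

```python
import math

def rms(buf):
    """Root-mean-square energy of one buffer of float samples."""
    return math.sqrt(sum(x * x for x in buf) / len(buf))

def detect_onsets(buffers, start_thresh=0.10, retrigger_thresh=0.15):
    """Return indices of buffers where a beat appears to start.
    Both threshold values are hypothetical."""
    onsets, active, falling, prev = [], False, False, 0.0
    for i, buf in enumerate(buffers):
        energy = rms(buf)
        if not active:
            if energy > start_thresh:        # first threshold: silence -> beat
                onsets.append(i)
                active, falling = True, False
        elif energy < start_thresh:          # the beat has died away
            active = False
        elif falling and prev <= retrigger_thresh < energy:
            onsets.append(i)                 # second, higher threshold: overlapping beat
            falling = False
        elif energy < prev:
            falling = True                   # energy has started to decay
        prev = energy
    return onsets

def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz (naive O(n^2) DFT)."""
    n = len(frame)
    weighted = total = 0.0
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        mag = math.hypot(re, im)
        weighted += mag * k * sample_rate / n
        total += mag
    return weighted / total if total else 0.0

def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Pitch in Hz from the autocorrelation peak (an assumed method)."""
    lo = max(1, int(sample_rate / fmax))
    hi = min(int(sample_rate / fmin), len(frame) - 1)

    def autocorr(lag):
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    return sample_rate / max(range(lo, hi + 1), key=autocorr)

def classify_beat(frame, sample_rate, zc_split=40,
                  pitch_split=150.0, centroid_split=3000.0):
    """Two-level decision tree from the text; all split values are hypothetical."""
    if zero_crossings(frame) < zc_split:     # low: bass or midtom
        return "bass" if estimate_pitch(frame, sample_rate) < pitch_split else "midtom"
    # high: snare or hi-hat
    return "snare" if spectral_centroid(frame, sample_rate) < centroid_split else "hihat"
```

In practice the split values would have to be tuned to the sample rate, the frame length (1024 samples in the text) and the player's voice. A real-time version would also replace the naive DFT with an FFT.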
The graphics visualization is an OpenGL rendering of a UI similar to Tap Tap Revenge (as shown above).
Team of Two
1. Rohan Jain (email@example.com)
2. Ankit Gupta (firstname.lastname@example.org)