Dong In Lee

 

220C Project - Query by Humming System

 

 

AIM

This project aims to implement a query-by-humming system. The program takes real-time voice input and then shows the name of the matching song. I will also research computational auditory scene analysis, based on papers by Guy Brown, Daniel Ellis, and others.

 

WEEK1

Done: Determined project subject

To do: Study a similar tool (Praat)
           Read previous research papers on pitch detection algorithms

WEEK2

Done: Skimmed over Computational Auditory Scene Analysis, edited by DeLiang Wang and Guy J. Brown

To do: Select one method for pitch detection

WEEK3

Done: Read through Alain de Cheveigné & Hideki Kawahara's paper, "YIN, a fundamental frequency estimator for speech and music"

To do: Implement this method in Matlab
           Verify whether FFTW can be used with Microsoft Visual Studio .NET
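
For reference, the core steps of the YIN method can be sketched roughly as below. This is only a minimal illustration written in Python (the project itself targets Matlab and C++), not the code used in this project. The function name `yin_pitch`, the frequency bounds, and the 0.1 threshold are assumptions chosen for the example, and the refinement steps from the paper (parabolic interpolation, best local estimate) are left out.

```python
import math


def yin_pitch(signal, sample_rate, f_min=80.0, f_max=1000.0, threshold=0.1):
    """Estimate the fundamental frequency (Hz) of one frame, or None if unvoiced.

    Simplified sketch: difference function, cumulative mean normalized
    difference, and absolute threshold only.
    """
    tau_min = int(sample_rate / f_max)   # shortest period searched
    tau_max = int(sample_rate / f_min)   # longest period searched
    window = len(signal) - tau_max       # samples compared at each lag
    if window <= 0:
        raise ValueError("frame too short for the chosen f_min")

    # Step 1: difference function d(tau).
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((signal[j] - signal[j + tau]) ** 2 for j in range(window))

    # Step 2: cumulative mean normalized difference d'(tau).
    cmnd = [1.0] * (tau_max + 1)
    running_sum = 0.0
    for tau in range(1, tau_max + 1):
        running_sum += diff[tau]
        cmnd[tau] = diff[tau] * tau / running_sum if running_sum > 0 else 1.0

    # Step 3: take the first dip below the threshold and follow it
    # down to its local minimum.
    tau = max(tau_min, 2)
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau
        tau += 1
    return None


# Usage: a synthetic 220 Hz tone sampled at 8 kHz (made-up test signal).
rate = 8000
tone = [math.sin(2 * math.pi * 220 * n / rate) for n in range(512)]
estimate = yin_pitch(tone, rate)
```

Without interpolation the estimate is quantized to whole-sample periods, so a 220 Hz tone at 8 kHz comes out slightly sharp rather than exact.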

WEEK4

Done: Implemented basic pitch detection function in Matlab

To do: Determine pitch range.
           - Especially the lower bound: what should be the lowest note of the transcript?
           Think about extensions, since I have more time. =)
           - Query by humming system
           - Multiple pitch detection for auto transcription
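
One reason the lower bound matters: in a lag-based detector the lowest note fixes the longest period that has to be searched, and therefore the minimum frame length. The sketch below shows that arithmetic in Python for illustration. The helper names, the A4 = 440 Hz tuning, and the example note and sample rate are assumptions, not values decided in this project.

```python
import math


def midi_to_hz(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (69 = A4)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


def max_lag_samples(lowest_midi_note, sample_rate):
    """Longest period, in samples, the detector has to search."""
    return math.ceil(sample_rate / midi_to_hz(lowest_midi_note))


# Hypothetical example: lowest note A2 (MIDI 45, 110 Hz) at 44.1 kHz.
lag = max_lag_samples(45, 44100)
```

Each octave added at the bottom of the range doubles the maximum lag, so a lower bound that is too generous costs both latency and computation.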

WEEK5

Done: Implemented a function that calculates the distance between two strings.
           - Worked for short strings, but the running time increased with longer strings.
           Read papers on query-by-humming systems.
           - One of them (Note interval) is similar to the algorithm I implemented above.
           Tried to use the Stk library in the Visual Studio .NET 2003 environment to implement real-time input.
           - I spent A LOT OF time figuring out how to use it. Somebody help me!

To do: Find someone who knows about porting the Stk library to Windows ASAP.
           Choose a mapping scheme from the input sound stream to a discrete note representation.
           Determine the format of the DB records and the primary key.

WEEK6

Done: Improved the string matching algorithm.
           - Couldn't use the algorithm from <Theme Finder> because it is not robust to noise.
           - Used dynamic programming, which guarantees O(mn) running time. Last week's version made redundant recursive calls, which is why it ran so sluggishly. =)
           Solved the problem with using Stk in VS.NET.
           - Gary gave me some tips. Thanks!
           Implemented the pitch-to-note algorithm.
           - Needs some experiments. The tricky part was setting the threshold value for the transition from one note to the next.
           Determined the mapping scheme from input sound to discrete note representation.
           - I will use the intervals between notes, not the pitches themselves, since people may sing in a key different from the original key.

             - Ex) C C G G A A G => 0 7 0 2 0 -2 => 0 g 0 b 0 B
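
The example above can be reproduced with a short sketch. It is written in Python for illustration only, and the letter scheme is inferred from the single example (0 for a repeated note, the n-th lowercase letter for n semitones up, uppercase for down), so treat it as an assumption rather than the exact format. The function names are hypothetical.

```python
def intervals(midi_notes):
    """Semitone differences between consecutive notes."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


def encode_interval(semitones):
    """Map one interval to one character (assumed scheme, see lead-in)."""
    if semitones == 0:
        return "0"
    if abs(semitones) > 26:
        raise ValueError("interval too large for a single letter")
    letter = chr(ord("a") + abs(semitones) - 1)
    return letter if semitones > 0 else letter.upper()


def encode_melody(midi_notes):
    """Encode a note sequence as a key-independent interval string."""
    return "".join(encode_interval(i) for i in intervals(midi_notes))


# C C G G A A G as MIDI note numbers, and the same tune sung a tone higher.
in_c = encode_melody([60, 60, 67, 67, 69, 69, 67])
in_d = encode_melody([62, 62, 69, 69, 71, 71, 69])
```

Both calls give the same string, which is the point of using intervals: the key the user sings in drops out of the representation.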

To do: Implement the GUI.
           Merge these things together.
           Make a sample database.
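
The dynamic-programming string match from this week's Done list can be sketched as follows. The log does not say which distance was used, so plain edit (Levenshtein) distance is assumed here, and Python is used only for illustration. The table is filled once, row by row, which is where the O(mn) bound comes from. A naive recursion recomputes the same subproblems over and over.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b in O(len(a) * len(b))."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Usage: a hummed query with one wrong interval against a stored melody.
distance = edit_distance("0g0a0B", "0g0b0B")
```

Only two rows are kept in memory at a time, so the space cost is O(n) even for long melodies.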

WEEK7

Done: Made a sample database
           - I will use free MIDI ringtone files from the web.

           - I wrote Matlab code that converts MIDI notes to the valid string format.
           - If there is any tune you'd like to have played, just tell me. I will add it to the database. =)
           Solved the blocking problem with real-time input in the GUI.
           - Used the Stk callback function. Now users can adjust the recording time with a button, but I will limit it to 10 sec.

To do: Complete and embellish the GUI.

WEEK8

Done: Tested with GUI version
           - There seemed to be bugs introduced when I converted from the console version to the GUI version. Pitch detection is sometimes unreliable.
           - The callback function is very sensitive. Maybe I will have to spend more time on that.

To do: Make alpha version application.

WEEK9

Done: Implemented alpha version application.
           - I'm not going to use the callback function, since the callback function in the Stk library doesn't work well on the Windows platform.
           - I changed the string matching algorithm to a modified LCS (longest common subsequence), which is more suitable for my project.
           - I made a database of 10 songs.

To do: Show several candidate songs rather than just the single best match.
           Try to deal with fast note transitions in the voice input.
           Start beta testing!
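
As a rough illustration of LCS-based matching that returns several candidates, here is a sketch in Python. It uses the standard longest common subsequence, not the modified version mentioned above (the modification is not described in this log). The scoring rule, function names, and tiny database are all made up for the example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, 1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rank_songs(query, database, top_k=3):
    """Return the top_k (title, score) pairs, best match first.

    Score is the fraction of the query found in the melody (assumed rule).
    """
    scored = [
        (lcs_length(query, melody) / max(len(query), 1), title)
        for title, melody in database.items()
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(title, score) for score, title in scored[:top_k]]


# Hypothetical database of interval strings.
songs = {"song_a": "0g0b0B0B0a", "song_b": "bbaBB0", "song_c": "0g0a0B"}
results = rank_songs("0g0b0B", songs, top_k=2)
```

Unlike edit distance, LCS ignores extra or missing notes between matches, which may be why it tolerates noisy humming better. Returning a ranked list also lets the user pick the right song when the best match is wrong.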

WEEK10

Done: Implemented the Query by Humming System.

To do: Make it usable commercially. =)
           

Presentation file download

 

<Screen Shot>