Dong In Lee
220C Project - Query by Humming System
AIM
This project aims to implement a query by humming system. The program will take real-time voice input and then show the name of the matching song. I will study computational auditory scene analysis, based on papers by Guy Brown, Daniel Ellis, and others.
WEEK1
Done: Determined project subject
To do: Study a similar tool (Praat)
Read previous research papers on pitch detection algorithms
WEEK2
Done: Skimmed Computational Auditory Scene Analysis, edited by DeLiang Wang and Guy J. Brown
To do: Select one method for pitch detection
WEEK3
Done: Read Alain de Cheveigné and Hideki Kawahara's paper, "YIN, a fundamental frequency estimator for speech and music"
To do: Implement this method in Matlab
Verify whether FFTW can be used with Microsoft Visual Studio .NET
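For reference, the core of the YIN method can be sketched in a few lines. This is a minimal illustration in Python rather than the Matlab implementation planned above, and it covers only the difference function, the cumulative mean normalized difference, and the absolute threshold. The function name yin_pitch, the default pitch range, and the threshold of 0.1 are assumptions made for the example, and the parabolic interpolation step of the paper is left out.

```python
import math


def yin_pitch(signal, sample_rate, f_min=80.0, f_max=1000.0, threshold=0.1):
    """Estimate the fundamental frequency (Hz) of one frame.

    Returns None when no lag falls below the threshold (treated as unvoiced).
    f_min, f_max and threshold are illustrative defaults, not values from the log.
    """
    tau_min = max(1, int(sample_rate / f_max))
    tau_max = int(sample_rate / f_min)
    window = len(signal) - tau_max
    if window <= 0:
        raise ValueError("frame too short for the requested f_min")

    # Difference function: d(tau) = sum_j (x[j] - x[j + tau])^2
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((signal[j] - signal[j + tau]) ** 2 for j in range(window))

    # Cumulative mean normalized difference: d(tau) divided by the mean of d(1..tau)
    cmnd = [1.0] * (tau_max + 1)
    running_sum = 0.0
    for tau in range(1, tau_max + 1):
        running_sum += diff[tau]
        cmnd[tau] = diff[tau] * tau / running_sum if running_sum > 0 else 1.0

    # Absolute threshold: take the first dip below the threshold and
    # follow it down to its local minimum.
    tau = tau_min
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            return sample_rate / tau
        tau += 1
    return None


# Usage: a 200 Hz sine sampled at 8 kHz
frame = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
estimate = yin_pitch(frame, 8000)
```

Note that the lowest detectable pitch fixes the largest lag (sample_rate / f_min), so the choice of lower bound also determines the minimum frame length.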
WEEK4
Done: Implemented a basic pitch detection function in Matlab
To do: Determine the pitch range.
- Especially the lower bound. What should be the lowest note of the transcription?
Think about extensions. I have more time. =)
- Query by humming system
- Multiple pitch detection for auto transcription
WEEK5
Done:
Implemented a function that calculates the distance between two strings.
- Worked for short strings, but the running time grew quickly with longer strings.
Read papers on query by humming systems.
- One of them (note interval) is similar to the algorithm I implemented above.
Tried to use the Stk library in the Visual Studio .NET 2003 environment to implement real-time input.
- I spent A LOT OF time figuring out how to use it. Somebody help me!
To do: Find someone who knows about porting the Stk library to Windows ASAP.
Choose a mapping scheme from the input sound stream to a discrete note representation.
Determine the format of the DB records and the primary key.
WEEK6
Done:
Improved the string matching algorithm.
- Couldn't use the algorithm from <Theme Finder> because it is not robust to noise.
- Used dynamic programming. O(mn) guaranteed. Last time, there were redundant recursive calls. That's why you saw the sluggish running time. =)
Solved the problem with using Stk in VS.NET.
- Gary gave me some tips. Thanks!
Implemented the pitch-to-note algorithm.
- Needs some experiments. The tricky part was setting the threshold value for the transition from note to note.
Determined the mapping scheme from input sound to discrete note representation.
- I will use the intervals between notes, not the pitches themselves, since people may sing in a key different from the original key.
- Ex) C C G G A A G => 0 7 0 2 0 -2 => 0 g 0 b 0 B
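The interval mapping in the example above can be sketched as follows. This is an illustration in Python, not the project's code. It assumes notes are given as MIDI note numbers, that an upward interval of n semitones maps to the n-th lowercase letter (so 7 => g, 2 => b), a downward one to the matching uppercase letter (-2 => B), and a repeated note to 0, as the example suggests. Clamping intervals larger than 26 semitones is an extra assumption added so the function is total.

```python
def intervals(midi_notes):
    """Semitone steps between consecutive notes."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


def encode_interval(step):
    """Map one semitone step to a single character.

    0 -> '0', +n -> n-th lowercase letter, -n -> n-th uppercase letter.
    Steps beyond 26 semitones are clamped (an assumption, not from the log).
    """
    if step == 0:
        return "0"
    size = min(abs(step), 26)
    letter = chr(ord("a") + size - 1)
    return letter if step > 0 else letter.upper()


def encode_melody(midi_notes):
    """Key-independent string representation of a melody."""
    return "".join(encode_interval(s) for s in intervals(midi_notes))


# Usage: C C G G A A G with C = MIDI note 60
melody = [60, 60, 67, 67, 69, 69, 67]
encoded = encode_melody(melody)                              # "0g0b0B"
transposed = encode_melody([n + 5 for n in melody])          # same string
```

Because only differences between notes are stored, the same tune sung in any key gives the same string.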
To do:
Implement the GUI.
Merge these pieces together.
Make a sample database.
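The dynamic programming string match noted under this week's Done items can be illustrated with a short sketch. The log does not say which distance is used, so this assumes a plain edit distance with unit costs for insertion, deletion, and substitution; the function name edit_distance is made up for the example. Filling the table row by row visits each of the (m+1)(n+1) cells once, which is where the O(mn) bound comes from and why it avoids the repeated subproblems of a naive recursive version.

```python
def edit_distance(a, b):
    """Unit-cost edit distance between strings a and b in O(len(a) * len(b))."""
    m, n = len(a), len(b)
    # prev[j] holds the distance between a[:i-1] and b[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # delete a[i-1]
                curr[j - 1] + 1,     # insert b[j-1]
                prev[j - 1] + cost,  # match or substitute
            )
        prev = curr
    return prev[n]


# Usage: one wrong interval in a hummed query
distance = edit_distance("0g0b0B", "0g0c0B")
```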
WEEK7
Done:
Made a sample database.
- I will use free MIDI ringtone files from the web.
- I wrote Matlab code for converting MIDI notes to the valid string format.
- If there is any tune you'd like to have played, just tell me. I will add it to the database. =)
Solved the blocking problem related to real-time input in the GUI.
- Used the Stk callback function. Now users can adjust the recording time with a button, but I will set a limit of 10 sec.
To do: Complete and embellish the GUI.
WEEK8
Done:
Tested with the GUI version.
- There seem to be bugs introduced when I converted from the console version to the GUI version. Pitch detection is sometimes unreliable.
- The callback function is very sensitive. Maybe I will have to spend more time on that.
To do: Make an alpha version of the application.
WEEK9
Done: Implemented an alpha version of the application.
- I'm not going to use the callback function, since the callback function in the Stk library doesn't work well on the Windows platform.
- I changed the string matching algorithm to a modified LCS, which is more suitable for my project.
- I made a database composed of 10 songs.
To do: Show several matching songs rather than just the single best match.
Try to deal with fast transitions in the voice input.
Start beta testing!
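As a rough illustration of LCS-based matching that returns several candidates, here is a sketch in Python. The modifications made to LCS in this project are not described in the log, so this uses the standard longest common subsequence length, normalized by the query length. The names lcs_length and rank_songs, the top_k parameter, and the three database entries are all hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rank_songs(query, database, top_k=3):
    """Return up to top_k (title, score) pairs, best match first.

    The score is the LCS length divided by the query length, so 1.0 means
    the whole query appears in order inside the stored melody.
    """
    scored = [
        (lcs_length(query, melody) / max(len(query), 1), title)
        for title, melody in database.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(title, score) for score, title in scored[:top_k]]


# Usage with a made-up three-song database of interval strings
database = {"song_a": "0g0b0B", "song_b": "bbBB00", "song_c": "0g0c0B"}
results = rank_songs("0g0bB", database, top_k=2)
```

A subsequence match tolerates notes that the singer drops or that the pitch tracker misses, which may be one reason LCS suits hummed input better than an exact match.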
WEEK10
Done: Implemented the Query by Humming System.
To do: Make it usable commercially. =)
<Screen Shot>