Difference between revisions of "Njbml"

From CCRMA Wiki
(additional, potential references)
 
(8 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''CS229 Machine Learning Project Proposal'''
+
[http://ccrma.stanford.edu/~njb/cs229/ Please go here]
 
+
To implement and use hidden Markov models (HMMs) for the purpose of understanding, transcribing, and modeling the musical parameters of an acoustic drum set.
+
 
+
The project will survey current HMM techniques used in automatic speech recognition and apply similar methods to model each sub-instrument of the drum set: the bass drum, floor tom, snare, mid-toms, hi-hat, crash cymbal, and ride cymbal. The instrument models will be trained on time and frequency parameters of pre-recorded isolated drum sounds, which are readily available at negligible cost from the drum-sample libraries used in music production.  The first-stage results should accurately and independently identify each sub-instrument.
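As a rough illustration of the first stage, the sketch below scores a short observation sequence under one small HMM per sub-instrument using the forward algorithm and picks the best-scoring model. Everything specific in it is an assumption made for illustration: the two-state left-to-right topology (attack, decay), the three quantized observation symbols (0 = low-band energy dominant, 1 = mid-band, 2 = high-band), and all probabilities, which are hand-picked rather than trained. The proposal itself does not fix a feature set or topology, and real models would be trained with a toolkit such as HTK.

```python
import math


def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    peak = max(values)
    if peak == float("-inf"):
        return peak
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def forward_log_likelihood(start, trans, emit, observations):
    """Return log P(observations | model) for a discrete-output HMM."""
    n_states = len(start)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1), kept in the log domain.
    alpha = [
        safe_log(start[i]) + safe_log(emit[i][observations[0]])
        for i in range(n_states)
    ]
    # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t).
    for symbol in observations[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n_states)])
            + safe_log(emit[j][symbol])
            for j in range(n_states)
        ]
    # Termination: sum over all final states.
    return log_sum_exp(alpha)


# Hypothetical, hand-picked models: (start, transition, emission) per instrument.
LEFT_TO_RIGHT = [[0.6, 0.4], [0.0, 1.0]]
MODELS = {
    "bass_drum": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.7, 0.2, 0.1], [0.8, 0.15, 0.05]]),
    "snare": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.2, 0.5, 0.3], [0.1, 0.7, 0.2]]),
    "hi_hat": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.05, 0.25, 0.7], [0.05, 0.15, 0.8]]),
}


def classify(models, observations):
    """Pick the instrument whose HMM assigns the highest likelihood."""
    return max(
        models,
        key=lambda name: forward_log_likelihood(*models[name], observations),
    )


if __name__ == "__main__":
    print(classify(MODELS, [0, 0, 0, 0]))
    print(classify(MODELS, [2, 2, 2, 1]))
```

Working in the log domain avoids numerical underflow on longer sequences, which matters once the observations are frame-level features rather than a handful of symbols.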
+
 
+
Once successful models of each sub-instrument are obtained, a higher-level set of HMMs can be used to model a rhythmic pattern, where a single HMM state would represent a single musical beat or duration of time.  The output of the first level can help train the second, and eventually vice versa.  Multiple levels of HMMs can be used to break down the larger problem of instrument/rhythm transcription, much as in speech recognition, where one level models the phonemes produced by the vocal tract and the next models individual words of a given language. A single musical beat can be represented as a linear combination of the sub-instruments and trained on entire measures of pre-recorded material rather than on single-instrument samples. Because rhythmic patterns are almost unlimited in their variability, training should first focus on a single genre of pattern as a proof of concept and then move toward a more generalized set of pattern models. The second-stage results should simultaneously identify combinations of the drum sub-instruments with respect to time. See figure below.
+
 
+
[[Image:Block_diagram.JPG]]
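The second-stage idea described above, in which each state of a higher-level HMM stands for one beat, can be sketched with Viterbi decoding. The example assumes that the first stage has already labeled each beat with a combination of sub-instruments. The four-beat cyclic pattern, the three combination labels, and all probabilities are hypothetical values chosen for illustration and do not come from the proposal or from trained data.

```python
import math


def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi(start, trans, emit, observations):
    """Return the most likely state path for a discrete-output HMM."""
    n_states = len(start)
    score = [
        safe_log(start[s]) + safe_log(emit[s][observations[0]])
        for s in range(n_states)
    ]
    back_pointers = []
    for obs in observations[1:]:
        prev = score
        pointers = []
        score = []
        for j in range(n_states):
            # Best predecessor of state j at this time step.
            best_i = max(range(n_states), key=lambda i: prev[i] + safe_log(trans[i][j]))
            pointers.append(best_i)
            score.append(prev[best_i] + safe_log(trans[best_i][j]) + safe_log(emit[j][obs]))
        back_pointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical four-beat pattern: state k is beat k+1 of the measure.
START = [0.25, 0.25, 0.25, 0.25]
# Mostly advance to the next beat (cyclically); occasionally stay put.
TRANS = [
    [0.1, 0.9, 0.0, 0.0],
    [0.0, 0.1, 0.9, 0.0],
    [0.0, 0.0, 0.1, 0.9],
    [0.9, 0.0, 0.0, 0.1],
]
# Per-beat probability of each first-stage label (illustrative values).
EMIT = [
    {"kick+hat": 0.8, "hat": 0.1, "snare+hat": 0.1},
    {"kick+hat": 0.1, "hat": 0.8, "snare+hat": 0.1},
    {"kick+hat": 0.1, "hat": 0.1, "snare+hat": 0.8},
    {"kick+hat": 0.1, "hat": 0.8, "snare+hat": 0.1},
]

if __name__ == "__main__":
    labels = ["kick+hat", "hat", "snare+hat", "hat", "kick+hat"]
    print(viterbi(START, TRANS, EMIT, labels))
```

The decoded path assigns each labeled beat a position in the measure, which is one simple way the pattern level could constrain or correct the instrument level.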
+
+
The motivation for understanding and transcribing acoustic drum set sounds and patterns spans music information retrieval, music education, music production, and live human-machine musical interaction.  In music information retrieval, this work could eventually contribute to a model of an entire pop song, genre classification, and more.  In music education, students could play alongside a software program that uses the model and obtain feedback on their performance.  In music production, large databases of audio samples and recordings could be sorted and identified automatically once their rhythmic patterns are known.  In live human-machine interaction, the rhythm model could suggest an accompanying drum pattern for a given piece, help compose new rhythmic variations automatically, or improvise alongside a human performer.
+
 
+
 
+
'''People'''
+
 
+
Students: Nicholas J. Bryan
+
 
+
'''Advisors'''
+
 
+
Professors: Ge Wang, Andrew Ng
+
 
+
 
+
 
+
[http://www.stanford.edu/class/cs229/materials/projectGuidelines.pdf '''Project Guidelines''']
+
 
+
''Due Dates''
+
 
+
Milestone: 12:00pm Friday, 11/16
+
Poster Presentation: Morning of Wednesday, 12/12
+
Final Writeup: 12:00am Friday, 12/14
+
 
+
 
+
 
+
'''Papers to Read/Concepts to Understand'''
+
 
+
HMMs and how they are used for speech recognition
+
 
+
How the knowledge of HMMs/speech recognition can be applied to rhythm understanding
+
 
+
 
+
 
+
[http://www.ece.ucsb.edu/Faculty/Rabiner/ece259/Reprints/tutorial%20on%20hmm%20and%20applications.pdf  A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition]
+
 
+
[http://www.cse.ogi.edu/class/cse552/ Hidden Markov Models for Speech Recognition OGI School of Science & Engineering Course Website]
+
 
+
[http://www.cus.cam.ac.uk/~nc272/papers/nickcollinsphd.pdf Towards Autonomous Agents for Live Computer Music: Realtime Machine Listening and Interactive Music Systems]
+
 
+
[http://www.ee.columbia.edu/~dpwe/e6820/outline.html  Dan Ellis's Speech and Audio Processing and Recognition Course]
+
 
+
 
+
 
+
'''ISMIR Papers of Relevance'''
+
 
+
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p297_antonopoulos.pdf Music Retrieval By Rhythmic Similarity Applied on Greek and African Traditional Music]
+
 
+
[http://ismir2002.ismir.net/proceedings/02-FP02-3.pdf Pattern Discovery Techniques for Music Audio]
+
 
+
[http://ismir2004.ismir.net/proceedings/p043-page-232-paper236.pdf Understanding Search Performance in Query-By-Humming Systems]
+
 
+
[http://ismir2004.ismir.net/proceedings/p033-page-164-paper226.pdf Causal Tempo Tracking of Audio]
+
 
+
[http://ismir2004.ismir.net/proceedings/p101-page-554-paper247.pdf Eigenrhythms: Drum Pattern Basis Sets For Classification and Generation]
+
 
+
[http://ismir2002.ismir.net/proceedings/03-SP02-1.pdf Audio Retrieval by Rhythmic Similarity]
+
 
+
[http://ismir2004.ismir.net/proceedings/p035-page-178-paper215.pdf Beat and Meter Extraction Using Gaussified Onsets]
+
 
+
[http://ismir2005.ismir.net/proceedings/1061.pdf Drum Track Transcription of Polyphonic Music Using Noise Subspace Projection]
+
 
+
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p219_gillet.pdf Supervised and Unsupervised Sequence Modeling for Drum Transcription]
+
 
+
[http://ismir2006.ismir.net/PAPERS/ISMIR06Gouyon-Tutorial.pdf ISMIR 2006 Tutorial: Computational Rhythm Description]
+
 
+
[http://ismir2004.ismir.net/proceedings/p030-page-150-paper167.pdf Extraction of Drum Patterns and Their Description within the MPEG-7 High-Level Framework]
+
 
+
[http://ismir2005.ismir.net/proceedings/1064.pdf Continuous HMM and its Enhancement for Singing/Humming Query Retrieval]
+
 
+
[http://ismir2005.ismir.net/proceedings/1031.pdf Rhythm-Based Segmentation of Popular Chinese Music]
+
 
+
[http://ismir2002.ismir.net/proceedings/02-FP01-3.pdf Indexing Hidden Markov Models for Music Retrieval]
+
 
+
[http://ismir2005.ismir.net/proceedings/1109.pdf New Music Interfaces for Rhythm-Based Retrieval]
+
 
+
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p127_lartillot.pdf MIR in Matlab (II): A Toolbox for Musical Feature Extraction from Audio]
+
 
+
[http://ismir2004.ismir.net/proceedings/p054-page-289-paper154.pdf Pattern Matching in Polyphonic Music as a Weighted Geometric Translation Problem]
+
 
+
[http://ismir2003.ismir.net/papers/Meek.PDF The dangers of parsimony in query-by-humming applications]
+
 
+
[http://ismir2005.ismir.net/proceedings/1048.pdf An Investigation of Feature Models for Music Genre Classification Using the Support Vector Classifier]
+
 
+
[http://ismir2003.ismir.net/papers/Parry.PDF Rhythmic Similarity through Elaboration]
+
 
+
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p353_moreau.pdf Drum Transcription in Polyphonic Music Using Non-Negative Matrix Factorisation]
+
 
+
[http://ismir2004.ismir.net/proceedings/p100-page-550-paper157.pdf A Drum Pattern Retrieval Method By Voice Percussion]
+
 
+
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p519_seyerlehner.pdf From Rhythm Patterns to Perceived Tempo]
+
 
+
[http://ismir2006.ismir.net/PAPERS/ISMIR0683_Paper.pdf Joint Beat & Tatum Tracking From Music Signals]
+
 
+
[http://ismir2004.ismir.net/proceedings/p097-page-537-paper150.pdf Percussion Classification in Polyphonic Audio Recordings Using Localized Sound Models]
+
 
+
[http://ismir2004.ismir.net/proceedings/p066-page-357-paper250.pdf Rhythm and Tempo Recognition of Music Performance From a Probabilistic Approach]
+
 
+
[http://ismir2004.ismir.net/proceedings/p045-page-242-paper134.pdf A Comparison of Rhythmic Similarity Measures]
+
 
+
[http://ismir2004.ismir.net/proceedings/p034-page-170-paper217.pdf Query-by-Beat-Boxing: Music Retrieval for the DJ]
+
 
+
[http://ismir2007.ismir.net/proceedings/ISMIR2007_p293_volk.pdf Applying Rhythmic Similarity Based on Inner Metric Analysis to Folksong Research]
+
 
+
 
+
== Additional Potential References ==
+
[http://citeseer.ist.psu.edu/133145.html Tony Verma et al.: system for transient detection]
+
 
+
 
+
== Software Tools ==
+
[http://htk.eng.cam.ac.uk/ Hidden Markov Model Toolkit (HTK)]
+
HTK is primarily used for speech recognition research.
+
 
+
[http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html Hidden Markov Model (HMM) Toolbox for Matlab]
+

Latest revision as of 21:04, 31 December 2007

Please go here