Difference between revisions of "MIR workshop 2008 notes"

From CCRMA Wiki
= Analysis / Decision Making =
== Classification ==
* Heuristic Analysis
* Distance measures (Euclidean, Manhattan, etc.)
* k-NN
* SVM / One-class SVM
** Resources:
*** [http://homepages.cae.wisc.edu/~ece539/matlab/ The interactive Matlab SVM Demo that I demonstrated on Lecture 5 comes from here]
*** [http://www.eee.metu.edu.tr/~alatan/Courses/Demo/AppletSVM.html A nice SVM java applet to demo the concepts]
*** [http://www.autonlab.org/tutorials/svm15.pdf Andrew Moore's SVM Powerpoint Lecture]
*** [http://www.kernel-machines.org/ User community of SVM enthusiasts]
*** [http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf A practical guide to SVM classification]
*** [http://www.kyb.tuebingen.mpg.de/bs/people/weston/svmpractical/ SVM Practical (How to get good results without cheating)]
*** [https://list.scms.waikato.ac.nz/pipermail/wekalist/2006-November/008533.html One-class SVM posting]
** Code:
*** [http://www.csie.ntu.edu.tw/~cjlin/libsvm/ libSVM (standalone, matlab, c, etc)]
*** [http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/ libSVM tools]

== Clustering and probability density models ==
* Density distance measures (centroid distance, EMD, KL-divergence, etc)
* k-Means
* [http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/AppletKM.html Clustering Demo]

=== Clustering ===
* GMM
** [http://www.inf.ed.ac.uk/teaching/courses/inf2b/learnnotes/inf2b-learn02-notes.pdf Simple review of probability with introduction of Bayes Rules ]
** [http://en.wikipedia.org/wiki/Conditional_probability Good description of conditional probabilities]
** [http://crow.ee.washington.edu/people/bulyko/papers/em.pdf EM explained]
** [http://www.cs.cmu.edu/~alad/em/ Expectation-Maximization Java Applet]
** [http://www.ee.columbia.edu/~dpwe/muscontent/ Lab featuring real-world GMM examples for singing detection]
** [http://www.ee.columbia.edu/~dpwe/e6820/outline.html Dan Ellis' Speech and Audio Processing Lectures]

=== HMM ===
* [http://www.mathworks.com/access/helpdesk/help/toolbox/stats/index.html?/access/helpdesk/help/toolbox/stats/f8368.html&http://www.google.com/search?q=As+an+example%2C+consider+a+Markov+model+with+two+states+and+six+possible+emissions.+The+model+uses%3A&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a Matlab Introduction to HMM functions]

== Nested classifier / Anchor-space / template-based systems ==
* ?

= Model / Data Preparation Techniques =

Revision as of 10:44, 12 July 2010

This page is intended to supplement the lecture material found in the class - providing extra tutorials, support, references for further reading, or demonstration code snippets for those interested in a given topic. Please contribute to this growing list of resources. Do you have a great explanation of how a technique works? Found a great Java applet that illustrates a concept? Discovered a great survey of the field for a particular area? Please add it for the benefit of future students. Thanks!

I encourage you to ADD links and sections - but please do not REMOVE headings or items from the page.

Timing and Segmentation

Onset Detection

  • Papers:
  • Code:

Beat Extraction

  • Papers:
  • Code:

Tempo Extraction

Feature Extraction

Low Level Features

  • Temporal features (Zero Crossing, Temporal centroid, Log Attack time, Attack slope), Spectral features (Centroid, Flux, RMS, Rolloff, Flatness, Kurtosis, Brightness), Spectral bands, Log spectrogram
  • Chroma bins
  • MFCC
  • MPEG-7
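
As a demonstration snippet for two of the low-level features listed above, here is a minimal sketch of the zero-crossing rate and the spectral centroid for a single frame. It uses only the Python standard library and a naive DFT, so it is meant for reading rather than speed; the function names are illustrative and do not come from any toolbox mentioned on this page.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, computed with a naive O(N^2) DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


# Usage: a 1 kHz sine sampled at 8 kHz (falls exactly on DFT bin 8 of 64).
sr = 8000
tone = [math.sin(2 * math.pi * 1000 * i / sr) for i in range(64)]
print(round(spectral_centroid(tone, sr)))          # close to 1000 for a pure tone
print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))  # every pair crosses: 1.0
```

In practice a frame would normally be windowed before the DFT, and an FFT routine would replace the naive transform.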

Higher-level features

  • Key Estimation
  • Chord Estimation
  • Genre (genre, artist ID, similarity)
  • "Fingerprints"

Visualizing and Sonifying Feature data

Analysis / Decision Making

Classification
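
For the distance measures and k-NN topics in this section, a minimal sketch of a k-nearest-neighbour classifier with a pluggable distance function may help. The feature vectors and the "speech"/"music" labels below are made-up toy data, and the function names are illustrative only.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, distance=euclidean):
    """Majority vote among the k training items closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors, already scaled to a similar range.
train = [
    ((0.1, 0.2), "speech"), ((0.2, 0.1), "speech"), ((0.15, 0.25), "speech"),
    ((0.8, 0.9), "music"), ((0.9, 0.8), "music"), ((0.85, 0.95), "music"),
]
print(knn_predict(train, (0.2, 0.2)))                      # speech
print(knn_predict(train, (0.7, 0.9), distance=manhattan))  # music
```

Because k-NN depends directly on distances, features with very different ranges should usually be scaled first, otherwise the largest-valued feature dominates the vote.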

Clustering and probability density models

  • Density distance measures (centroid distance, EMD, KL-divergence, etc)
  • k-Means
  • Clustering Demo
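
To go with the k-Means item above, here is a minimal sketch of Lloyd's algorithm in plain Python (3.8+ for `math.dist`). Starting the centroids at the first k points is an assumption made to keep the example deterministic; real implementations usually use random or k-means++ initialisation and several restarts.

```python
import math


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; centroids start at the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: centroids did not move
        centroids = new_centroids
    return centroids, assignment


# Toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = kmeans(points, k=2)
print(assignment)  # [0, 0, 0, 1, 1, 1]
```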

Clustering

HMM

Nested classifier / Anchor-space / template-based systems

  •  ?

Model / Data Preparation Techniques

Data Preparation

PCA / LDA

Scaling data
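
A minimal sketch of two common ways to scale one feature column: standardisation to zero mean and unit variance, and min-max scaling to a fixed range. The function names are illustrative. Whichever is used, the scaling parameters should be estimated on the training data and then reused unchanged on the test data.

```python
import math


def standardize(column):
    """Zero-mean, unit-variance scaling (population standard deviation)."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    if std == 0:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - mean) / std for x in column]


def min_max(column, lo=0.0, hi=1.0):
    """Linearly map the column so its minimum is lo and its maximum is hi."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        return [lo for _ in column]  # constant feature
    return [lo + (x - c_min) * (hi - lo) / (c_max - c_min) for x in column]


print(min_max([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```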

Model organization

  • concept, design, data set construction and organization

Evaluation Methodology

Feature selection

Cross Validation
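
A minimal sketch of how k-fold cross validation partitions a data set: every item lands in the test set exactly once, and the model is retrained on the remaining folds each time. The helper name is illustrative; in practice the items are usually shuffled (or stratified by class) before splitting.

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


for train, test in k_fold_indices(5, 2):
    print(train, test)
# [3, 4] [0, 1, 2]
# [0, 1, 2] [3, 4]
```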

Information Retrieval metrics (precision, recall, F-Measure)
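
A minimal sketch of the three metrics named above, computed from a set of retrieved items and a set of relevant (ground-truth) items. The function name and the toy item IDs are illustrative; `beta=1.0` gives the usual F1 measure.

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Return (precision, recall, F-measure) for two collections of item IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# 4 items retrieved, 3 items relevant, 2 in common:
# precision = 2/4, recall = 2/3.
print(precision_recall_f(["a", "b", "c", "d"], ["a", "b", "e"]))
```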

Real-world applications

Audio Segmentation

Automatic Audio Segmentation: Segment Boundary and Structure Detection in Popular Music

Audio Fingerprinting

The Last.fm fingerprinter uses this approach; its code can be checked out from: svn://svn.audioscrobbler.net/recommendation/MusicID/lastfm_fplib

Drum Transcription

Audio Similarity

Music Recommendation / Playlisting

Getting Involved in the MIR Community

Research Databases / Collections of Ground truth data and copyright-cleared music


General MIR Datasets


Download links for the ISMIR 2004 genre classification contest training set:


Tags:

More:


From Georg Holzmann: 
LIST OF PUBLICLY AVAILABLE MIR DATASETS
Downloadable Datasets:
- University of Iowa musical instruments samples:
   http://theremin.music.uiowa.edu/MIS.html
   Instrument samples recorded by the University of Iowa

- ISMIR2004 Audio Description Contest Dataset:
   http://ismir2004.ismir.net/ISMIR_Contest.html
   Datasets for
   - Genre Classification/Artist Identification
   - Melody Extraction
   - Tempo Induction
   - Rhythm Classification

- Graham's Melody Extraction Dataset:
   http://www.ee.columbia.edu/~graham/mirex_melody/
   http://labrosa.ee.columbia.edu/projects/melody/
   Audio files with corresponding pitch data

- MIREX06 Audio Tempo Extraction and Beat Tracking Datasets:
   http://www.music-ir.org/mirex/2006/index.php/Audio_Tempo_Extraction#Practice_Data

- QBSH: A Corpus for Designing QBSH (Query by Singing/Humming) Systems
   http://neural.cs.nthu.edu.tw/jang2/dataSet/childSong4public/QBSH-corpus/

- Uni Dortmund Music Audio Benchmark Data Set:
   http://www-ai.cs.uni-dortmund.de/audio.html
   Songs from different genres and with tags (from garageband.com)

- Latin Music Database:
   http://www.ppgia.pucpr.br/~silla/lmd/
   3,160 music pieces in MP3 format classified into 10 different musical genres
   (only features online)


Orderable Datasets:
- RWC Music Database:
   http://staff.aist.go.jp/m.goto/RWC-MDB/
   (many CDs)
   Datasets for
   - Pop Music & Royalty-Free Music
   - Classical Music
   - Jazz Music
   - Music Genre
   - Musical Instrument Sound

   Additional: AIST RWC Annotations
   http://staff.aist.go.jp/m.goto/RWC-MDB/AIST-Annotation/
   Additional annotations to the RWC database (beat, melody, ...)

- McGill University Master Samples:
   http://www.music.mcgill.ca/resources/mums/html/
   3 DVDs with instrument samples

- USPOP2002 Pop Music data set:
   http://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html
   (3 DVDs)
   MFCC features from 706 albums and 8764 tracks (400 artists)
   with style tags

- ENST-Drums:
   http://perso.telecom-paristech.fr/~gillet/ENST-drums/
   An extensive audio-visual database for drum signals processing


Free Online Music:

- magnatune.com creative commons music:
   http://magnatune.com/info/press/coverage/ccblog

- http://www.garageband.com/
   Public domain recordings

- http://epitonic.com/
   "high quality free and legal mp3 music"

- http://www.jamendo.com/
   Creative commons licensed music

- http://musicbrainz.org/
   Get music metadata

- http://www.freesound.org/
   Collaborative database of Creative Commons licensed sounds
   (not focused on songs)

Webservices:
- Networked Environment for Music Analysis:
   http://nema.lis.uiuc.edu/
   A webservices system for submitting code, running it against virtual collections
   (full use in 2010)

- MIREX DIY Framework:
   http://www.music-ir.org/mirexdiy/
   http://www.dlib.org/dlib/december06/downie/12downie.html
   (usable?)

MIR Software and Toolboxes

Incomplete but growing list (courtesy of Joern Loviscach):
* MARSYAS
* jAudio
* ChucK
* Sonic Visualiser / Sonic Annotator
* CLAM
* Music-to-Knowledge (M2K)
* MIRtoolbox
* MA toolbox
* Psysound
* Praat
* IPEM
* EchoNest
* libxtract
* MuBu
* Soundspotter
* timbreID
* openSMILE
* MPEG-7 XM
* MPEG-7 Audio Encoder
* MPEG-7 Audio Analyzer
* Sphinx 4 - Java-based open-source speech recognizer  http://cmusphinx.sourceforge.net/sphinx4/#capabilities

MIR Topic Areas

From Simon Dixon, Music-IR list, Dec 2008.

MIR Systems
- Content-based Querying
- Classification (genre/style/mood)
- Recommendation / playlist generation
- Fingerprinting / DRM
- Score following / Audio alignment
- Transcription / Annotation
- Tempo induction / Beat tracking
- Summarisation
- Streaming
- Text/web mining
- Optical music recognition
- Database systems / indexing / query languages

Human issues
- user interfaces, user models
- emotion, aesthetics
- perception, cognition
- social issues
- legal and ethical issues
- business issues
- methodological and philosophical issues

Data and metadata
- audio
- MIDI
- score
- text/web
- KR schemes, standards and protocols
- libraries and collections
- test sets and evaluation

Musical knowledge
- Melody and motives
- Harmony, chords and tonality
- Rhythm, beat, tempo and form
- Timbre, instrumentation and voice
- Genre, style and mood
- Performance
- Composition
- Ethnomusicology