Large-Scale Content-Based Matching of Audio and MIDI Data
Colin Raffel, former CCRMA MA/MST student, current PhD student at LabROSA
MIDI files can be used as ground truth for many music information retrieval tasks, including automatic transcription, onset detection, and chord recognition. However, each MIDI file must first be matched and aligned to an audio recording of the song it was transcribed from. We present a system that efficiently matches and aligns MIDI files to entries in a large corpus of audio content without using any metadata. Our system combines several techniques: MIDI-to-audio alignment with confidence reporting, cross-modality similarity-preserving hashing, fast time series search, and ground-truth extraction from MIDI data. In this talk, we discuss the details of our system and its application to the task of matching a large corpus of MIDI files to the Million Song Dataset.
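As a rough illustration of why similarity-preserving hashing makes search over a large corpus fast, the sketch below compares sequences of binary hashes using Hamming distance. It is not the system described in the talk: the function names (`hamming`, `sequence_distance`, `best_match`), the 8-bit hash values, and the track keys are all hypothetical, and the sketch assumes the frames are already time-aligned, whereas a real matcher would need some form of time warping.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded binary hashes."""
    return bin(a ^ b).count("1")


def sequence_distance(query, candidate):
    """Mean per-frame Hamming distance over the overlapping frames.

    Assumption: frames are already time-aligned; a real system would
    need a time-warping step here.
    """
    n = min(len(query), len(candidate))
    if n == 0:
        return float("inf")
    # zip stops at the shorter sequence, so exactly n pairs are compared.
    return sum(hamming(q, c) for q, c in zip(query, candidate)) / n


def best_match(query, database):
    """Return (key, distance) of the closest candidate sequence in `database`."""
    return min(
        ((key, sequence_distance(query, seq)) for key, seq in database.items()),
        key=lambda item: item[1],
    )


# Toy data: hypothetical 8-bit hashes, one per frame.
midi_hashes = [0b10110010, 0b10110011, 0b01100110]
audio_db = {
    "track_a": [0b10110010, 0b10110111, 0b01100110],
    "track_b": [0b00001111, 0b11110000, 0b10101010],
}
match, distance = best_match(midi_hashes, audio_db)
```

Here `match` is the key of the audio entry whose hash sequence is closest to the MIDI hash sequence. Because each comparison is an XOR followed by a bit count, scanning many candidates is cheap compared with comparing raw spectral features.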