Intro to Music Information Retrieval
This workshop will teach the underlying concepts, approaches, technologies, and practical applications of audio systems that use Music Information Retrieval (MIR) algorithms. It is the first of a three-workshop series; students may choose to enroll in just this workshop, the first two, or the full sequence.
MIR is an interdisciplinary field with exciting research topics and practical applications in the audio and technology industries. MIR bridges music and technology, drawing on audio signal processing, machine learning, software design, music theory, and more to approach cross-disciplinary tasks. MIR algorithms allow a computer to "listen" to audio data and extract information from it, enabling systems that perform tasks like music recommendation, sorting, searching, transcription, and music generation.
This course will cover:
• A broad introduction to common topics in MIR
• Practical tools for working on MIR tasks in Python
• An introduction to the current state of the field and its open challenges
Prerequisites
• Programming experience equivalent to 1 year of undergraduate computer science. We will be using Python.
• Basic knowledge of signal processing
• Basic knowledge of music theory
Topics
• Music Formats, Time-Frequency Representations
• Novelty Detection
• Sound Classification
• Pitch Tracking
• Beat Tracking
• Chord Recognition
• Music Similarity
• MIR Applications
Day 1
Music Formats: Sheet Music, MIDI, Hz, wav/mp3 files
Python Review, Intro to libraries, Librosa
Waveforms, DFT, Spectrograms
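To make the Day 1 material concrete, here is a minimal sketch of a magnitude spectrogram built from NumPy alone (in the workshop itself, librosa.stft does this job); the frame size, hop size, and 440 Hz test tone are illustrative choices, not course requirements:

```python
import numpy as np

def stft_magnitude(signal, frame_size=1024, hop=512):
    """Magnitude spectrogram: windowed frames -> DFT -> absolute value."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_size] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape (n_frames, frame_size // 2 + 1)

sr = 22050
t = np.arange(sr) / sr                 # one second of samples
tone = np.sin(2 * np.pi * 440.0 * t)   # A4 test tone
S = stft_magnitude(tone)
peak_hz = S.mean(axis=0).argmax() * sr / 1024
# peak_hz lands within one frequency bin (sr/1024, about 21.5 Hz) of 440
```

The bin-width calculation in the last line previews the time-frequency trade-off the session discusses: longer frames give finer frequency resolution but coarser timing.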
Day 2
Onset detection, novelty detection
MFCCs; genre categorization project
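The onset/novelty-detection session can be previewed with spectral flux, a classic novelty curve: sum the positive frame-to-frame change in spectral magnitude, and peaks mark likely onsets. This sketch uses NumPy only (librosa.onset provides the practical tooling); the silence-then-noise signal is a made-up test case:

```python
import numpy as np

def spectral_flux(signal, frame_size=1024, hop=512):
    """Novelty curve: half-wave-rectified change in spectral magnitude."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    mags = np.abs(np.fft.rfft(
        np.stack([signal[i * hop : i * hop + frame_size] * window
                  for i in range(n_frames)]),
        axis=1))
    diff = np.diff(mags, axis=0)
    return np.maximum(diff, 0.0).sum(axis=1)

sr = 22050
rng = np.random.default_rng(0)
# Half a second of silence, then half a second of noise: one onset at 0.5 s.
sig = np.concatenate([np.zeros(sr // 2), 0.5 * rng.standard_normal(sr // 2)])
novelty = spectral_flux(sig)
onset_frame = int(novelty.argmax())
onset_time = onset_frame * 512 / sr   # close to 0.5 s, within a frame or two
```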
Day 3
Pitch Tracking (Spectrogram, YIN, Crepe)
Beat Tracking
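The pitch-tracking session covers YIN, whose core idea — find the lag at which a frame best matches a time-shifted copy of itself — can be sketched with plain autocorrelation. This simplified estimator omits YIN's difference function and cumulative-mean normalization; the search range of 80–1000 Hz is an illustrative choice:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=80.0, fmax=1000.0):
    """Autocorrelation pitch estimate: the lag of the strongest self-match."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # lags 0..N-1
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    best_lag = lag_min + int(ac[lag_min:lag_max].argmax())
    return sr / best_lag

sr = 22050
t = np.arange(2048) / sr
f0 = estimate_f0(np.sin(2 * np.pi * 220.0 * t), sr)  # close to 220 Hz
```

Resolution is limited to integer sample lags (here 22050/100 vs. 22050/101 Hz); YIN and CREPE refine this with interpolation and learned features, respectively.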
Day 4
Chord Recognition, Chroma
Human-Centered MIR, Human Subjects Research
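Chroma features, the basis of the chord-recognition session above, fold spectral energy into 12 pitch classes so that octaves collapse together. A toy NumPy version (librosa.feature.chroma_stft is the practical tool) might look like this; the A-major-triad test signal is a made-up example:

```python
import numpy as np

def chroma_vector(signal, sr):
    """Fold spectral magnitude into 12 pitch classes (0 = A, by convention here)."""
    mags = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    chroma = np.zeros(12)
    for f, m in zip(freqs[1:], mags[1:]):   # skip the DC bin
        pitch_class = int(round(12 * np.log2(f / 440.0))) % 12
        chroma[pitch_class] += m
    return chroma / chroma.max()

sr = 22050
t = np.arange(4096) / sr
# An A major triad: A4, C#5, E5
chord = sum(np.sin(2 * np.pi * f * t) for f in (440.0, 554.37, 659.25))
top3 = set(np.argsort(chroma_vector(chord, sr))[-3:])
# top3 holds the pitch classes of A, C#, and E: {0, 4, 7}
```

Matching a chroma vector like this against templates for each chord type is the simplest chord-recognition baseline.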
Day 5
MIR Applications: Music recommendation, audio fingerprinting
Open Time, Review with Instructors, Q&A
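Audio fingerprinting, one of the Day 5 applications, reduces audio to compact hashes that can be matched against a database. This toy sketch hashes only the strongest spectral peak per frame; production systems (Shazam-style landmark hashing, for example) pair peaks across time for robustness to noise. All parameters here are illustrative:

```python
import numpy as np

def fingerprint(signal, sr, frame_size=2048, hop=1024):
    """Toy fingerprint: the strongest spectral-peak bin of each frame."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    peaks = []
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_size] * window
        peaks.append(int(np.abs(np.fft.rfft(frame)).argmax()))
    return tuple(peaks)

sr = 22050
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 523.25 * t)   # one second of C5
# Identical audio yields an identical fingerprint; a different pitch does not.
same = fingerprint(clip, sr) == fingerprint(clip, sr)
```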
About the instructors:
Elena Georgieva is a PhD student and researcher at NYU’s Music and Audio Research Lab (MARL). Before joining MARL, Elena taught sound recording and managed the recording studio at CCRMA, where she completed her master’s degree in Music, Science, and Technology in 2019. Elena has expertise in music information retrieval, machine learning, sound recording, and vocals. She has presented her work at the ISMIR and ICML conferences, at Stanford and Berkeley, and at several tech companies. www.elenatheodora.com
Iran R. Roman is a theoretical neuroscientist and machine listening scientist at New York University’s Music and Audio Research Laboratory. Iran is a passionate instructor, with extensive experience teaching artificial intelligence and deep learning. His industry experience includes deep learning engineering internships at Plantronics in 2017, Apple in 2018 and 2019, Oscilloscape in 2020, and Tesla in 2021. Iran’s research has focused on using deep learning for speech recognition and auditory scene analysis. iranroman.github.io