Deep Learning for MIR II: State-of-the-art Algorithms
A survey of cutting-edge research in MIR using Deep Learning, presented by the instructors and a lineup of guest speakers leading research in industry and academia. This workshop is meant for individuals who want to gain experience applying Deep Learning to an MIR problem of interest to them. The instructors will explain and demonstrate the concepts behind models used in cutting-edge industry and academic research. Students will build and train state-of-the-art models using TensorFlow and GPU computing, adapting them to an MIR-related problem of their choosing. The instructors will also serve as on-demand advisors to students throughout the course.
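Since the hands-on work relies on TensorFlow and GPU computing, a generic sanity check like the following (not an official course setup script) confirms before the first session that TensorFlow can see a GPU:

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; an empty list means training will run on CPU.
gpus = tf.config.list_physical_devices('GPU')
print(f"TensorFlow {tf.__version__}; GPUs visible: {gpus}")
```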
In-person (CCRMA, Stanford) and online enrollment options are available during registration (see the red button above). Students will receive the same teaching materials and have access to the same tutorials in either format; in-person students will additionally have access to more in-depth, hands-on 1:1 discussion and feedback from the instructors.
Theory covered includes: generative models, self-supervised feature learning, and attention mechanisms.
Models covered include: DeepSpeech, Transformer, CREPE, and GrFNNs.
Practice: music and speech recognition/synthesis, beat tracking, music recommendation, and semantic analysis.
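To give a flavor of the attention mechanisms and Transformer models listed above, here is a minimal sketch of scaled dot-product attention in TensorFlow; the function name, tensor shapes, and toy input are illustrative only and are not taken from the course materials:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_model) tensors.
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    # Compare every query with every key, scaled by sqrt(d_k) for numerical stability.
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
    # Softmax turns the scores into attention weights over the keys.
    weights = tf.nn.softmax(scores, axis=-1)
    # Each output frame is a weighted sum of the values.
    return tf.matmul(weights, v), weights

# Toy self-attention over 8 frames of 64-dimensional audio features.
x = tf.random.normal((1, 8, 64))
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)  # (1, 8, 64) (1, 8, 8)
```

Stacking blocks like this with feed-forward layers and positional encodings is, in essence, how the Transformer (and Transformer-based MIR models) is built.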
Prerequisites:
- Deep Learning for MIR I Workshop (August 8–12, 2022)
About the instructors:
Camille Noufi is a PhD student and researcher at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University. Camille studies machine generation of expressive communication and the acoustic impact of the environment on the voice. Her interdisciplinary research combines signal processing (DSP), machine learning (ML), and human-computer interaction (HCI) with psychology and vocal science. She was a research intern on the Audio Team at Meta Reality Labs in 2020. Before coming to CCRMA, she worked on audio scene analysis and vocal biomarker research at MIT Lincoln Laboratory. Her research has been presented at the Interspeech, ISMIR, and ICML conferences. camillenoufi.com
Iran R. Roman is a theoretical neuroscientist and machine listening scientist at New York University’s Music and Audio Research Laboratory. Iran is a passionate instructor with extensive experience teaching artificial intelligence and deep learning. His industry experience includes deep learning engineering internships at Plantronics in 2017, Apple in 2018 and 2019, Oscilloscape in 2020, and Tesla in 2021. Iran’s research has focused on using deep learning for speech recognition and auditory scene analysis. iranroman.github.io
IMPORTANT: Contact the instructors before registering to confirm your eligibility. Attach a copy of your registration or diploma for the CCRMA Deep Learning for MIR I workshop.
Scholarship opportunity: https://docs.google.com/forms/d/e/1FAIpQLSdL4LWoX5EpYUEp0UMFUhhmgMWOHkd8...