Deep Learning for Music Information Retrieval I: How Neural Networks Learn Audio

Workshop Date: 
Mon, 08/14/2023 - Fri, 08/18/2023


This workshop covers industry-standard methods for developing deep neural network architectures for digital audio using PyTorch. Over five immersive days of study, we will cover the theoretical and practical principles that deep learning researchers use every day in the real world. The schedule is:

 
Day 1: Cross entropy and feedforward neural networks
Math - Linear algebra and differential calculus review. The mathematics of feedforward neural networks. Activation functions. Batch Norm.
Theory - How synaptic neuroplasticity inspired the backpropagation algorithm.
Practice - Automating differentiation in a neural network with PyTorch.
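
As a rough preview of the Day 1 material, the sketch below computes a softmax cross-entropy loss and compares its hand-derived gradient against a numerical estimate. This is the kind of bookkeeping that automatic differentiation takes over. It is not workshop material: it uses only the Python standard library rather than PyTorch so that it runs anywhere, and the function names and example scores are made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[target])

def analytic_grad(logits, target):
    """Gradient of the loss with respect to the logits: softmax(logits) - one_hot(target)."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

def numeric_grad(logits, target, eps=1e-6):
    """Central finite differences: a slow numerical check on the analytic gradient."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        up[i] += eps
        down = list(logits)
        down[i] -= eps
        grads.append((cross_entropy(up, target) - cross_entropy(down, target)) / (2 * eps))
    return grads

logits = [2.0, 0.5, -1.0]  # hypothetical scores for three classes
target = 0                 # hypothetical index of the correct class
loss = cross_entropy(logits, target)
grad = analytic_grad(logits, target)
check = numeric_grad(logits, target)
```

Here `grad` and `check` should agree to several decimal places. A framework with automatic differentiation produces the same gradient without the derivative being written by hand.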

Day 2: Dimension reduction techniques for audio
Theory - Dimensionality reduction. Principal Component Analysis. Autoencoders.
Practice a) - Finding interpretable features in the TinySOL and EGFxSet datasets with PCA.
Practice b) - Writing an autoencoder to denoise audio in PyTorch.

Day 3: Convolutional neural networks
Theory - Convolution. Optimizers and momentum. Loss functions.
Practice - Writing a CNN for music genre classification.

Day 4: Temporal encoding with RNN, GRU, and WaveNet
Theory - Architecture and data flow of a Gated Recurrent Unit (GRU).
Practice a) - Writing an RNN and a GRU in PyTorch and using them for sound event classification.
Practice b) - Reading the seminal WaveNet paper.

Day 5: Generative Models
Theory - Probability review. Kullback-Leibler divergence. Variational autoencoders. Self-attention.
Practice - Writing a VAE and using its latent space to generate parameters for an audio synthesizer.
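
For orientation, the Kullback-Leibler divergence named in the Day 5 theory session measures how far a distribution p is from a reference distribution q. The second line gives its closed form for a common VAE setup, assuming a diagonal Gaussian encoder and a standard normal prior. That setup is an assumption made here for illustration and is not confirmed by the workshop description.

```latex
D_{\mathrm{KL}}(p \,\|\, q) = \int p(x) \, \log \frac{p(x)}{q(x)} \, dx

D_{\mathrm{KL}}\!\left( \mathcal{N}\big(\mu, \operatorname{diag}(\sigma^2)\big) \,\Big\|\, \mathcal{N}(0, I) \right)
  = \frac{1}{2} \sum_{j=1}^{d} \left( \mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1 \right)
```

Here d is the dimension of the latent space, and μ_j and σ_j are the mean and standard deviation that the encoder outputs for latent dimension j. The divergence is zero exactly when every μ_j = 0 and σ_j = 1.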


Enrollment Options:

In-person (CCRMA, Stanford) and online enrollment options are available during registration. Students receive the same teaching materials and have access to the same tutorials in either format. In-person students additionally get more in-depth, hands-on 1:1 discussion and feedback from the instructor.

About the instructor:

Iran R. Roman is a theoretical neuroscientist and machine listening scientist at New York University’s Music and Audio Research Laboratory. Iran is a passionate instructor, with extensive experience teaching artificial intelligence and deep learning. His industry experience includes deep learning engineering internships at Plantronics in 2017, Apple in 2018 and 2019, Oscilloscape in 2020, and Tesla in 2021. Iran’s research has focused on using deep learning for speech recognition and auditory scene analysis. iranroman.github.io

Scholarship opportunity:

https://docs.google.com/forms/d/e/1FAIpQLSdL4LWoX5EpYUEp0UMFUhhmgMWOHkd8VlF70G9BK8e3-AfX2w/viewform
