Audio Understanding and Room Acoustics in the Era of AI

Date: Fri, 10/14/2022, 3:30pm - 4:20pm
Location: CCRMA Classroom (Knoll 217)
Event Type: DSP Seminar
Abstract: This talk aims to bridge the gap between signal processing and the latest machine learning research by discussing several applications in music and audio. In the first part of the talk, we will discuss how classic signal processing properties can be used to spoon-feed powerful neural architectures such as Transformers to tackle a difficult signal processing task: re-reverberation (system identification) at scale. This work enables hearing any music in any concert hall or virtual environment. We use arbitrary recorded audio as an approximate proxy for a balloon pop, removing the need to measure room acoustics directly. This work has enormous applications in Virtual/Augmented Reality, and in the Metaverse if it happens!
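The classical operation underlying this task can be sketched in a few lines: a room acts as a linear time-invariant system, so rendering dry audio "in" a room is convolution with the room's impulse response (RIR). This is a minimal illustration of that baseline, not the neural method the talk describes; the signal and RIR below are toy placeholders.

```python
import numpy as np

def apply_room(dry: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Render a dry recording as if played in a room.

    A room is well modeled as a linear time-invariant system, so its
    effect is convolution with its impulse response. A balloon pop is
    approximately an impulse, which is why a recording of one in a hall
    approximates that hall's RIR.
    """
    return np.convolve(dry, rir)

# Toy check: an ideal impulse as the "room" leaves the signal unchanged
# (apart from trailing zeros), since x convolved with a delta is x.
dry = np.array([1.0, 0.5, -0.25])
ideal_rir = np.array([1.0, 0.0, 0.0])
wet = apply_room(dry, ideal_rir)
```

The interesting step, estimating the RIR from arbitrary audio rather than an actual impulse recording, is exactly what the neural model in the talk addresses.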

In the second part of the talk, we will turn to neural architectures without any convolution or recurrence and discuss how Transformer architectures have revolutionized machine listening. We will show how combining classic signal processing ideas, such as wavelets, with powerful Transformer architectures yields significant gains that neither could achieve individually. We will also discuss learning time-frequency representations that differ from the classic Fourier representations.
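As a reference point for the time-frequency representations mentioned above, the classic Fourier baseline that learned front ends are compared against, a magnitude spectrogram, can be computed in a few lines. This is a generic sketch, not the talk's method; the frame size and hop length are arbitrary illustrative choices.

```python
import numpy as np

def spectrogram(x: np.ndarray, n_fft: int = 256, hop: int = 128) -> np.ndarray:
    """Magnitude spectrogram: windowed frames -> FFT -> magnitude.

    Returns a (num_frames, n_fft // 2 + 1) matrix; each row is the
    spectrum of one short, Hann-windowed slice of the signal.
    """
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = np.stack([x[s:s + n_fft] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=-1))

# A pure sinusoid sitting exactly on FFT bin 32 should peak at
# column 32 in every frame.
n = np.arange(4 * 256)
tone = np.sin(2 * np.pi * 32 * n / 256)
spec = spectrogram(tone)
```

Learned front ends (wavelet-inspired filterbanks, Transformer token embeddings over raw audio) replace the fixed window and FFT above with trainable components.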

Finally, if time permits, we will go back in time and explore how one could build state-of-the-art architectures without the tools at our disposal today. Can we still do machine listening without advancements like attention, Transformers, convolutions, and recurrence? We show that one can still do a decent job with simple neural architectures developed back in 2006 and simple counting statistics, beating all previous architectures even as late as 2019! :)
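The abstract does not spell out the counting-statistics approach, but a generic "bag of audio words" baseline illustrates the flavor: quantize frame-level features against a codebook and represent a whole clip by counts alone, discarding temporal order. Everything below (the codebook, feature dimensions, toy data) is hypothetical and only sketches the idea.

```python
import numpy as np

def bag_of_words(features: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Represent a clip by counting codeword occurrences.

    features: (num_frames, dim) frame-level features (e.g. spectrogram rows).
    codebook: (num_codes, dim) reference vectors (e.g. from k-means).
    Returns a normalized histogram of nearest-codeword assignments:
    order is discarded and only counts remain.
    """
    # Squared distance from every frame to every codeword.
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    assignments = d.argmin(axis=1)
    counts = np.bincount(assignments, minlength=len(codebook))
    return counts / counts.sum()

# Hypothetical toy data: 2-D features tightly clustered around two centers,
# so 30 frames map to codeword 0 and 10 frames map to codeword 1.
rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
feats = np.vstack([rng.normal(0, 0.1, (30, 2)),
                   rng.normal(10, 0.1, (10, 2))])
hist = bag_of_words(feats, codebook)
```

A simple classifier on such histograms is one way "just counting" can be surprisingly competitive.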

The talk will be self-contained. Having taken a first course in audio/signal processing (knowing what spectrograms are) and knowing a small amount of machine learning (what feed-forward neural networks are) should be sufficient. That said, the talk will cover novel and state-of-the-art concepts and methods, so there will be something even for experienced graduate researchers and faculty from the music/NLP/signal processing/audio/AI fields.

This work was done with Chris Chafe and Jonathan Berger at CCRMA. In addition, we thank the Institute for Human-Centered AI at Stanford University (Stanford HAI) for supporting this work.

Bio: Prateek Verma is currently a researcher at Stanford working at the intersection of music, signal processing, audio, optimization, and neural architectures. He received his Master's degree from Stanford, and before that he was at IIT Bombay. He loves biking, hiking, and playing sports.
FREE
Open to the Public
CCRMA
Department of Music
Stanford University
Stanford, CA 94305-8180 USA
tel: (650) 723-4971
fax: (650) 723-8468
info@ccrma.stanford.edu