Programs and book for building an audio coder and for deep learning for audio

Date: 
Fri, 10/07/2022 - 3:30pm - 4:30pm
Location: 
CCRMA Classroom [Knoll 217]
Event Type: 
Hearing Seminar
This quarter's Hearing Seminar is dedicated to the theme "What I did during my pandemic". First up is Prof. Gerald Schuller, talking about the book and code he wrote to build better audio coders and deep neural networks. This is a joint seminar with the DSP Seminar, so it will take place at 3:30 PM in the classroom.

Abstract:
Audio coding has become a ubiquitous tool for transmitting and storing audio signals, for instance as part of high-quality teleconferencing, as with "Facetime" or similar applications, or for listening to music in almost every way. A more recent tool for adaptive audio processing is deep learning, which so far has mostly been used for image and speech processing, but increasingly also for audio processing. All of these can be implemented and experimented with in Python, which allows for fast prototyping and is also the programming language commonly used for deep learning.
This talk will present, in a sense, “what I did during my pandemic”: Python tools and examples for building an audio coder, and examples for my tutorial on "Deep Learning for Audio". These tools are available in public GitHub repositories together with Python Colab notebooks, and are described in my new book "Filter Banks and Audio Coding - Compressing Audio Signals Using Python" (Springer) and in videos on YouTube.
Links to the corresponding repositories:
https://github.com/TUIlmenauAMS/AudioCoding_Tutorials
https://github.com/TUIlmenauAMS/Python-Audio-Coder
https://github.com/TUIlmenauAMS/AES_Tutorial_2021

Short Bio:
Gerald Schuller has been a full professor at the Institute for Media Technology of the Technical University of Ilmenau since 2008. He was head of the Audio Coding for Special Applications group of the Fraunhofer Institute for Digital Media Technology in Ilmenau, Germany, from January 2002 until 2008, and is now a member of Fraunhofer IDMT. Before joining the Fraunhofer Institute, he was a Member of Technical Staff at Bell Laboratories, Lucent Technologies, and Agere Systems, a Lucent spin-off, from 1998 to 2001, where he worked in the Multimedia Communications Research Laboratory. He received his Diplom degree in Electrical Engineering from the Technical University of Berlin in 1989 and his Ph.D. (Dr.-Ing.) degree from the University of Hanover in 1997. He also studied at the Massachusetts Institute of Technology in 1989/90 and at the Georgia Institute of Technology in 1993. He was Associate Editor of the IEEE Transactions on Speech and Audio Processing from 2002 until 2006, of the IEEE Transactions on Signal Processing from 2006 to 2009, and of the IEEE Transactions on Multimedia from 2008 to 2011. He is a recipient of the 2006 IEEE Best Paper Award in the Audio and Electroacoustics Area. His research interests are in filter banks, audio coding, music signal processing, and deep learning for multimedia. He is probably best known for his work on low delay filter banks, which became part of the MPEG-4 ELD-AAC audio coding standard. That standard is now part of the iOS and Android operating systems and is used, for instance, in the Facetime application.


FREE
Open to the Public