Programs and book for building an audio coder and for deep learning for audio

Date:

Fri, 10/07/2022 - 3:30pm - 4:30pm

Location:

CCRMA Classroom [Knoll 217]

Event Type:

Hearing Seminar

This quarter's Hearing Seminar is dedicated to "What I did during my pandemic?" First up is Prof. Gerald Schuller, talking about the book and code he wrote to build better audio coders and deep neural networks. This is a joint seminar with the DSP seminar, so it will take place at 3:30 PM in the classroom.

Abstract:
Audio coding has become a ubiquitous tool for transmitting and storing audio signals, for instance as part of high-quality teleconferencing, as with "FaceTime" or similar applications, or for listening to music in almost every form. A recent tool for adaptive audio processing is deep learning, which so far has mostly been used for image and speech processing, but increasingly also for audio processing. All of these can be implemented and experimented with in Python, which allows for fast prototyping and is also the programming language for deep learning.
In a way, this talk will present "what I did during my pandemic": Python tools and examples for building an audio coder, and examples for my tutorial on "Deep Learning for Audio". These tools are in public GitHub repositories together with Python Colab notebook descriptions, and are described in my new book, "Filter Banks and Audio Coding - Compressing Audio Signals Using Python" (Springer), and in videos on YouTube.
Links to the corresponding repositories:
https://github.com/TUIlmenauAMS/AudioCoding_Tutorials
https://github.com/TUIlmenauAMS/Python-Audio-Coder
https://github.com/TUIlmenauAMS/AES_Tutorial_2021

Short Bio:
Gerald Schuller has been a full professor at the Institute for Media Technology of the Technical University of Ilmenau since 2008. He was head of the Audio Coding for Special Applications group of the Fraunhofer Institute for Digital Media Technology in Ilmenau, Germany, from January 2002 until 2008, and is now a member of Fraunhofer IDMT. Before joining the Fraunhofer Institute, he was a Member of Technical Staff at Bell Laboratories, Lucent Technologies, and Agere Systems, a Lucent spin-off, from 1998 to 2001. There he worked in the Multimedia Communications Research Laboratory. He received his Diplom degree in Electrical Engineering from the Technical University of Berlin in 1989 and his Ph.D. (Dr.-Ing.) degree from the University of Hanover in 1997. He also studied at the Massachusetts Institute of Technology in 1989/90 and at the Georgia Institute of Technology in 1993. He was Associate Editor of the IEEE Transactions on Speech and Audio Processing from 2002 until 2006, of the IEEE Transactions on Signal Processing from 2006 to 2009, and of the IEEE Transactions on Multimedia from 2008 to 2011. He is a recipient of the 2006 IEEE Best Paper Award in the Audio and Electroacoustics Area. His research interests are in filter banks, audio coding, music signal processing, and deep learning for multimedia. He is probably best known for his work on low delay filter banks, which became part of the MPEG-4 ELD-AAC audio coding standard. That standard is now part of the iOS and Android operating systems and is used, for instance, in the FaceTime application.

FREE

Open to the Public
