Deep Waveform Synthesis

Date: Thu, 05/30/2019 - 5:30pm - 7:00pm

Location: CCRMA Class Room [Knoll 217]

Event Type: DSP Seminar

Abstract: Conventional audio synthesis (TTS, voice conversion, enhancement, etc.) often relies on acoustic feature representations (spectrogram, MFCC, F0, etc.) and a signal processing procedure that infers the waveform from these features. However, such procedures often introduce artifacts caused by insufficient information in the feature representation (e.g., iSTFT without the correct phase information) and/or an oversimplified synthesis process (e.g., a source-filter model). Recent advances address this problem using deep learning: WaveNet, for example, uses a dilated convolutional network to generate the waveform sample by sample, conditioned on acoustic features and previously generated samples. This new approach opens the door to end-to-end, high-quality audio synthesis that sounds almost real. In this talk, I will introduce some of the most notable deep waveform synthesis methods from the past three years and discuss the intuition behind them as well as future directions.
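To make the sample-by-sample idea in the abstract concrete, here is a minimal toy sketch in pure Python. It is not WaveNet and is not taken from the talk: the function names (`receptive_field`, `causal_dilated_conv`, `generate`), the shared two-tap weights, and the tanh nonlinearity are all illustrative assumptions. It only shows two things: how stacking dilated causal convolutions widens the span of past samples a layer stack can see, and how each generated sample depends on a conditioning feature plus earlier outputs.

```python
import math


def receptive_field(kernel_size, dilations):
    """Count of input samples (current plus past) seen by a stack of dilated causal convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], treating samples before t = 0 as zero."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # causal: never look at future samples
                acc += w * x[idx]
        y.append(acc)
    return y


def generate(n_samples, features, dilations, weights):
    """Toy autoregressive loop (hypothetical, untrained): sample t depends on features[t] and samples before t."""
    samples = []
    for t in range(n_samples):
        # Shift the history by one step so position t only sees samples generated before t.
        h = [0.0] + samples
        # The same weights are reused in every layer purely to keep the sketch short.
        for d in dilations:
            h = [math.tanh(v) for v in causal_dilated_conv(h, weights, d)]
        # Combine the network output with the conditioning feature for this time step.
        samples.append(math.tanh(h[-1] + features[t]))
    return samples


dilations = [1, 2, 4, 8]
features = [0.1 * i for i in range(8)]  # stand-in for acoustic features, one value per sample
audio = generate(8, features, dilations, weights=[0.5, 0.5])
```

With a kernel size of 2 and dilations 1, 2, 4, 8, `receptive_field(2, dilations)` is 16, which illustrates why doubling dilations lets a shallow stack cover a long history. A practical system would use trained weights, gated activations, and cached intermediate states rather than recomputing the whole history at every step as this sketch does.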

Bio: Zeyu is a research scientist at Adobe Research in San Francisco. His research interests are in speech and music synthesis, deep learning, and human-computer interaction. He received a Ph.D. in computer science from Princeton University, where he was advised by Adam Finkelstein, and an M.S. in music technology from Carnegie Mellon University. Between 2015 and 2017, he interned at Adobe three times and presented his branding research project, VoCo, at Adobe MAX Sneaks in 2016.

FREE

Open to the Public
