Audio Style Transformations using Deep Neural Networks by Prateek Verma
Deep Neural Networks (DNNs) have been wildly successful at many tasks, but none is as wondrous as the success DNNs have had at transferring the style of one painting to another painter's art. This magical trick is accomplished by mixing and matching the low-level feature-analysis layers between different styles of painting, so that a painting in one style is rendered in another, with different kinds of brushwork.
But can we do this for audio? Prateek Verma has been experimenting with this, and will talk about his work and results. What does audio style mean, and how does one capture it?
Who: Prateek Verma (Stanford CCRMA)
What: Audio Style Transformations using Deep Neural Networks
When: 10:30AM on Friday, January 12, 2018
Where: CCRMA Seminar Room
Why: Style transfer is cool and mysterious
This is the first Hearing Seminar of the new quarter. Bring your own style, and we’ll talk about how DNNs can change it.
- Malcolm
Audio Style Transformations using Deep Neural Networks
Prateek Verma and Julius Smith - Stanford CCRMA
There has been fascinating work by Gatys et al. on creating artistic transformations of images. It was revolutionary in showing how we can, in some sense, alter the “style” of an image while generally preserving its “content”. In our work, we present a method for creating new sounds using a similar approach, treating it as a style-transfer problem: starting from a random-noise input signal, we iteratively use back-propagation to optimize the sound to conform to filter outputs from a pre-trained neural architecture of interest.
For demonstration, we investigate two different tasks: bandwidth expansion/compression, and timbral transfer from singing voice to musical instruments. A feature of our method is that a single architecture, with a single set of parameters, can generate these different audio-style-transfer types, which would otherwise require diverse, complex, hand-tuned signal-processing pipelines. We will also discuss closely related work published by Google Research on style transfer for speech, where the content is the words being spoken and the style is the speaker. Finally, we will motivate the plethora of applications possible within this framework via simple tweaks of the loss functions. :)
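A minimal sketch of the iterative optimization described in the abstract, using NumPy with a single fixed random filterbank standing in for the pre-trained network (all names, sizes, and loss weights here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "pretrained" layer: a fixed random filterbank applied to
# signal frames. A real system would use a trained audio network here.
n_frames, frame_len, n_filters = 8, 64, 16
W = rng.standard_normal((frame_len, n_filters)) / np.sqrt(frame_len)

def features(x):
    return x @ W          # (n_frames, frame_len) -> (n_frames, n_filters)

def gram(F):
    # Gram matrix of filter activations: the "style" statistics (Gatys et al.)
    return F.T @ F / F.shape[0]

content = rng.standard_normal((n_frames, frame_len))  # placeholder content sound
style = rng.standard_normal((n_frames, frame_len))    # placeholder style sound
F_c, G_s = features(content), gram(features(style))

alpha, beta, lr = 1.0, 0.1, 0.05  # content weight, style weight, step size
x = rng.standard_normal((n_frames, frame_len))  # start from random noise

losses = []
for step in range(200):
    F = features(x)
    dC = F - F_c                           # grad of content loss wrt features
    dS = F @ (gram(F) - G_s) / F.shape[0]  # grad of style loss wrt features
    losses.append(alpha * 0.5 * np.sum(dC**2)
                  + beta * 0.25 * np.sum((gram(F) - G_s)**2))
    x -= lr * (alpha * dC + beta * dS) @ W.T  # back-propagate to the signal

# x now matches the content's features and the style's Gram statistics
# more closely than the initial noise did.
```

Swapping which statistics appear in the loss (e.g. matching Gram matrices of a bandlimited signal's activations) is what lets one framework cover tasks as different as bandwidth expansion and timbral transfer.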
Biography
Prateek Verma is a CCRMA MA/MST graduate interested in audio processing, generation, and analysis. Before coming to Stanford, he graduated from IIT Bombay in Electrical Engineering. He has held research positions in the Artificial Intelligence Lab of Stanford's Computer Science Department.