Speech Denoising with Deep Feature Losses

Title: Speech Denoising with Deep Feature Losses
Publication Type: Journal Article
Year of Publication: 2018
Authors: Germain, F. G., Q. Chen, and V. Koltun
Journal: arXiv:1806.10522
Date Published: 06/2018
Type of Article: arXiv eprint
Abstract

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly. Given input audio containing speech corrupted by an additive background signal, the system aims to produce a processed signal that contains only the speech content. Recent approaches have shown promising results using various deep network architectures. In this paper, we propose to train a fully-convolutional context aggregation network using a deep feature loss. That loss is based on comparing the internal feature activations in a different network, trained for acoustic environment detection and domestic audio tagging. Our approach outperforms the state-of-the-art in objective speech quality metrics and in large-scale perceptual experiments with human listeners. It also outperforms an identical network trained using traditional regression losses. The advantage of the new approach is particularly pronounced for the hardest data with the most intrusive background noise, for which denoising is most needed and most challenging.
Code | Audio examples

URL: https://arxiv.org/abs/1806.10522
Refereed Designation: Non-Refereed
Full Text: https://arxiv.org/pdf/1806.10522
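
The deep feature loss described in the abstract compares internal activations of a separate, frozen audio classification network, rather than raw waveform samples, between the denoised output and the clean reference. The sketch below illustrates that idea in PyTorch; the feature network, its `.layers` attribute, the number of matched layers, and the per-layer L1 distance are assumptions for illustration only, not the authors' released implementation (see the Code link above).

    # Minimal sketch of a deep feature loss: compare internal activations of a
    # frozen, pretrained audio classifier (e.g. one trained for acoustic scene
    # detection / audio tagging) between denoised output and clean reference.
    # The feature network and its `.layers` attribute are hypothetical.
    import torch
    import torch.nn as nn

    class DeepFeatureLoss(nn.Module):
        def __init__(self, feature_net, num_layers=6, layer_weights=None):
            super().__init__()
            self.feature_net = feature_net.eval()
            for p in self.feature_net.parameters():
                p.requires_grad_(False)          # loss network stays fixed
            self.num_layers = num_layers
            self.layer_weights = layer_weights or [1.0] * num_layers

        def _activations(self, waveform):
            # Collect intermediate activations from the first `num_layers`
            # convolutional blocks of the (assumed) feature network.
            feats, h = [], waveform
            for layer in list(self.feature_net.layers)[: self.num_layers]:
                h = layer(h)
                feats.append(h)
            return feats

        def forward(self, denoised, clean):
            # denoised, clean: raw waveforms shaped (batch, 1, samples)
            feats_d = self._activations(denoised)
            feats_c = self._activations(clean)
            loss = denoised.new_zeros(())
            for w, a, b in zip(self.layer_weights, feats_d, feats_c):
                loss = loss + w * torch.mean(torch.abs(a - b))
            return loss

    # Usage sketch: gradients flow into the denoiser only, not the loss network.
    # loss_fn = DeepFeatureLoss(pretrained_tagging_net)
    # loss = loss_fn(denoiser(noisy), clean)
    # loss.backward()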