François Germain: Towards practical source-independent algorithms using nonnegative matrix factorization

Date: Wed, 04/23/2014 - 5:15pm - 7:00pm
Location: CCRMA Classroom, The Knoll 2nd floor, Rm 217
Event Type: Colloquium
A key limitation of nonnegative matrix factorization (NMF), its reliance on training data for at least one source, was recently circumvented through the development of "universal source models", which exploit the similarities within a given class of sources to eliminate the need for user-provided training data. The resulting system is unsupervised from the user's perspective, which considerably widens its range of practical use. This method has been applied to tasks such as offline speech enhancement, voice activity detection, and singing voice separation.

Full abstract:
Nonnegative matrix factorization (NMF) is a well-known technique in sound and music processing. Primarily applied to source separation, it has also been extended to applications such as pitch detection and music transcription. One limitation of NMF is that it needs training data for at least one of the sources in the signal before a system can be deployed and run without user assistance. This limitation was recently circumvented through the development of "universal source models", which exploit the similarities within a given class of sources to eliminate the need for user-provided training data. The resulting system is unsupervised from the user's perspective, which considerably widens its range of practical use. This method has been applied to tasks such as offline speech enhancement, voice activity detection, and singing voice separation. More recently, it was also successfully integrated into an online speech enhancement framework. In addition, recent results on the convergence properties of NMF-based algorithms have brought to light potential gains in speed and performance.
This presentation will cover the details of these recent developments, along with experimental results and a discussion of the convergence properties of the algorithms.

Bio:
François Germain is a 3rd-year PhD student at the Center for Computer Research in Music and Acoustics (CCRMA), under the supervision of Julius O. Smith III. He holds a Master of Arts in Music Technology from McGill University (Montreal, QC). His current research interests include machine learning, audio signal processing, nonlinear system modeling, and sound field rendering.
Open to the Public