Anssi Klapuri - From Time-Frequency to Time-Pitch Domain: Psychoacoustic vs. Data-Driven Approach

Date: 
Mon, 03/12/2012 - 11:00am - 12:30pm
Location: 
CCRMA Seminar Room
Event Type: 
Hearing Seminar
I'm very happy to introduce Anssi Klapuri to the Hearing Seminar community. Anssi has done some of the best work in the world on multi-source pitch modeling. Just how do we identify the sources and pitches of a complicated (musical) signal?  Or do we, instead, identify the pitches and then the sources?  It's a hard problem, and the systems to beat all have Anssi's name on them.

    Who:    Anssi Klapuri (Queen Mary, University of London and Ovelin)
    Why:    Multisource pitch perception is hard!!!
    What:    From time-frequency to time-pitch domain: psychoacoustic vs. data-driven approach
    When:    Monday March 12 at 11AM <<<< Note special time!!!
    Where:    CCRMA Seminar Room, Top Floor of the Knoll at Stanford.

We always have a good discussion at the Hearing Seminar.  Lots of voices, and a big multi-pitch separation problem.  Anssi will present his model.

See you Monday morning at CCRMA!!!

- Malcolm
From time-frequency to time-pitch domain: psychoacoustic vs. data-driven approach
 
The human auditory system tends to summarize certain properties of a complex sound with a single frequency value that we call pitch. We do not treat complex sounds as collections of sinusoids (at least not on a conscious level), but as having one coherent pitch track. The problem of mapping a sound signal from the time-frequency domain to a "time-pitch" domain has turned out to be hard, especially for polyphonic signals where several sound sources are active at the same time. In this talk, I argue for a combined approach to finding such a mapping: on one hand, utilizing psychoacoustic knowledge to identify a general model structure that is sufficiently close to the global optimum to solve the problem for practical purposes, and on the other hand, using data-driven parameter learning to find the numerical parameter values within such a generic model. I will also discuss how the time differential of such a time-pitch representation can effectively model the acoustic cues [Bregman1990] that promote the fusion of spectral components into the same sound source.
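To make the time-frequency → time-pitch idea concrete, here is a minimal sketch (not the speaker's model) of one classic building block for such mappings: a harmonic-summation pitch salience function, which scores each candidate fundamental frequency by summing spectral magnitude at its harmonic multiples. The function name, candidate grid, and 1/h harmonic weighting are illustrative assumptions, not details from the talk.

```python
import numpy as np

def pitch_salience(spectrum, sr, n_fft, f0_candidates, n_harmonics=10):
    """Toy harmonic-summation salience: score each candidate F0 by
    summing magnitude-spectrum values at its harmonic multiples.
    (Illustrative sketch only; real systems refine the weighting
    and search, and learn parameters from data.)"""
    nyquist = sr / 2.0
    salience = np.zeros(len(f0_candidates))
    for i, f0 in enumerate(f0_candidates):
        for h in range(1, n_harmonics + 1):
            f = h * f0
            if f > nyquist:
                break
            bin_idx = int(round(f * n_fft / sr))
            # 1/h weighting: a common heuristic that de-emphasizes
            # higher harmonics and discourages subharmonic errors
            salience[i] += spectrum[bin_idx] / h
    return salience

# Usage: a synthetic harmonic tone at 220 Hz
sr, n_fft = 16000, 4096
t = np.arange(n_fft) / sr
signal = sum(np.sin(2 * np.pi * 220 * h * t) / h for h in range(1, 6))
spectrum = np.abs(np.fft.rfft(signal * np.hanning(n_fft)))
candidates = np.arange(100.0, 400.0, 5.0)
best = candidates[np.argmax(pitch_salience(spectrum, sr, n_fft, candidates))]
```

Computing such a salience map for every analysis frame yields a time-pitch representation; the hard part, as the abstract notes, is doing this reliably when several sources sound at once.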
 
 
Bio
Anssi Klapuri received his Ph.D. degree from Tampere University of Technology (TUT), Tampere, Finland. He was a visiting post-doctoral researcher at Ecole Centrale de Lille, France, and Cambridge University, UK, in 2005 and 2006, respectively. He worked until 2009 as a professor (pro tem) at TUT. In 2009 he joined Queen Mary, University of London as a lecturer in Sound and Music Processing. In September 2011 he joined Ovelin Ltd to develop game-based musical instrument learning applications, while continuing part-time at Queen Mary, University of London. His research interests include audio signal processing, auditory modeling, and machine learning.
 


FREE
Open to the Public