Malcolm Slaney on Pitch Change Recognition

Fri, 01/24/2014 - 11:00am - 12:30pm
CCRMA Seminar Room
Event Type: Hearing Seminar
There are a couple of recent studies that endeavor to understand how to measure pitch changes, *without* measuring pitch! I would like to summarize these ideas and discuss their implications for perception.

Pitch changes are important for prosody and, in some languages, for word meaning. Singers care about pitch, but most people speak and perceive speech without conscious awareness of the pitch. Yet this signal is important. In most languages it carries much of the non-semantic information in the speech signal, and tone languages use pitch to distinguish different words. It therefore seems important to measure pitch.

The two studies I have been involved in take different approaches to avoiding an explicit pitch measurement. Last year my colleagues and I at Microsoft Research looked at measuring pitch changes directly, without first measuring the pitch. This turns out to be a more robust signal than the basic pitch. More recently, friends at the LDC (Linguistic Data Consortium) have studied whether we can classify the tone of an utterance using the MFCC representation, which is “known” to throw away the pitch information. They were successful.

How can you measure pitch changes and tone without measuring pitch? Come to CCRMA to find out more!


BSEE, MSEE, and Ph.D., Purdue University. Dr. Malcolm Slaney is a principal scientist at Microsoft Research (Silicon Valley). He is a Consulting Professor at Stanford CCRMA, where he has led the Hearing Seminar for more than 20 years, and an Affiliate Faculty in the Electrical Engineering Department at the University of Washington. He is a Fellow of the IEEE and (former) Associate Editor of IEEE Transactions on Audio, Speech and Signal Processing and IEEE Multimedia Magazine. He has given successful tutorials at ICASSP 1996 and 2009 on “Applications of Psychoacoustics to Signal Processing,” on “Multimedia Information Retrieval” at SIGIR and ICASSP, and on “Web-Scale Multimedia Data” at ACM Multimedia 2010. He is a coauthor, with A. C. Kak, of the IEEE book “Principles of Computerized Tomographic Imaging”. This book was republished by SIAM in their “Classics in Applied Mathematics” Series. He is coeditor, with Steven Greenberg, of the book Computational Models of Auditory Function. Before joining Microsoft Research, Dr. Slaney worked at Bell Laboratory, Schlumberger Palo Alto Research, Apple Computer, Interval Research, IBM’s Almaden Research Center, and Yahoo! Research. For many years, he has led the auditory group at the Telluride Neuromorphic (Cognition) Workshop. Dr. Slaney’s recent work is on understanding conversational speech in addition to general audio perception.

Open to the Public