Your Attention Please: Three important aspects of auditory attention
At the next CCRMA Hearing Seminar, I'd like to talk about recent work on auditory attention. As many of you know, I'm very interested in how our brains process sounds in complicated environments. Our ability to understand speech in a noisy cacophony is known as the Cocktail Party Effect. We still don't have good models of how our brains do this task, but an important component of the process is certainly attention.
I'd like to talk about three important aspects of attention that might help explain how attention mediates our ability to solve the cocktail party problem. 1) Most fundamental is a measure of auditory saliency: how can we build more realistic models of what makes a sound pop out of the background? 2) How can we decide to which sound a subject is attending? I'd like to talk about some recent work at the Telluride Neuromorphic Cognition Engineering Workshop to build a real-time auditory attention decoder using EEG signals. 3) How do we model the entire feedback loop, connecting exogenous and endogenous attention into a working system? These are all critical issues for understanding the cocktail party problem.
Who: Malcolm Slaney (Microsoft Research - Conversational Systems Research Center) and CCRMA
What: Auditory Attention Studies to Model the Cocktail Party Effect
When: Friday November 30 at 1:15PM
Where: CCRMA Seminar Room - Top Floor of the Knoll at Stanford
Why: Because attention is critical to understanding a cacophony of sounds
You bring your favorite ears, and I'll provide the exogenous stimulus to keep your attention! See you at CCRMA.
- Malcolm
P.S. On Dec. 7, Nils Peters from ICSI in Berkeley will be talking about his work to identify the characteristics of a room from an audio recording.
Your Attention Please: Three studies of attention for modeling the cocktail party effect
By Malcolm Slaney
Microsoft Research - Conversational Systems Research Center
Stanford CCRMA (Consulting) Professor
Attention is a critical process for understanding the complicated audio environments in which we operate. Multiple sounds compete for our attention, and somehow our auditory system picks out the necessary sounds so that we can understand at least one of the speakers. I'd like to talk about three aspects of attention we've been studying: 1) how to model the saliency of rich sounds, 2) how to model the top-down and bottom-up attentional signals, and 3) how we can use EEG signals to capture information about which sound a subject is attending to. The last piece of work was done this past summer at the Telluride Neuromorphic Cognition workshop, where we demonstrated real-time auditory attention decoding for the first time. A subject listened through headphones to two different speakers reading stories. The subject picked one story to attend to, and based on just 60 seconds of EEG signals we could determine the attended speaker with 90% accuracy.
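To give a flavor of how such a decoder can work (this is a toy sketch of the general stimulus-reconstruction idea, not the actual Telluride pipeline; the signals, channel counts, and noise levels here are all invented for illustration): a linear decoder is trained to reconstruct a speech envelope from multichannel EEG, and attention is then read out by asking which speaker's envelope correlates better with the reconstruction.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 64            # envelope sample rate in Hz (an assumed, typical choice)
n = 60 * fs        # 60 seconds of data, matching the decoding window above

# Two competing speech envelopes (synthetic stand-ins for the two stories).
env_a = np.abs(rng.standard_normal(n))
env_b = np.abs(rng.standard_normal(n))

# Simulated EEG: 8 channels, each a noisy projection of the attended
# stream (speaker A for this simulated listener).
mix = rng.standard_normal((8, 1))
eeg = mix * env_a + 0.3 * rng.standard_normal((8, n))

# Train a linear stimulus-reconstruction decoder: a least-squares mapping
# from the EEG channels back to the attended envelope. In a real
# experiment this is fit on trials where the attended speaker is known,
# then applied to new data.
W, *_ = np.linalg.lstsq(eeg.T, env_a, rcond=None)
recon = eeg.T @ W

def corr(x, y):
    """Pearson correlation between two 1-D signals."""
    return np.corrcoef(x, y)[0, 1]

# Decode attention: correlate the reconstruction with each candidate
# envelope and pick the better match.
attended = "A" if corr(recon, env_a) > corr(recon, env_b) else "B"
print(attended)  # "A" for this simulated listener
```

Real decoders add time-lagged regressors, regularization, and per-subject training, but the decision rule at the end — compare correlations over a window of a minute or so — is the same shape as the demo described above.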
Biography
Malcolm Slaney (Fellow, IEEE) is a Principal Scientist in Microsoft Research’s Conversational Systems Research Center in Mountain View, CA. Before that he held the same title at Yahoo! Research, where he worked on multimedia analysis and music- and image-retrieval algorithms in databases with billions of items. He is also a (consulting) Professor at Stanford University’s Center for Computer Research in Music and Acoustics (CCRMA), Stanford, CA, where he has led the Hearing Seminar for the last 20 years. Before Yahoo!, he worked at Bell Laboratories, Schlumberger Palo Alto Research, Apple Computer, Interval Research, and IBM’s Almaden Research Center. For the last several years he has helped lead the auditory and attention groups at the NSF-sponsored Telluride Neuromorphic Cognition Workshop. He is a coauthor, with A. C. Kak, of the IEEE book Principles of Computerized Tomographic Imaging, which was republished by SIAM in their Classics in Applied Mathematics series. He is coeditor, with S. Greenberg, of the book Computational Models of Auditory Function. Prof. Slaney has served as an Associate Editor of the IEEE TRANSACTIONS ON AUDIO, SPEECH, AND SIGNAL PROCESSING, IEEE MULTIMEDIA MAGAZINE, the PROCEEDINGS OF THE IEEE, and the ACM Transactions on Multimedia Computing, Communications, and Applications.