Tao Zhang on joint attention decoding and speech enhancement

Date:

Fri, 05/10/2019 - 10:30am - 12:00pm

Location:

CCRMA Seminar Room

Event Type:

Hearing Seminar

Machine learning methods have opened up new frontiers both in understanding our brain and measuring what we perceive, and in enhancing speech by using deep models of speech to remove noise. Tao Zhang will be at the Hearing Seminar on Friday, May 10th to talk about both of these approaches.

Who: Tao Zhang (Starkey Laboratories)
What: A Joint Attention Decoding and Adaptive Beamforming Optimization Approach to the Cocktail Party Problem
When: Friday, May 10th at 10:30AM
Where: CCRMA Seminar Room, Top Floor of the Knoll at Stanford
Why: Cool technologies to let us hear what we want to hear

Our auditory world is complicated, and our earbuds now hold more computing power than we once dreamed possible. Can we use this computing power to improve our auditory experiences? Come to CCRMA to find out more.

Title:
A Joint Attention Decoding and Adaptive Beamforming Optimization Approach to the Cocktail Party Problem

Talk Abstract:
The cocktail party problem has remained one of the most difficult problems for hearing devices even after decades of extensive research. One of the key challenges is to determine the desired talker in a cocktail party. Recently, researchers have successfully demonstrated the decoding of auditory attention using EEG, MEG or EMG (e.g. [1][2][3]). In addition, several research studies have attempted to incorporate the decoded auditory attention information into speech enhancement solutions (e.g. [4][5]). However, the existing solutions are suboptimal in the sense that auditory attention decoding is often performed separately from speech enhancement. In this talk, we propose a joint auditory attention decoding and multi-channel speech enhancement approach. The proposed approach eliminates the need for the speech envelope of each talker, which is itself difficult to obtain in practice. Furthermore, the proposed solution is optimal in the sense that the attended talker’s speech is optimized using both microphone inputs and EEG inputs in a unified framework. We present preliminary results to demonstrate the effectiveness of the algorithm and discuss future research directions.
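For readers unfamiliar with the conventional two-stage pipeline that the abstract contrasts against, the sketch below may help. It is not the method presented in the talk. It assumes a hypothetical upstream decoder has already reconstructed a speech envelope from EEG, and that a clean envelope is available for each talker (the very requirement the proposed approach is said to remove). The function names and toy numbers are illustrative assumptions only.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def decode_attended_talker(eeg_envelope, talker_envelopes):
    """Pick the talker whose envelope best matches the EEG-derived envelope.

    eeg_envelope: envelope reconstructed from EEG (assumed given here).
    talker_envelopes: one clean envelope per candidate talker (assumed given).
    Returns (index of best-matching talker, list of correlation scores).
    """
    scores = [pearson(eeg_envelope, env) for env in talker_envelopes]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy, made-up envelopes purely for illustration (not real data).
talker_a = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
talker_b = [0.8, 0.2, 0.7, 0.1, 0.9, 0.3]
eeg_estimate = [0.2, 0.8, 0.4, 0.7, 0.3, 0.6]

attended, scores = decode_attended_talker(eeg_estimate, [talker_a, talker_b])
```

In such a pipeline, the selected index would then steer a separate speech enhancement stage, which is the separation between decoding and enhancement that the abstract describes as suboptimal.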

Speaker Bio:
Tao Zhang received his B.S. degree in physics from Nanjing University, Nanjing, China in 1986, his M.S. degree in electrical engineering from Peking University, Beijing, China in 1989, and his Ph.D. degree in speech and hearing science from The Ohio State University, Columbus, OH, USA in 1995. He joined the Advanced Research Department at Starkey Laboratories, Inc. as a Sr. Research Scientist in 2001, and managed the DSP department from 2004 to 2008 and the Signal Processing Research Department from 2008 to 2014. Since 2014, he has been Director of the Signal Processing Research department at Starkey Hearing Technologies, a global leader in providing innovative hearing technologies. He has received many prestigious awards at Starkey, including the Inventor of the Year Award, the Mount Rainier Best Research Team Award, the Most Valuable Idea Award, the Outstanding Technical Leadership Award and the Engineering Service Award.

He is a senior member of the IEEE, the Signal Processing Society, and the Engineering in Medicine and Biology Society. He serves on the IEEE AASP Technical Committee, the industrial relationship committee, and the IEEE ComSoc North America Region Board. He is an IEEE SPS Distinguished Industry Speaker and the Chair of the IEEE Twin Cities Signal Processing and Communication Chapter.

His current research interests include audio, acoustic, and speech signal processing and machine learning; multimodal signal processing and machine learning for hearing enhancement and for health and wellness monitoring; psychoacoustics, room acoustics and ear canal acoustics; and ultra-low power real-time embedded system design and device-phone-cloud ecosystem design. He has authored and coauthored 120+ presentations and publications, received 20+ approved patents, and has an additional 30+ patents pending.

FREE

Open to the Public
