Hannes Muesch - Speech Intelligibility
Date:
Fri, 03/10/2023 - 10:30am - 12:00pm
Location:
CCRMA Seminar Room
Event Type:
Hearing Seminar 
Going all the way back to the start of the telephone network, people have been interested in what information is critical for conveying a speech signal. Harvey Fletcher (hi Jont) studied the perception of speech as a function of bandwidth. This led to his Articulation Index, which, as Poppy Crum recently suggested at CCRMA, limited high frequencies and disadvantaged women's voices in communications. The idea of a model that predicts speech intelligibility was formalized as a standard in the 1990s, known as the Speech Intelligibility Index (SII). Chas Pavlovic was one of its developers, and he will be in attendance. Most recently, our speaker this week, Hannes Muesch, developed the speech recognition sensitivity model, which uses statistical decision theory to model how well people can perceive speech.
This represents a full century of modeling our ability to correctly hear speech. Our discussion will be much shorter, but still highly intelligible.
Who: Hannes Muesch (Dolby)
What: Speech Intelligibility
When: Fri, 03/10/2023 from 10:30am - 12:00pm
Where: CCRMA Seminar Room
Why: Because listening to and understanding speech is really important
We have a tradition of high-quality discussions at the Hearing Seminar. How do we talk about and measure the quality of a speech signal? See you at CCRMA!
- Malcolm
In this meeting we will review the oldest class of speech intelligibility models: those that make predictions from only the power spectra of speech and maskers. These models are very simple by modern standards, but they have had, and continue to have, a substantial impact on the design of communication equipment and on hearing health care. We will review the Articulation Index in its various incarnations and discuss a now 20-year-old model that accounts for intelligibility test results that the Articulation Index models cannot explain.
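For readers who want to see the core mechanic shared by this class of models, here is a minimal sketch in Python. It is only an illustration, not the ANSI procedure: the band levels and importance weights are made-up placeholders, and the standard's band definitions, level-distortion terms, and tabulated weights are considerably more elaborate. The idea is simply that each band's speech-to-noise ratio is mapped to an audibility value between 0 and 1, and a band-importance-weighted sum of those audibilities gives the predicted index.

    import numpy as np

    # Minimal sketch of the band-audibility idea behind the Articulation Index /
    # Speech Intelligibility Index family of models. All numbers below are
    # illustrative placeholders, not values from the ANSI S3.5 standard.

    def band_audibility(speech_db, noise_db, dynamic_range_db=30.0, offset_db=15.0):
        """Map per-band speech-to-noise ratios to audibility values in [0, 1]."""
        snr_db = np.asarray(speech_db) - np.asarray(noise_db)
        return np.clip((snr_db + offset_db) / dynamic_range_db, 0.0, 1.0)

    def intelligibility_index(speech_db, noise_db, band_importance):
        """Importance-weighted sum of band audibilities (weights sum to 1)."""
        return float(np.dot(band_importance, band_audibility(speech_db, noise_db)))

    # Four hypothetical frequency bands: long-term speech levels, masker levels,
    # and band-importance weights.
    speech = [65.0, 60.0, 55.0, 50.0]
    noise = [55.0, 58.0, 60.0, 40.0]
    weights = [0.2, 0.3, 0.3, 0.2]

    print(intelligibility_index(speech, noise, weights))  # prints a value near 0.6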
Dr. Muesch introduces a new model that predicts speech intelligibility based on statistical decision theory. This model, which he calls the speech recognition sensitivity (SRS) model, aims to predict speech-recognition performance from the long-term average speech spectrum, the masking excitation in the listener's ear, the linguistic entropy of the speech material, and the number of response alternatives available to the listener. A major difference between the SRS model and other models with similar aims, such as the Articulation Index, is its ability to account for synergistic and redundant interactions among spectral bands of speech. In the SRS model, linguistic entropy affects intelligibility by modifying the listener's identification sensitivity to the speech. The effect of the number of response alternatives on the test score is a direct consequence of the model structure. The SRS model also appears to predict the differential effect of linguistic entropy across filter conditions and the interaction among linguistic entropy, signal-to-noise ratio, and language proficiency.
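As a small, self-contained illustration of the decision-theoretic ingredient (an illustration only, not the SRS model itself), the Python sketch below computes the probability of a correct response in an m-alternative identification task for an assumed sensitivity index d'. It shows how, in a statistical-decision framework, the number of response alternatives falls directly out of the model structure rather than requiring a separate correction. All names and numbers here are hypothetical.

    import numpy as np
    from scipy.stats import norm

    # Illustrative signal-detection calculation (not the SRS model itself):
    # in an m-alternative task, the response is correct when the observation
    # drawn from the target (mean d', unit variance) exceeds the m-1
    # observations drawn from the non-targets (mean 0, unit variance).
    def percent_correct(d_prime, n_alternatives):
        x = np.linspace(-8.0, 8.0, 4001)
        dx = x[1] - x[0]
        integrand = norm.pdf(x - d_prime) * norm.cdf(x) ** (n_alternatives - 1)
        return float(np.sum(integrand) * dx)

    # The same sensitivity yields very different scores as alternatives increase.
    for m in (2, 4, 10, 50):
        print(m, round(percent_correct(d_prime=1.5, n_alternatives=m), 3))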
Bio: Hannes Muesch earned his doctorate with Soren Buus and Mary Florentine at Northeastern University, working on loudness perception and speech intelligibility. That work resulted in the SRS intelligibility model, which will be discussed in this presentation. He has worked at both GN ReSound and Sound ID and is now with Dolby.
See, for example, this paper.
FREE
Open to the Public