Spatial Audio Literature

Steve Shepard
December, 1994
Presented at the CCRMA Hearing Seminar

This document contains brief summaries of the technical references on spatial audio that I've found useful in my research. It is not exhaustive, but it is at least representative of the literature. Most references are drawn from either the Journal of the Acoustical Society of America or the Journal of the Audio Engineering Society.

Localization

Blauert (1969) is an often-referenced paper detailing median plane experiments that showed that interaural intensity difference (IID) and interaural time difference (ITD) cues are not the only cues that influence localization.
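To see why the median plane is the natural place to look for other cues, it helps to estimate the ITD as a function of azimuth. The sketch below uses Woodworth's spherical-head approximation, ITD = (a/c)(theta + sin theta), which is a textbook formula and not something taken from Blauert's paper. The head radius and speed of sound are assumed typical values, and the function name is only illustrative. For a symmetric head the ITD is zero everywhere in the median plane (azimuth 0), so localization there has to rely on something else.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed typical head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def woodworth_itd(azimuth_deg, head_radius=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_S):
    """Approximate ITD in seconds for a distant source.

    azimuth_deg is measured from straight ahead and must lie in [-90, 90].
    The sign of the result follows the sign of the azimuth.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between -90 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius / c) * (theta + math.sin(theta))


if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        itd_us = woodworth_itd(az) * 1e6
        print(f"azimuth {az:3d} deg -> ITD {itd_us:6.1f} microseconds")
```

With these assumed values the ITD grows from zero at azimuth 0 to somewhat over 600 microseconds at 90 degrees.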

Much of the early work on localization is detailed extensively in Blauert (1983).

Spatial Audio Synthesis

Anechoic Environments

Wenzel (1992), Martens (1992), and Kendall & Martens (1984) provide good summaries of results since the publication of Blauert's book. Wenzel focuses on results that derive from Wightman & Kistler's experiments. Kendall & Martens focus on the Northwestern experiments. These experiments are described in more detail in the papers listed below.

Wightman & Kistler (1989a, 1989b) detail spatial synthesis experiments using HRTF data collected from real subjects in anechoic environments. They present data on the subjects' localization performance in both free-field and synthesized (over headphones) environments.

Wenzel et al. (1993) report the results of experiments with non-individual HRTFs. HRTFs from various subjects in the Wightman & Kistler study were used to synthesize spatial cues for different subjects.

Han (1994) analyzes HRTFs recorded from the KEMAR dummy head. He points out the significant elevation and azimuth spectral cues and compares his results with those reported elsewhere in the literature, particularly Blauert (1969).

Gardner & Martin (1994) present the measurement details of HRTF data collected from the KEMAR dummy head. This data is available on the net.

Martens (1987) and Wightman & Kistler (1992) examine the use of principal components analysis of HRTF data. Principal components analysis is a useful technique for smoothing the HRTF data while retaining the salient features.

Natural and Virtual Environments

Hartmann (1983, 1985) addresses the question of how reflections in a room affect localization ability through experiments conducted in the IRCAM hall.

Kendall & Martens (1984, 1986, 1988, 1990) describe their work in creating "spatial reverberation" from acoustic models of rooms. They create spatial cues for the first dozen first- and second-order reflections using the image model.
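As a rough illustration of the image model, the sketch below mirrors a source across the walls of a rectangular room to get the first- and second-order image positions, then keeps the dozen that reach the listener earliest. This is the generic rectangular-room image method, not Kendall & Martens' implementation. The room geometry, the speed of sound, and the function names are assumptions made for the example, and wall absorption is ignored.

```python
import math
from itertools import product

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def image_sources(room, source, max_order=2):
    """Yield (order, position) for image sources of orders 1..max_order.

    room is (Lx, Ly, Lz) with walls at 0 and L on each axis; source is a
    point inside the room. The order is the number of wall reflections.
    """
    indices = range(-max_order, max_order + 1)
    for idx in product(indices, repeat=3):
        order = sum(abs(m) for m in idx)
        if 1 <= order <= max_order:
            # Even index: shifted copy of the source; odd index: mirrored copy.
            position = tuple(
                m * length + (s if m % 2 == 0 else length - s)
                for m, length, s in zip(idx, room, source)
            )
            yield order, position


def early_reflections(room, source, listener, count=12, max_order=2):
    """Return the `count` earliest reflections as (delay_s, order, position)."""
    reflections = []
    for order, position in image_sources(room, source, max_order):
        distance = math.dist(position, listener)
        reflections.append((distance / SPEED_OF_SOUND_M_S, order, position))
    reflections.sort()
    return reflections[:count]


if __name__ == "__main__":
    room = (5.0, 4.0, 3.0)  # hypothetical room, metres
    for delay, order, pos in early_reflections(room, (1.0, 1.0, 1.0), (3.0, 2.0, 1.5)):
        print(f"order {order}  delay {delay * 1000:6.2f} ms  image at {pos}")
```

Each image position gives both a delay and a direction of arrival relative to the listener, and the direction is what a spatial reverberator would use to choose directional cues for that reflection.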

Begault (1992) examines the perceptual effects of spatial reverberation, presenting data on localization ability in the virtual environments described by Kendall and Martens.

Platt (1994) is a 3DO patent on 3D sound synthesis. The approach uses simple components (a notch filter, a simple reverberator, and Doppler shift) to simulate moving virtual sounds.
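Of the components listed, Doppler shift is the easiest to show in a few lines. The sketch below is the textbook relation for a moving source and a stationary listener, not the patent's method. The function name and the speed of sound are assumptions made for the example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def doppler_frequency(source_hz, radial_speed_m_s, c=SPEED_OF_SOUND_M_S):
    """Frequency heard by a stationary listener.

    radial_speed_m_s is positive when the source approaches the listener
    and negative when it recedes.
    """
    if abs(radial_speed_m_s) >= c:
        raise ValueError("source speed must be below the speed of sound")
    return source_hz * c / (c - radial_speed_m_s)


if __name__ == "__main__":
    for speed in (-20.0, 0.0, 20.0):
        print(f"{speed:+6.1f} m/s -> {doppler_frequency(440.0, speed):7.2f} Hz")
```

An approaching source is heard above its emitted frequency and a receding source below it, which is the pitch glide a listener associates with a sound passing by.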

Begault (1991) examines the challenges of successfully implementing spatial audio technology. Types of localization error and potential causes are discussed.

Spatial Audio Reproduction

Cooper & Bauck (1989) and Bauck & Cooper (1992) present methods for cancelling the acoustical crosstalk that occurs when binaural data is played back over speakers. One of the models they use is the spherical head model described in Cooper (1982).
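The core of crosstalk cancellation can be sketched at a single frequency. Assuming a symmetric listening setup, each ear receives the same-side speaker through one transfer function and the opposite speaker through another, which gives a 2x2 matrix. Driving the speakers with the binaural signals multiplied by the inverse of that matrix delivers the intended signal to each ear. This is a minimal sketch of the general idea and not the filter design from these papers. The names and the complex values in the demo are made up for illustration.

```python
import cmath


def crosstalk_canceller(ipsi, contra):
    """Return the 2x2 canceller for a symmetric setup at one frequency.

    ipsi is the complex speaker-to-ear response on the same side and
    contra the response to the opposite ear. The result is the inverse
    of the acoustic matrix ((ipsi, contra), (contra, ipsi)).
    """
    det = ipsi * ipsi - contra * contra
    if abs(det) < 1e-12:
        raise ZeroDivisionError("acoustic matrix is singular at this frequency")
    return ((ipsi / det, -contra / det), (-contra / det, ipsi / det))


def apply_2x2(matrix, left, right):
    """Multiply a 2x2 matrix by the column vector (left, right)."""
    (a, b), (c, d) = matrix
    return a * left + b * right, c * left + d * right


if __name__ == "__main__":
    ipsi = 1.0 + 0.0j                       # made-up same-side response
    contra = 0.4 * cmath.exp(-0.8j)         # made-up attenuated, delayed path
    binaural = (0.3 + 0.1j, -0.7 + 0.2j)    # made-up desired ear signals
    speakers = apply_2x2(crosstalk_canceller(ipsi, contra), *binaural)
    ears = apply_2x2(((ipsi, contra), (contra, ipsi)), *speakers)
    print("desired:", binaural)
    print("at ears:", ears)
```

A full system would repeat this at every frequency, with the two responses taken from a head model or from measurements, and would need to handle frequencies where the matrix is nearly singular.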

Salava (1990) argues in favor of near-field listening when monitoring transaural (crosstalk cancelled) audio.

References

Blauert, J. (1969). "Sound localization in the median plane," Acustica, 22, 205-213.

Blauert, J. (1983). Spatial Hearing: The Psychophysics of Human Sound Localization (MIT Press, Cambridge, MA).

Bauck, J., & Cooper, D.H. (1992). "Generalized transaural stereo," presented at the 93rd Convention of the Audio Engineering Society, San Francisco, October 1992.

Begault, D.R. (1991). "Challenges to the successful implementation of 3-D sound," Journal of the Audio Engineering Society, 39 (11), 864-870.

Begault, D.R. (1992). "Perceptual effects of synthetic reverberation on three-dimensional audio systems," Journal of the Audio Engineering Society, 40 (11), 895-904.

Cooper, D.H. (1982). "Calculator program for head-related transfer functions," Journal of the Audio Engineering Society, 30 (1/2), 34-38.

Cooper, D.H., & Bauck, J.L. (1989). "Prospects for transaural recording," Journal of the Audio Engineering Society, 37 (1/2), 3-19.

Gardner, B., & Martin, K. (1994). "HRTF measurements of a KEMAR dummy-head microphone," MIT Media Lab Perceptual Computing Technical Report #280.

Han, H.L. (1994). "Measuring a dummy head in search of pinna cues," Journal of the Audio Engineering Society, 42 (1/2), 15-37.

Hartmann, W.M. (1983). "Localization of sound in rooms," Journal of the Acoustical Society of America, 74, 1380-1391.

Hartmann, W.M. (1985). "Localization of sound in rooms II: The effects of a single reflecting surface," Journal of the Acoustical Society of America, 78, 524-533.

Jullien, J.P., et al. "Spatializer: a perceptual approach".

Kendall, G.S., Martens, W.L., et al. (1986). "Image model reverberation from recirculating delays," presented at the 81st Convention of the Audio Engineering Society, preprint #2408.

Kendall, G.S., Martens, W.L., & Wilde, M.D. (1990). "A spatial sound processor for loudspeaker and headphone reproduction," AES 8th International Conference.

Kendall, G.S., & Martens, W.L. (1984). "Simulating the cues of spatial hearing in natural environments," ICMC '84 Proceedings, 111-125.

Kendall, G.S., & Martens, W.L. (1988). "Spatial reverberator," U.S. Patent number 4,731,848.

Martens, W.L. (1987). "Principal components analysis and resynthesis of spectral cues to perceived direction," Proceedings of the 1987 International Computer Music Conference, San Francisco, CA, 274-281.

Martens, W.L. (1992). "Demystifying spatial audio," presented at the 3D Media Technology Conference, Montreal.

Platt, D.C. (1994). "Method for generating three dimensional sound," U.S. Patent number 5,337,363.

Wenzel, E. (1992). "Localization in virtual acoustic displays," Presence, 1 (1), 80-107.

Wenzel, E., Arruda, M., Kistler, D.J., & Wightman, F.L. (1993). "Localization using non-individualized head-related transfer functions," Journal of the Acoustical Society of America, 94(1), 111-123.

Wightman, F.L., & Kistler, D.J. (1989a). "Headphone simulation of free-field listening I: Stimulus synthesis," Journal of the Acoustical Society of America, 85(2), 858-867.

Wightman, F.L., & Kistler, D.J. (1989b). "Headphone simulation of free-field listening II: Psychophysical validation," Journal of the Acoustical Society of America, 85(2), 868-878.

Wightman, F.L., & Kistler, D.J. (1992). "A model of HRTFs based on principal component analysis and minimum-phase reconstruction," Journal of the Acoustical Society of America, 91(3), 1637-1647.

Salava, T. (1990). "Transaural stereo and near-field listening," Journal of the Audio Engineering Society, 38, 40-41.