Tim Hankins

Spatial Interpolation Using Binaural Convolution and Amplitude Modulation






We know that impulse responses can be used to quantify acoustic environments, but is the process also applicable to binaural systems?
Background:
An impulse can be approximated by a very short burst of sound, like that made by a cap gun. Ideally, the impulse should contain every frequency within the audible range, all at equal amplitude. This condition can be thought of as a sort of "control group", in the sense that we know exactly what defines the characteristics of the sound.
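The "every frequency at equal amplitude" property can be checked numerically. The following is a minimal sketch, not part of the original project: it takes the discrete Fourier transform of an idealized one-sample impulse (the 16-sample length and the helper name dft_magnitudes are arbitrary choices for illustration) and confirms that every frequency bin has the same magnitude.

```python
import cmath

def dft_magnitudes(signal):
    """Return the magnitude of each DFT bin of a real-valued sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]

# An idealized discrete impulse: one sample of 1.0 followed by silence.
impulse = [1.0] + [0.0] * 15
magnitudes = dft_magnitudes(impulse)

# Every bin has magnitude 1: the impulse excites all frequencies equally.
is_flat = all(abs(m - 1.0) < 1e-9 for m in magnitudes)
print(is_flat)
```

A real impulse source such as a cap gun or a pair of clappers only approximates this flat spectrum, which is one reason the choice of impulse generator matters.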

If we then use the impulse to excite an acoustic environment, the recorded response describes in detail the amplitude and phase of every reflection that the room imparts to the impulse. In other words, the resulting sound provides a sort of temporal/spectral fingerprint of the room's reverberation.

Once the room's characteristics are captured as a wav file, we can use it to affect dry sounds through the process of convolution - seen in HW #7.
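To make the convolution step concrete, here is a minimal sketch of direct (time-domain) convolution in plain Python. It is an illustration only, not the routine SND uses, and the tiny "room" response (direct sound plus a single half-amplitude echo two samples later) is a made-up example rather than measured data.

```python
def convolve(dry, impulse_response):
    """Direct convolution: every dry sample triggers a scaled copy of the response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out

dry = [1.0, 0.5, 0.25]    # hypothetical dry source samples
room = [1.0, 0.0, 0.5]    # hypothetical response: direct sound + one echo
wet = convolve(dry, room)
print(wet)
```

The output is longer than the dry input by the length of the response minus one sample, which is the reverberant tail added by the room.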

Binaural recording is the process of placing very small microphones near - or within - the ears. Doing this allows the head-related transfer function (HRTF) of the recordist to filter the ambient sound before it is recorded. The filtering occurs because the head, shoulders, and ears obstruct - or otherwise change - the sound that reaches each microphone. Once recorded, binaural sounds can be listened to on headphones and give a very lifelike reproduction of the spatial cues that originally existed.
Implementation:
I made a pair of binaural microphones using the techniques seen earlier in the semester, except that I built two mics and mounted them on an old pair of glasses. The impulses were generated using two wooden clappers and were recorded on a portable MiniDisc recorder.

A total of five impulses were taken in the hallway downstairs. Carr Wilkerson - who was kind enough to act as the recording engineer - stood in the center of the hall, oriented such that the hallway stretched out on either side, L & R. Two impulse responses were taken from each side - one at about 40 feet and one at about 20 feet - and one was taken directly in front of the listening position.

These files were then dumped into SND and edited. The editing simply involved removing unnecessary silences and noise. The following images show the time and frequency domain responses of each of the five impulses.

[Images: Far Left | Near Left | Center | Near Right | Far Right]

Once edited, I used the "convolve-with" function within SND to perform the convolution of my source file with each of the five impulse responses. The SCM file can be seen below.

; Convolve each channel of the source with the matching channel of each
; binaural impulse response. The second argument to convolve-with (0.4)
; scales the amplitude of the result.
; The file names appear to encode side (L/R) and distance
; (presumably F = far, C = close).

; Left side, F impulse
(open-sound "editedMandolin_leftCH.wav")
(convolve-with "Hallway_L_F_leftCH.wav" 0.4)

(open-sound "editedMandolin_rightCH.wav")
(convolve-with "Hallway_L_F_rightCH.wav" 0.4)

;-----------------------------------------------------------

; Left side, C impulse
(open-sound "editedMandolin_leftCH.wav")
(convolve-with "Hallway_L_C_leftCH.wav" 0.4)

(open-sound "editedMandolin_rightCH.wav")
(convolve-with "Hallway_L_C_rightCH.wav" 0.4)

;-----------------------------------------------------------

; Right side, F impulse
(open-sound "editedMandolin_leftCH.wav")
(convolve-with "Hallway_R_F_leftCH.wav" 0.4)

(open-sound "editedMandolin_rightCH.wav")
(convolve-with "Hallway_R_F_rightCH.wav" 0.4)

;-----------------------------------------------------------

; Right side, C impulse
(open-sound "editedMandolin_leftCH.wav")
(convolve-with "Hallway_R_C_leftCH.wav" 0.4)

(open-sound "editedMandolin_rightCH.wav")
(convolve-with "Hallway_R_C_rightCH.wav" 0.4)

;-----------------------------------------------------------

; Center impulse
(open-sound "editedMandolin_leftCH.wav")
(convolve-with "Hallway_Center_leftCH.wav" 0.4)

(open-sound "editedMandolin_rightCH.wav")
(convolve-with "Hallway_Center_rightCH.wav" 0.4)
	       
Below you can see the result of the convolution of the five distinct impulses with the source file:
[Images: Far Left | Near Left | Center | Near Right | Far Right]


Results:
Though not absolutely convincing, the sense of localization created by the process of binaural convolution is quite good, provided that the playback is on headphones.

Although each of the five sounds exhibits a good sense of spatial location, each location is static, i.e. it doesn't move. That made me wonder whether it would be possible to create a convincing sensation of movement by somehow interpolating between the five impulse responses. To test the idea, I used SND's "envelope" generator to crossfade between the files. This simply means aligning the five files in time and then smoothly fading one file into the next. The crossfaded files can be seen below.
[Images: Far Left | Near Left | Center | Near Right | Far Right]



The five crossfaded files are then mixed together using SND's "mix" function. The resulting file gives a good illusion of the source moving from left to right.
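The crossfade-and-mix step can be sketched outside of SND as follows. This is an illustration under stated assumptions, not the envelopes actually used: it assumes linear (triangular) fades, equal spacing of the five positions in time, and tracks that are already time-aligned and of equal length. The function names crossfade_gains and crossfade_mix are hypothetical.

```python
def crossfade_gains(position, count):
    """Linear crossfade gains for `count` evenly spaced sources.

    `position` runs from 0.0 (first source) to count - 1 (last source).
    Each source has a triangular envelope peaking at its own index, so at
    most two neighbouring sources are audible and the gains sum to 1.
    """
    return [max(0.0, 1.0 - abs(position - i)) for i in range(count)]

def crossfade_mix(tracks):
    """Mix time-aligned, equal-length tracks while sweeping from first to last."""
    count = len(tracks)
    length = len(tracks[0])
    mixed = []
    for n in range(length):
        # Map sample index n onto a position between source 0 and source count - 1.
        position = (count - 1) * n / (length - 1)
        gains = crossfade_gains(position, count)
        mixed.append(sum(g * track[n] for g, track in zip(gains, tracks)))
    return mixed

# Five constant "tracks" stand in for the five convolved files.
tracks = [[float(v)] * 9 for v in (1, 2, 3, 4, 5)]
mixed = crossfade_mix(tracks)
print(mixed)
```

With constant stand-in tracks, the mix ramps smoothly from the first track's value to the last, which shows the amplitude envelopes handing off from one position to the next. A linear fade keeps the summed amplitude constant; other envelope shapes (for example equal-power curves) are possible but were not tested here.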

Assuming that a better impulse generator and recording medium could be found, I think that it would be relatively easy to trace any path in three-dimensional space using impulses. If the sampling interval - the distance between any two impulses along the path - were also shortened, a greater resolution of movement could be obtained. Potentially, a process could be developed where someone might trace out paths with an impulse generator, record the results, then convolve them with a source file using a batch process.
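As a rough idea of what such a batch process might look like, the sketch below generates the SND commands for every position and channel from a list, following the file-naming pattern used in the script above. The helper name batch_script is hypothetical, and the left-to-right ordering of the positions assumes that F means far and C means close in the file names.

```python
# Position labels as they appear in the impulse file names.
# Ordering assumes F = far and C = close (an interpretation of the names).
POSITIONS = ["L_F", "L_C", "Center", "R_C", "R_F"]
CHANNELS = ["left", "right"]

def batch_script(source_stem, positions=POSITIONS, amp=0.4):
    """Build the open-sound / convolve-with command pairs for every position."""
    lines = []
    for position in positions:
        for channel in CHANNELS:
            lines.append(f'(open-sound "{source_stem}_{channel}CH.wav")')
            lines.append(
                f'(convolve-with "Hallway_{position}_{channel}CH.wav" {amp})'
            )
    return "\n".join(lines)

script = batch_script("editedMandolin")
print(script)
```

Adding more impulse positions along a path would then only require extending the list of position labels, provided the recordings follow the same naming scheme.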