Regaip "Rego" Sen

Regaip "Rego" Sen - 220c Final Project:
Procedure for HRTF Individualization by Stimuli Orientation Response (PHISOR)

My project calibrates head-related transfer functions (HRTFs) for individual listeners by playing spatialized test tones in the following procedure (a good analogy is an eye exam for finding one's prescription):

STEP 1: The listener hears a sound stimulus projected in a specific direction (e.g. behind or below the listener) using an HRTF. The HRTF is then slightly modified and the stimulus is played a second time using the modified HRTF.

STEP 2: The listener is asked which stimulus sounds farther / more clearly in the intended direction (e.g. "which tone sounds farther behind you?") and responds by selecting either tone or stating that he/she can't tell.

STEP 3: Steps 1 and 2 are repeated for several iterations, with the parameters tuned more finely to the individual based on his/her responses.

STEP 4: Steps 1-3 are repeated for both axes: front/back and above/below. I think the default parameters for the left/right axis are good enough for all ears.

CONTRIBUTIONS:

I have built a program to filter monophonic sounds using a parametric HRTF described in this paper by Brown and Duda. The program is runnable from a Linux terminal on the CCRMA network (go to /usr/ccrma/snd/regosen/HRTF). You may also download and expand the following tar file. The program is built in C++ and uses STK libraries.

Here is the code for spatialization.
Here is the code for calibration.

The inputs to the filter are quarter-second bursts of white noise, generated by the following MATLAB code:

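fs = 44100;                          % sample rate (Hz)
bits = 16;                           % bit depth of the output files
for i = 1:9
    t = rand(1, floor(fs/4)) - 0.5;  % quarter second of uniform white noise in (-0.5, 0.5)
    tname = ['t' num2str(i) '.wav']; % t1.wav ... t9.wav
    wavwrite(t, fs, bits, tname);    % older MATLAB call; newer releases provide audiowrite instead
end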

Here is an mp3 of a spatialized excerpt of "Raw Silk Suite" by the Indian Fusion Jazz trio Ambika. The sax is spatialized above me, the bass is toward the back and the drum set surrounds my head; the spatialization used an HRTF calibrated to my ears, so it may not work as well for you.

HOW IT WORKS:

The structural model of the Brown/Duda HRTF is shown below (taken from their paper). The model also included delay lines for the shoulder echo, but they proved inconsequential to spatialization.

The first block for each ear in the diagram represents a transfer function for the head shadow. The Roman lowercase "a" represents head radius.
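
As an illustration only (this is not the program's actual code), the head-shadow block can be sketched as a one-pole, one-zero filter. The sketch assumes the form usually quoted from the Brown and Duda paper, H(w, theta) = (1 + j*alpha(theta)*w/(2*w0)) / (1 + j*w/(2*w0)) with w0 = c/a, together with the constants alpha_min = 0.1 and theta_min = 150 degrees; these should be checked against the paper. The function names and the default speed of sound are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

const double kPi = 3.14159265358979323846;

// alpha(theta): sets the zero of the assumed head-shadow filter.
// thetaRad is the angle (radians) between the source direction and the ear's axis.
// alphaMin and thetaMinDeg are assumed constants, not values read from the program.
double shadowAlpha(double thetaRad, double alphaMin = 0.1, double thetaMinDeg = 150.0) {
    const double thetaDeg = thetaRad * 180.0 / kPi;
    return (1.0 + alphaMin / 2.0)
         + (1.0 - alphaMin / 2.0) * std::cos(thetaDeg / thetaMinDeg * kPi);
}

// Analog frequency response of the assumed filter at omega (rad/s).
// a is the head radius in metres, c the speed of sound in m/s.
std::complex<double> headShadowResponse(double omega, double thetaRad,
                                        double a, double c = 343.0) {
    assert(a > 0.0 && c > 0.0);
    const double w0 = c / a;               // corner frequency set by the head radius
    const double x = omega / (2.0 * w0);
    const std::complex<double> num(1.0, shadowAlpha(thetaRad) * x);
    const std::complex<double> den(1.0, x);
    return num / den;
}
```

Under these assumptions the gain is 1 at DC for every angle, while the high-frequency gain approaches alpha(theta): about 2 for a source facing the ear and 0.1 at 150 degrees.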

The next block represents delay for the head shadow:
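
A minimal sketch of that delay, again assuming the form usually quoted from the Brown and Duda paper: -(a/c)*cos(theta) when |theta| is below 90 degrees, and (a/c)*(|theta| - pi/2) otherwise. The function names, the default speed of sound, and the a/c offset used to keep the delay non-negative are illustrative choices, not details taken from the program.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Head-shadow delay in seconds, relative to arrival at the head centre.
// thetaRad: angle between the source direction and the ear's axis (radians).
// a: head radius in metres; c: speed of sound in m/s.
double headShadowDelay(double thetaRad, double a, double c = 343.0) {
    assert(a > 0.0 && c > 0.0);
    const double t = std::fabs(thetaRad);
    if (t < kPi / 2.0) {
        return -(a / c) * std::cos(t);     // ear on the near side: sound arrives early
    }
    return (a / c) * (t - kPi / 2.0);      // far side: sound travels around the head
}

// The same delay shifted by a/c so it is never negative, expressed in samples.
double causalDelaySamples(double thetaRad, double a, double fs, double c = 343.0) {
    return (headShadowDelay(thetaRad, a, c) + a / c) * fs;
}
```

Because the delay scales with a/c, changing the head radius stretches or shrinks the interaural time difference for every direction at once.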

The set of parallel gains and delays represents the effect of the pinnae. Brown and Duda found that only six lines are needed for each pinna, with the A, B, and D coefficients found experimentally and listed in the paper.
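
The parallel structure itself can be sketched as a sum of delayed, scaled copies of the input. This is a simplified illustration, not the program's code: the PinnaTap type and applyPinnaTaps function are hypothetical, delays are rounded to whole samples (a real implementation would likely interpolate fractional delays), and the mapping from the A, B, and D coefficients to each delay is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One line of the pinna model: a gain and a delay in whole samples.
struct PinnaTap {
    double gain;
    std::size_t delaySamples;
};

// Sum the delayed, scaled copies of x produced by each tap.
// The output is longer than the input by the largest tap delay.
std::vector<double> applyPinnaTaps(const std::vector<double>& x,
                                   const std::vector<PinnaTap>& taps) {
    assert(!taps.empty());
    std::size_t maxDelay = 0;
    for (const PinnaTap& tap : taps) {
        if (tap.delaySamples > maxDelay) maxDelay = tap.delaySamples;
    }
    std::vector<double> y(x.size() + maxDelay, 0.0);
    for (const PinnaTap& tap : taps) {
        for (std::size_t n = 0; n < x.size(); ++n) {
            y[n + tap.delaySamples] += tap.gain * x[n];
        }
    }
    return y;
}
```

Feeding a unit impulse through taps {1, 0}, {0.5, 2}, {-1, 3} gives the impulse response 1, 0, 0.5, -1, which is the comb-like pattern the pinna lines contribute.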

The D coefficients were the only values that varied significantly from person to person, so these are what I varied for most of the HRTF calibration procedure (the head radius was also varied at one stage of the process). The calibration begins with two HRTFs, each using measured coefficients from the Brown and Duda study. When the listener favors one of the two samples, the D coefficients of the other HRTF are re-centered on those of the favored HRTF and then perturbed with a variance that decreases with each successive round.
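
One round of this update might look like the sketch below. It is an assumption-laden illustration rather than the calibration code itself: the Gaussian perturbation, the geometric decay of the standard deviation, the default values of sigma0 and decay, the choice to leave both candidates untouched on a "can't tell" response, and all names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

enum class Response { First, Second, CannotTell };

// Standard deviation used in a given round (assumed geometric decay).
double roundSigma(double sigma0, double decay, int round) {
    assert(sigma0 > 0.0 && decay > 0.0 && decay < 1.0 && round >= 0);
    return sigma0 * std::pow(decay, round);
}

// Re-centre the less favoured D coefficients on the favoured ones, then perturb them.
void recenterAndPerturb(const std::vector<double>& favored, std::vector<double>& other,
                        double sigma, std::mt19937& rng) {
    assert(favored.size() == other.size());
    std::normal_distribution<double> noise(0.0, sigma);
    for (std::size_t i = 0; i < favored.size(); ++i) {
        other[i] = favored[i] + noise(rng);
    }
}

// Apply one listener response to the two candidate coefficient sets.
void calibrationRound(std::vector<double>& dFirst, std::vector<double>& dSecond,
                      Response response, int round, std::mt19937& rng,
                      double sigma0 = 0.2, double decay = 0.7) {
    const double sigma = roundSigma(sigma0, decay, round);
    if (response == Response::First) {
        recenterAndPerturb(dFirst, dSecond, sigma, rng);
    } else if (response == Response::Second) {
        recenterAndPerturb(dSecond, dFirst, sigma, rng);
    }
    // CannotTell: both candidates are kept as they are (an assumption).
}
```

Shrinking the perturbation each round means early rounds explore widely while later rounds only make small adjustments around the listener's preferred values.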

PROGRAM INSTRUCTIONS (taken from my README file):

STEP 1. How to get your own calibrated HRTF profile
STEP 2. How to hear demos using your profile
	i. Azimuth demo
	ii. Elevation demo
	iii. Ambika demos (using MATLAB)
STEP 3. How to spatialize your own samples


STEP 1. How to get your own calibrated HRTF profile:
----------------------------------------------------

At the terminal, type

> make-hrtf

and follow the instructions.  You will be prompted for 
a name, which I will refer to as USERNAME for the rest 
of the procedures that follow.  Then you will hear 
pairs of noise bursts spatialized behind you, followed 
by pairs spatialized above you.  Once the calibration 
program is finished, you will find your HRTF profile 
in the profiles/ subdirectory.



STEP 2. How to hear demos using your profile:
---------------------------------------------

i. Azimuth demo: type the following at the prompt:

 > azimuth USERNAME

You will hear 18 bursts of white noise.  The first one 
is spatialized in front of you, the second will be 20 
degrees to the right, and so on until the bursts have 
traversed a full circle.


ii. Elevation demo: type the following at the prompt:

 > elevation USERNAME

Again, you will hear 18 bursts of white noise.  The 
first one is spatialized in front of you, the second 
will be 20 degrees below, and so on until the bursts 
have traversed a full circle.  DISCLAIMER: The 
formulae used in this HRTF came from a paper which 
admits to poor spatialization in the vertical axis.


iii. Ambika demos: this directory contains two MATLAB 
functions that will create spatialized mixes of jazz 
trio excerpts.  (Thanks to Harvey Thornburg for his 
expertise in MATLAB.)  The mixes will be stereo .WAV 
files placed in the output/ subdirectory.  To read 
more information about the functions (and how to run 
them) type the following at the MATLAB prompt:

 help hrtfDemoShort
 help hrtfDemoLong

To run the functions, type:

 hrtfDemoShort('USERNAME')
 hrtfDemoLong('USERNAME')

These demos take several minutes to process, so you 
might want to run the short one if you don't have a 
lot of time.  Also, try to avoid running these on 
slow computers.


STEP 3. How to spatialize your own samples:
-------------------------------------------

In this directory is the executable file "hrtf", 
which should be run with the following syntax:

> hrtf INPUT OUTPUT DURATION AZIMUTH ELEVATION PROFILE

INPUT is a mono WAV file to be spatialized
OUTPUT is a stereo WAV file to be written
DURATION is the desired length of the output
    (good for creating excerpts from long pieces)
AZIMUTH in degrees, ranging from -90 (L) to 90 (R)
ELEVATION in degrees, coordinated as follows:
    +/- 180 = back
        -90 = below
          0 = front
         90 = above
PROFILE is your .hrtf file

Here is an example of the syntax:
> hrtf in.wav /zap/out.wav 1.5 -20 60 rego.hrtf
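
To place a sound on a horizontal circle around the head (as the azimuth demo does), an ordinary compass-style angle has to be converted into the AZIMUTH and ELEVATION arguments above. The sketch below assumes that ELEVATION rotates about the ear-to-ear axis, which is how the coordinate list reads but is not confirmed by the program's source; the type and function names are hypothetical, and the helper only builds the command string without running anything.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct HrtfDirection {
    int azimuth;    // degrees, -90 (L) to 90 (R)
    int elevation;  // degrees, 0 = front, +/-180 = back
};

// Convert an angle measured clockwise from straight ahead (0..359 after wrapping)
// into the AZIMUTH/ELEVATION pair for a source in the horizontal plane.
HrtfDirection fromHorizontalAngle(int clockwiseDeg) {
    const int psi = ((clockwiseDeg % 360) + 360) % 360;
    if (psi <= 90) return HrtfDirection{psi, 0};          // front right
    if (psi < 270) return HrtfDirection{180 - psi, 180};  // rear half
    return HrtfDirection{psi - 360, 0};                   // front left
}

// Build the command line in the documented argument order.
std::string hrtfCommand(const std::string& input, const std::string& output,
                        double duration, HrtfDirection dir, const std::string& profile) {
    assert(dir.azimuth >= -90 && dir.azimuth <= 90);
    std::ostringstream cmd;
    cmd << "hrtf " << input << ' ' << output << ' ' << duration << ' '
        << dir.azimuth << ' ' << dir.elevation << ' ' << profile;
    return cmd.str();
}
```

For example, 100 degrees clockwise from the front maps to AZIMUTH 80 with ELEVATION 180, and 340 degrees maps to AZIMUTH -20 with ELEVATION 0.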

Other approaches/sources considered:

I first considered making use of the HRTF measurements provided by Bill Gardner and Keith Martin using an averaged model of the human ear.
Here are two relevant spatialization articles by Jean-Marc Jot: 1 2
Here is a link to the pdf description of the CIPIC HRTF database.
Here is a thesis by Corey I Cheng (UMich) on using visualization techniques to analyze and process HRTF data.
I was very excited to find this abstract by Eric Durant (UMich) on generating IIR approximations of HRTFs from measured impulse responses. Unfortunately, I don't think it evolved any further than that. I will email him about this.
This hearing test contains the most convincing examples of 3D audio I've heard thus far.
