The purpose of this research is to compare Roy Patterson's gammatone filter-bank model of pitch perception with the Meddis-Lyon cochlear model, which combines autocorrelation with automatic gain control. Malcolm Slaney has implemented each model in MATLAB so that, given an input signal, the code outputs a representation of hair cell responses.
Patterson's three experiments, cited in his 1987 article from MRC Applied Psychology (see Appendix B for results of interest), compared auditory perception of waves with modified-phase harmonics to that of waves with zero-phase harmonics. The control stimuli were called Constant-Phase, or CPH. Each set of experiments was conducted using fundamental frequencies of 62.5 Hz, 125 Hz, 250 Hz, and 500 Hz. The three experiments determined whether humans can distinguish CPH from the following phase-modified stimuli:
In JASA's July 2003 issue, Carlyon and Shamma argued that auditory models should not discard across-channel phase information. In support of this argument, they presented a model that accounts for this information and analyzed its response to stimuli from past phase-perception experiments. The experiments and their results of interest are as follows:
Carlyon and Shamma analyzed the output of their model via a spectrogram, using a linear approximation to the action of higher central auditory stages. They acknowledge the limitation that this spectrogram does not vary with input level, and they suggest resolving the problem with cochlear filter banks. Their graphs showed that asynchronous stimuli generate greater excitation among the auditory stages than synchronous stimuli. The exception was the Craig and Jeffress stimuli, for which the model responded to both synchronous and asynchronous stimuli. They attributed this to the harmonic relation between the two tones: because the higher tone was a harmonic of the lower one, even the synchronous stimulus generated large phase transitions at the peaks.
In addition to their own model, Carlyon and Shamma tested other models with the same stimuli. Meddis's autocorrelation model did not respond to the asynchronies among Carlyon and Shackleton's groups of unresolved harmonics or Yost and Sheft's envelopes of AM tones. Roy Patterson's model gave the same (correct) response for Craig and Jeffress's stimuli.
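One reason an autocorrelation stage can miss phase asynchronies is that the autocorrelation of a signal depends only on its power spectrum, not on the phases of its components. The following is a toy sketch of that property, not an implementation of Meddis's model: it omits the filter bank and hair-cell stages, uses cosine-phase components, and the function names, the 125 Hz fundamental, the 8 kHz sample rate, and the ±40 degree alternating phases are all illustrative assumptions.

```python
import math


def harmonic_complex(phases_deg, f0=125.0, fs=8000.0, n_samples=256):
    """Sum of unit-amplitude cosine harmonics 1..len(phases_deg) of f0.

    With the defaults, 256 samples hold exactly four periods of f0,
    so every harmonic completes a whole number of cycles.
    """
    return [
        sum(
            math.cos(2 * math.pi * (h + 1) * f0 * n / fs + math.radians(p))
            for h, p in enumerate(phases_deg)
        )
        for n in range(n_samples)
    ]


def circular_autocorrelation(x):
    """r[k] = sum over n of x[n] * x[(n + k) mod N]."""
    size = len(x)
    return [
        sum(x[n] * x[(n + k) % size] for n in range(size))
        for k in range(size)
    ]


# Illustrative stimuli: eight harmonics, constant phase vs. alternating phase.
cph = harmonic_complex([0] * 8)
aph = harmonic_complex([40, -40] * 4)

waveform_diff = max(abs(a - b) for a, b in zip(cph, aph))
acf_diff = max(
    abs(a - b)
    for a, b in zip(circular_autocorrelation(cph), circular_autocorrelation(aph))
)

print(f"largest waveform difference:        {waveform_diff:.3f}")
print(f"largest autocorrelation difference: {acf_diff:.2e}")
```

The two waveforms differ clearly, while their autocorrelations agree to within rounding error. In a full model the nonlinear hair-cell stage precedes the autocorrelation, so this sketch only indicates where phase information can be lost.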
Aside from the limitation of their spectrogram, Carlyon and Shamma noted that the cochlear filters in their model are broader than their physiological counterparts. Their proposed solution is to reintroduce a lateral inhibitory network, which would make the filters more selective than Patterson's gammatone filter.
Regaip Sen has used LISP to create a series of wave files that duplicate the stimuli used in the Patterson and Craig/Jeffress experiments. See Appendix A for the code. For the Patterson experiments, the APH stimuli were expanded to those with the following phase shifts:
We can do several things to improve the results. First, we need a better way to integrate the cochlear and hair cell responses. Second, we need to tune the model bandwidths, which should resolve the quantitative inconsistency. Finally, we need to retest Patterson's model with all the test cases.
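To make the bandwidth-tuning step concrete, here is a minimal sketch of how a gammatone filter's bandwidth parameter shapes its impulse response. It is not taken from the MATLAB code used in this study. It assumes the Glasberg and Moore (1990) ERB approximation and the commonly used factor of 1.019 relating the gammatone bandwidth to the ERB; `bandwidth_scale` is a hypothetical tuning knob, and all function names are illustrative.

```python
import math


def erb_hz(center_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)


def gammatone_bandwidth(center_hz, bandwidth_scale=1.0):
    """Bandwidth parameter b; bandwidth_scale is a hypothetical tuning factor."""
    return 1.019 * bandwidth_scale * erb_hz(center_hz)


def gammatone_impulse_response(center_hz, fs=16000.0, duration=0.025,
                               order=4, bandwidth_scale=1.0):
    """g(t) = t^(order-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t), unnormalized."""
    b = gammatone_bandwidth(center_hz, bandwidth_scale)
    n_samples = int(round(duration * fs))
    return [
        (n / fs) ** (order - 1)
        * math.exp(-2 * math.pi * b * n / fs)
        * math.cos(2 * math.pi * center_hz * n / fs)
        for n in range(n_samples)
    ]


def envelope_peak_time(center_hz, order=4, bandwidth_scale=1.0):
    """Time in seconds at which the envelope t^(order-1) * exp(-2*pi*b*t) peaks."""
    b = gammatone_bandwidth(center_hz, bandwidth_scale)
    return (order - 1) / (2 * math.pi * b)


for scale in (0.5, 1.0, 2.0):
    peak_ms = 1000.0 * envelope_peak_time(1000.0, bandwidth_scale=scale)
    print(f"bandwidth_scale={scale}: envelope peaks at {peak_ms:.2f} ms")
```

Narrowing the filter (a smaller `bandwidth_scale`) delays the envelope peak and lengthens the ringing, which is the trade-off any bandwidth tuning would have to balance against frequency selectivity.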
Carlyon, R. P., and Shackleton, T. M. (1994). "Detecting pitch-pulse asynchronies and differences in fundamental frequency," J. Acoust. Soc. Am. 95, 968-979.
Carlyon, R. P., and Shamma, S. (2003). "An account of monaural phase perception," J. Acoust. Soc. Am. 113, 333-348.
Craig, J. H., and Jeffress, L. A. (1962). "Effect of phase on the quality of a two-component tone," J. Acoust. Soc. Am. 34, 1752-1760.
Goldstein, J. L. (1966). "Auditory spectral filtering and monaural phase perception," J. Acoust. Soc. Am. 42, 458-479.
Moore, B. (1997). An Introduction to the Psychology of Hearing, 4th ed. San Diego: Academic Press.
Patterson, J. H., and Green, D. M. (1970). "Discrimination of transient signals having identical energy spectra," J. Acoust. Soc. Am. 48, 894-905.
Patterson, R. D. (1987). "A pulse ribbon model of monaural phase perception," J. Acoust. Soc. Am. 82, 1560-1586.
Yost, W. A., and Sheft, S. (1989). "Across-critical-band processing of amplitude-modulated tones," J. Acoust. Soc. Am. 85, 848-857.
Appendix A: LISP code for test cases
;;; USES CM-CLM-CMN LISP LIBRARIES
;;; FOR definstrument AND make-score
;;; To create samples from the CCRMA network,
;;; save the following code as "make-samples.lisp"
;;; then run the following lines from the terminal:
;;;   /usr/bin/clisp-cm-clm-cmn
;;;   (compile-file "make-samples" :verbose nil)
;;;   (load *)
;;;   (make-samples-patterson)
;;;   (make-samples-craig-jeffress)

(definstrument partial (start dur frequency freqskew amplitude
                        freq-envelope amp-envelope phase)
  (let* ((gls-env (make-env :envelope freq-envelope
                            :scaler (hz->radians freqskew)
                            :duration dur))
         (os (make-oscil :frequency frequency :initial-phase phase))
         (amp-env (make-env :envelope amp-envelope
                            :scaler amplitude
                            :duration dur))
         (len (inexact->exact (round (* *srate* dur))))
         (beg (inexact->exact (round (* *srate* start))))
         (end (+ beg len)))
    (run
     (loop for i from beg to end do
       (outa i (* (env amp-env) (oscil os (env gls-env))))))))

(defun make-samples-patterson ()
  (let* ((x 0)
         (freq_env '(0 1 1 1))
         (amp_env '(0 1 1 1))
         (spec_a (make-array 32 :initial-contents
                   '(0 84 64 52 43 37 33 29 26 23 21 20 18 17 16 15
                     14 13 12 12 11 10 10 9 9 9 8 8 8 7 7 7)))
         (spec_b (make-array 32 :initial-contents
                   '(0 334 257 208 174 149 130 116 104 94 86 79 73 67 63 59
                     55 52 49 46 44 42 40 38 36 35 33 32 31 29 28 27)))
         (spec_c (make-array 32 :initial-contents
                   '(0 1336 1026 830 695 597 522 463 415 375 343 315 290 269 251 235
                     220 208 196 185 176 167 159 152 145 139 133 127 122 118 113 109)))
         (spec_d (make-array 32 :initial-contents
                   '(0 5344 4104 3321 2781 2387 2087 1850 1659 1502 1370 1258 1162 1078 1004 940
                     882 830 784 741 703 668 636 607 579 554 531 509 489 470 453 436))))
    ;;; EACH (with-sound... STATEMENT CREATES SAMPLES FOR 62.5, 125, 250, AND 500 HZ.
    (loop for hz in '(62.5 125 250 500) do
      (loop for harm in '(4 8) do
        (loop for hh in '(0 3 7 15) do
          ;;; CPH, RPH
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/cph~Ah~A-~Aharm.wav" hz hh harm))
            (setf x (* hz hh))
            (loop for i from hh below (+ hh harm) do
              (setf x (+ x hz))
              (partial 0 0.256 x x .03 freq_env amp_env 0)))
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/rph~Ah~A-~Aharm.wav" hz hh harm))
            (setf x (* hz hh))
            (loop for i from hh below (+ hh harm) do
              (setf x (+ x hz))
              (setf p (* pi (/ (random 360) 180)))
              (partial 0 0.256 x x .03 freq_env amp_env p)))
          ;;; APH (5-90 DEGREES, FROM 0/4/8/16th HARMONICS)
          (loop for ph from 1 below 21 do
            (with-sound (:srate 44100 :header-type mus-riff
                         :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/aph~Ap~Ah~A-~Aharm.wav" hz ph hh harm))
              (setf x (* hz hh))
              (loop for i from hh below (+ hh harm) do
                (setf x (+ x hz))
                (setf p (if (= (/ i 2) (floor (/ i 2)))
                            (/ pi (/ 360 ph))
                            (- (* 2 pi) (/ pi (/ 360 ph)))))
                (partial 0 0.256 x x .03 freq_env amp_env p))))
          (loop for ph from 65 below 81 do
            (with-sound (:srate 44100 :header-type mus-riff
                         :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/aph~Ap~Ah~A-~Aharm.wav" hz ph hh harm))
              (setf x (* hz hh))
              (loop for i from hh below (+ hh harm) do
                (setf x (+ x hz))
                (setf p (if (= (/ i 2) (floor (/ i 2)))
                            (/ pi (/ 360 ph))
                            (- (* 2 pi) (/ pi (/ 360 ph)))))
                (partial 0 0.256 x x .03 freq_env amp_env p))))
          (loop for ph from 5 below 185 by 5 do
            (with-sound (:srate 44100 :header-type mus-riff
                         :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/aph~Ap~Ah~A-~Aharm.wav" hz ph hh harm))
              (setf x (* hz hh))
              (loop for i from hh below (+ hh harm) do
                (setf x (+ x hz))
                (setf p (if (= (/ i 2) (floor (/ i 2)))
                            (/ pi (/ 360 ph))
                            (- (* 2 pi) (/ pi (/ 360 ph)))))
                (partial 0 0.256 x x .03 freq_env amp_env p))))
          ;;; MPH (1/8, 1/2, 2, and 8-SCALAR)
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/mph~A_1_8_h~A-~Aharm.wav" hz hh harm))
            (setf x (* hz hh))
            (setf p 0)
            (loop for i from hh below (+ hh harm) do
              (setf x (+ x hz))
              (setf p (+ p (aref spec_a i)))
              (partial 0 0.256 x x .03 freq_env amp_env p)))
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/mph~A_1_2_h~A-~Aharm.wav" hz hh harm))
            (setf x (* hz hh))
            (setf p 0)
            (loop for i from hh below (+ hh harm) do
              (setf x (+ x hz))
              (setf p (+ p (aref spec_b i)))
              (partial 0 0.256 x x .03 freq_env amp_env p)))
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/mph~A_2_h~A-~Aharm.wav" hz hh harm))
            (setf x (* hz hh))
            (setf p 0)
            (loop for i from hh below (+ hh harm) do
              (setf x (+ x hz))
              (setf p (+ p (aref spec_c i)))
              (partial 0 0.256 x x .03 freq_env amp_env p)))
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/mph~A_8_h~A-~Aharm.wav" hz hh harm))
            (setf x (* hz hh))
            (setf p 0)
            (loop for i from hh below (+ hh harm) do
              (setf x (+ x hz))
              (setf p (+ p (aref spec_d i)))
              (partial 0 0.256 x x .03 freq_env amp_env p))))))))

(defun make-samples-craig-jeffress ()
  (let* ((freq_env '(0 1 1 1))
         (amp_env '(0 1 1 1)))
    (loop for db from 3 below 74 by 10 do
      (loop for ph in '(0 90) do
        (loop for dbA in '(40 50 60 70 80 84) do
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/exp1/craig~Aa~Ab~A.wav" ph dbA db))
            (partial 0 1 250 250 (* .8 (expt 10 (/ -20 dbA))) freq_env amp_env 0)
            (partial 0 1 500 500 (* .8 (expt 10 (/ -20 db))) freq_env amp_env ph))
          (with-sound (:srate 44100 :header-type mus-riff
                       :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/exp1/craig~Aa~Aib~A.wav" ph dbA db))
            (partial 0 1 250 250 (* -.8 (expt 10 (/ -20 dbA))) freq_env amp_env 0)
            (partial 0 1 500 500 (* -.8 (expt 10 (/ -20 db))) freq_env amp_env ph))))
      (loop for ph in '(45 135) do
        (with-sound (:srate 44100 :header-type mus-riff
                     :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/exp1/craig~Aa60b~A.wav" ph db))
          (partial 0 1 250 250 (* .8 (expt 10 (/ -20 60))) freq_env amp_env 0)
          (partial 0 1 500 500 (* .8 (expt 10 (/ -20 db))) freq_env amp_env ph))
        (with-sound (:srate 44100 :header-type mus-riff
                     :output (format nil "/usr/ccrma/snd/220a-2003/regosen/319/exp1/craig~Aa60ib~A.wav" ph db))
          (partial 0 1 250 250 (* -.8 (expt 10 (/ -20 60))) freq_env amp_env 0)
          (partial 0 1 500 500 (* -.8 (expt 10 (/ -20 db))) freq_env amp_env ph))))))
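The APH phase assignment in the listing above is compact enough to be easy to misread, so the following Python sketch mirrors just that one expression. It reflects one reading of the Lisp `if` form: loop indices that are even receive π·ph/360 radians, and odd indices receive 2π minus that value. The function names are illustrative and are not part of the original code.

```python
import math


def aph_phase(i, ph):
    """Initial phase in radians for loop index i and file parameter ph.

    Mirrors (if (= (/ i 2) (floor (/ i 2))) ...) from the listing:
    even i gets pi*ph/360, odd i gets 2*pi - pi*ph/360.
    """
    half_shift = math.pi * ph / 360.0
    return half_shift if i % 2 == 0 else 2 * math.pi - half_shift


def wrapped_degrees(radians):
    """Express a phase in degrees within the interval (-180, 180]."""
    deg = math.degrees(radians) % 360.0
    return deg - 360.0 if deg > 180.0 else deg


for ph in (5, 90, 180):
    even_deg = wrapped_degrees(aph_phase(4, ph))
    odd_deg = wrapped_degrees(aph_phase(5, ph))
    print(f"ph={ph}: even index {even_deg:+.1f} deg, odd index {odd_deg:+.1f} deg")
```

Under this reading, each component is shifted by plus or minus half of `ph` degrees, so `ph` corresponds to the total phase difference between adjacent components.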
Appendix B: Results of Roy Patterson's Experiments taken from his article