Matt Wright, 2/11/4 through 3/8/4
For various systems I want to know the total latency between making a physical gesture and getting the resulting sound. So far I have tested these systems:
I synthesized a band-limited impulse train (using CLM's sum-of-cosines unit generator with this code). I play the exact same sound out of two channels: one is plugged directly into a recorder, and the other goes into the test computer's audio input, through whatever software configuration I want to test, out the test computer's audio output, and into the second channel of the recorder.
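For readers without CLM, the sum-of-cosines idea can be sketched in Python with NumPy. This is not the CLM code referred to above; the function name, the parameters, and the choice to include every harmonic below the Nyquist frequency are assumptions made for illustration.

```python
import numpy as np

def impulse_train(f0_hz, sr, duration_s):
    """Band-limited impulse train built as a sum of equal-amplitude cosines.

    Hypothetical helper (not the CLM code): it adds every harmonic of
    f0_hz that lies strictly below the Nyquist frequency sr / 2.
    """
    # Largest k such that k * f0_hz < sr / 2
    n_harmonics = int(np.ceil(sr / (2.0 * f0_hz))) - 1
    if n_harmonics < 1:
        raise ValueError("f0_hz must be below the Nyquist frequency")
    t = np.arange(int(round(duration_s * sr))) / sr
    x = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        # All cosines are in phase at t = 0, which is where the peak forms
        x += np.cos(2.0 * np.pi * k * f0_hz * t)
    return x / n_harmonics  # normalize so each peak reaches about 1.0
```

Because all the cosines line up once per period, the result has one narrow peak every `sr / f0_hz` samples, which is what makes the delay between the two recorded channels easy to read off.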
Then I use this MATLAB program to mark the local maxima of each signal and find the audio latency. I attribute the measured jitter to the fact that these band-limited impulses are not exactly impulses and tend to be flat at their maxima for 3-5 samples.
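The peak-marking step might look roughly like the following Python sketch. It is not the MATLAB program mentioned above; the relative threshold, the minimum gap between impulses, and the assumption that the peaks in the two channels pair up one-to-one in order are all illustrative choices.

```python
import numpy as np

def peak_indices(x, rel_threshold=0.5, min_gap=100):
    """Index of the largest-magnitude sample in each burst of a signal.

    Hypothetical helper: samples above rel_threshold * max are grouped
    into bursts separated by more than min_gap samples.
    """
    mag = np.abs(np.asarray(x, dtype=float))
    above = np.flatnonzero(mag > rel_threshold * mag.max())
    if above.size == 0:
        return np.array([], dtype=int)
    # Start a new group wherever consecutive hits are far apart
    breaks = np.flatnonzero(np.diff(above) > min_gap) + 1
    groups = np.split(above, breaks)
    # argmax returns the first of several equal maxima, so a flat-topped
    # peak can be marked a few samples early or late
    return np.array([g[np.argmax(mag[g])] for g in groups])

def latencies_ms(direct, delayed, sr, rel_threshold=0.5, min_gap=100):
    """Per-impulse latency in milliseconds, assuming peaks pair up in order."""
    p0 = peak_indices(direct, rel_threshold, min_gap)
    p1 = peak_indices(delayed, rel_threshold, min_gap)
    n = min(len(p0), len(p1))
    return (p1[:n] - p0[:n]) * 1000.0 / sr
```

Picking a single sample out of a flat-topped maximum is inherently ambiguous by a few samples, which is consistent with the jitter described above.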
Here is a picture of my experimental setup:
As you can see, the microphone is pointed directly at the keyboard to record the acoustic sound of physically hitting the keys. For this experiment I tried to strike each key quickly and sharply with a fingernail to produce the most impulsive sound possible. The microphone was placed just a few centimeters from where the keys were struck, so that the sound propagation delay can be ignored (at roughly 343 m/s, sound covers a few centimeters in about 0.1 ms).
This signal from the microphone goes into the left input of the omni i/o audio interface. The sound output of the PowerBook (or other device) goes into the right input of the same audio interface. Thus, my sound file has the "stimulus" in the left channel and the "response" in the right channel. I can then look at it in a sound editor and "see" the latency, like this:
At first I marked the times of the stimuli and responses by eye in a sound editor. Here is a page describing this method and the results. Now I use a MATLAB program to find the events automatically and measure the latencies.
As you can see from the above images, the question of when the trigger event happened is somewhat subjective. In other words, the waveform of the sound of hitting the keys isn't just a huge impulse at the instant that the key was struck. I ended up deciding to use the local amplitude maximum of the "stimulus" signal as the instant of the trigger.
The other side of the problem was much easier, since I have complete control over what sound the computer will generate when it detects a keypress. I used this trivial Max/MSP patch (which you can download) to produce a 1 kHz sinusoid with an amplitude envelope that goes instantly to full volume at the start of each note and then decays quickly. Since the frequency is 1 kHz, the period is 1 ms, so any error from not knowing the initial phase will be less than a millisecond.
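The phase argument can be checked numerically with a short Python sketch. This is not the Max/MSP patch; the exponential decay, its time constant, the sample rate, and the detection threshold are assumptions chosen only to illustrate the bound.

```python
import numpy as np

def tone_burst(freq_hz=1000.0, sr=44100, duration_s=0.05,
               decay_s=0.01, phase=0.0):
    """Sinusoid whose envelope starts at full level and decays exponentially.

    Hypothetical stand-in for the response sound; decay_s is an assumed
    time constant, not a value taken from the patch.
    """
    t = np.arange(int(round(duration_s * sr))) / sr
    return np.sin(2.0 * np.pi * freq_hz * t + phase) * np.exp(-t / decay_s)

def onset_delay_ms(burst, sr, threshold=0.05):
    """Time in ms from the start of the burst to the first sample above threshold."""
    above = np.flatnonzero(np.abs(burst) > threshold)
    return None if above.size == 0 else above[0] * 1000.0 / sr

# Worst case over a sweep of unknown starting phases
worst_ms = max(onset_delay_ms(tone_burst(phase=p), 44100)
               for p in np.linspace(0.0, 2.0 * np.pi, 32))
```

Whatever the starting phase, the waveform must leave zero within a fraction of its 1 ms period, so `worst_ms` stays below one millisecond under these assumptions.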
For the drum machine I selected the most impulsive sample, a closed hi-hat.
My MATLAB program simply looks for the first sample above the noise floor and calls that the instant of the response.
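The two detection rules (local amplitude maximum for the stimulus, first sample above the noise floor for the response) can be sketched in Python as follows. This is not the MATLAB program itself; how each event's search window is chosen and what value is used for the noise floor are left to the caller and are assumptions of this sketch.

```python
import numpy as np

def stimulus_index(left, start, stop):
    """Sample index of the largest |amplitude| in left[start:stop]."""
    seg = np.abs(np.asarray(left[start:stop], dtype=float))
    return start + int(np.argmax(seg))

def response_index(right, start, noise_floor):
    """First sample at or after start whose |amplitude| exceeds noise_floor."""
    seg = np.abs(np.asarray(right[start:], dtype=float))
    above = np.flatnonzero(seg > noise_floor)
    return None if above.size == 0 else start + int(above[0])

def latency_ms(left, right, sr, start, stop, noise_floor):
    """Latency of one event: stimulus peak to response onset, in milliseconds.

    start/stop bound the search window for the key strike (assumed to be
    supplied by the caller); the response is searched from the stimulus on.
    """
    s = stimulus_index(left, start, stop)
    r = response_index(right, s, noise_floor)
    return None if r is None else (r - s) * 1000.0 / sr
```

Searching for the response only from the stimulus index onward keeps the tail of a previous note from being mistaken for the new one, provided that tail has already decayed below the noise floor.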
Here's how it works:
I triggered about 20 notes for each experiment and measured the latency with the MATLAB script described above. These graphs are histograms of the latency for each note for each experimental condition. The red cross shows the mean (by its horizontal position) and standard deviation (by the width of the cross bar); there is no information in the vertical position.
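The numbers behind each graph could be computed along these lines in Python. The bin count and the use of the sample standard deviation (ddof=1) are assumptions; the original plots may have used different choices.

```python
import numpy as np

def summarize_latencies(latencies_ms, bins=10):
    """Mean, standard deviation, and histogram of per-note latencies.

    Hypothetical helper: returns (mean, std, counts, bin_edges), with the
    standard deviation computed as the sample estimate (ddof=1).
    """
    lat = np.asarray(latencies_ms, dtype=float)
    counts, edges = np.histogram(lat, bins=bins)
    return lat.mean(), lat.std(ddof=1), counts, edges
```

The mean gives the horizontal position of the cross and the standard deviation the width of its bar; `counts` and `edges` are what a bar plot of the histogram would draw.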
Space bar, Core Audio built-in sound, iovs = sigvs = 64, Overdrive off, latencytest1.wav
Space bar, Core Audio built-in sound, iovs = sigvs = 64, Overdrive on, SIAI off, latencytest2.wav
Space bar, Core Audio built-in sound, iovs = sigvs = 64, Overdrive on, SIAI on, latencytest3.wav
'n' key, Core Audio built-in sound, iovs = sigvs = 64, Overdrive on, SIAI on, latencytest4.wav
Wacom tablet, Core Audio built-in sound, iovs = sigvs = 64, Overdrive on, SIAI on, latencytest5.wav
Space bar, Rimas Box, iovs = sigvs = 64, Overdrive on, SIAI on, latencytest6.wav
Wacom tablet, Rimas Box, iovs = sigvs = 64, Overdrive on, SIAI on, latencytest7.wav
Kawai R-100 drum machine, Hi Hat trigger pad, latencytest8.wav