Center for Computer Research in Music and Acoustics (CCRMA)
Stanford University

Probing Rhythmic Synchronization
in our Mind’s Ear



It belongs to the essence of perception not only that it has in view a punctual now and not only that it releases from its view something that has just been, while 'still intending' it in the original mode of 'just-having-been', but also that it passes over from now to now and, in anticipation, goes to meet the new now. The waking consciousness, the waking life, is a living-towards, a living that goes from the now towards the new now.

E. Husserl, On the Phenomenology of the Consciousness of Internal Time (1893–1917)

Husserl's time diagram.
in B. Dainton, Temporal Consciousness, Stanford Encyclopedia of Philosophy (2010)

E. Husserl, Die 'Bernauer Manuskripte' über das Zeitbewußtsein (1917-1918)

...time presented itself to Husserl as the best candidate for the self-manifestation of consciousness: the ordering of past-present-future as a phase-continuum is not in and of itself the manifestness of time, but the lived tension of its passage, the upsurge of its movement, the being-lived-through (Erleben) [experience] of whatever order may otherwise belong to time.

J. Dodd, Reading Husserl’s Time-Diagrams from 1917/18 (2005)

Interlocking rhythm used to test the effect of temporal separation.

Subjects = students and staff at Stanford
(paired randomly)

Task = play rhythm accurately, keep an even tempo
(no strategies given)

Experimental setup.

(3ms delay each way,
metronome cue = mm94)

Delays tested.

(78ms delay each way,
metronome cue = mm90)

Online survey testing a simple adaptive synthetic tapper using Web Audio

A synthetic tapper (blue) and a human tapper (red) undergo a meltdown under a long-delay condition
(post-trial comment from the human) "This was really fun to do. That D beat was really crazy to follow. This would make a cool game. Thank you."

A micro-time look at leading and lagging.

Onset times, synchronization points and tempo curves for one trial.
(66ms delay each way, metronome cue = mm94)
A smoothed tempo curve is derived from the instantaneous tempi of both players' synchronized events.
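As a rough illustration of the caption above, the sketch below turns paired onset times into instantaneous tempi and then smooths them. It is only a sketch under stated assumptions: taking the midpoint of the two players' onsets as the "synchronization point", and using a centered moving average as the smoother, are choices made here for illustration and are not taken from the study. The onset values and function names are hypothetical.

```python
def instantaneous_tempi(times_s):
    """Tempo in beats per minute for each interval between successive events."""
    return [60.0 / (t1 - t0) for t0, t1 in zip(times_s, times_s[1:])]


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical onset times (seconds) for two players' paired events.
player_a = [0.00, 0.64, 1.28, 1.94, 2.62]
player_b = [0.02, 0.66, 1.30, 1.96, 2.64]

# Assumption: a synchronization point is the midpoint of the paired onsets.
sync_points = [(a + b) / 2.0 for a, b in zip(player_a, player_b)]

tempi = instantaneous_tempi(sync_points)
smoothed = moving_average(tempi)
```

With these made-up onsets the intervals lengthen over the trial, so the smoothed curve drifts downward from its starting tempo.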

Tempo curves of all trials grouped by delay.

Panels: 0 msec, 6 msec, and 40 msec RTT delay.
Coupled behavior from Web Audio project (coupled synthetic tappers).

Average tempo acceleration vs. delay (coupled human clappers).
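One hedged way to read "tempo acceleration" is as the slope of a straight line fitted to a trial's tempo curve, then averaged over the trials that share a delay. The sketch below assumes that definition (the deck does not spell out the computation), and the trial data are invented for illustration.

```python
def tempo_slope(times_s, tempi_bpm):
    """Least-squares slope of tempo against time, in BPM per second."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(tempi_bpm) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, tempi_bpm))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den


def average_slope(trials):
    """Mean slope over several (times_s, tempi_bpm) trials at one delay."""
    slopes = [tempo_slope(times, tempi) for times, tempi in trials]
    return sum(slopes) / len(slopes)


# Hypothetical trials at a single delay condition; values are made up.
trials_at_one_delay = [
    ([0.0, 10.0, 20.0, 30.0], [94.0, 93.0, 92.0, 91.0]),  # slowing down
    ([0.0, 10.0, 20.0, 30.0], [94.0, 93.5, 93.0, 92.5]),  # slowing less
]

mean_acceleration = average_slope(trials_at_one_delay)
```

A negative value means the pair slowed down on average; a positive value means they sped up.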

Human tempo acceleration and coupled oscillator model.
J.P. Cáceres, Synchronization in Rhythmic Performance with Delay, PhD Thesis, 2013.

Iran R. Roman, et al. Delayed feedback embedded in perception-action coordination cycles results in anticipation behavior during synchronized rhythmic action: A dynamical systems approach, PLOS Computational Biology, 2019.

Reading ahead

Planning ahead

Vamping automatically

...imply "thinking in sound"

"thinking in sound"



...auditory imagery


"thinking in sound"

...implies "thinking in time"

As performers and listeners, we are good examples of creatures who live in the "specious present."

The ways in which we think ahead in time, for example reading a bar ahead in a score we're playing or anticipating upcoming chord changes or tracing the overall arch of a phrase we're building in a jazz improv, these are all examples of planning and thinking in sound.

William James (following Kelly, and followed by Husserl and others) contributed descriptions of temporal flow in the mind.

We are constantly aware of a certain duration—the specious present—varying from a few seconds to probably not more than a minute, and this duration (with its content perceived as having one part earlier and another part later) is the original intuition of time.
William James, The Principles of Psychology (1890)

The practically cognized present is no knife-edge, but a saddle-back, with a certain breadth of its own on which we sit perched, and from which we look in two directions into time. The unit of composition of our perception of time is a duration, with a bow and a stern, as it were—a rearward- and a forward-looking end.
William James, The Principles of Psychology (1890)

It seems likely that some of the difference between a mechanistic model using two interacting synthetic clappers and two human clappers will reside in how such flows are engaged in performance.

Chris will now stop talking.

A different voice will be used as we continue the presentation.

It is the sound of your inner voice that you are hearing as you read these words.

How would you rate the clarity of the imagined sound?
  Perfect,  Clear,  Moderate,  Vague,  No image

adapted from the Vividness of Visual Imagery Questionnaire
(VVIQ Scale)
D. Marks, Visual Imagery Differences in the Recall of Pictures, Brit. J. Psych. (1973)

Is it located inside your head?

How 100 Mechanical Turk workers rated clarity...

...and location inside head.

Apparently, this is not harmful to Mechanical Turk workers.
good study, very interesting

This was a very interesting hit. It is one that really makes you listen and concentrate, thanks. It was very interesting to take this survey!

Is there any way to find out what this study is for? This was interesting to do.

This was an interesting task. I've never been asked to do anything quite like it before. This was fun to complete.

I found it interesting to not be able to imagine my voice coming from a spot outside myself or any voice saying mechanical turk including that of my mother. I could hear her voice in my mind but could not imagine or hear her saying mechanical turk. Odd and I wonder why that is.


very interesting study... a subject I've never thought about but makes complete sense.

Mechanical Turk is an Amazon micro-task dispensing service.

Surveys go out in parallel and can reach large numbers of subjects.

Definitely a different type survey. It was fun. Thanks for being creative!


This was a weird but interesting hit. I had never thought about this aspect of hearing or inner voices. I will work with it a bit and see what happens. I realized while doing it than depending on where my focus is, I can hear the sound/voice in different manners- in my head- my voice or a different voice, outside my head like someone else is speaking, in my head like someone else is speaking, in my head with the voice I usually hear speaking to me- whatever I think I can make physically happen.

i did my best, but some of the questions were not entirely clear to me.

As I began the task, I was able to imagine hearing my voice. Then as the task went on, it became more difficult to hear anything, because there was no sound.

This was interesting. I have very good hearing and often hear things other people don't. However, I wasn't very good at imagining a friend's voice or a sound coming from outside my head.

This was an unusual but very interesting study. It is almost like separating body and soul.

It's a fascinating topic and I found it a little difficult to describe in words, so having checkboxes was really helpful.

Daniel Schmicking poses the question "Is there imaginary loudness?" in an essay on phenomenological method. The article has merit both for its critique of existing approaches and because its conclusions suggest openings for further investigation. He proposes that "agreement procedures" could be developed to address the shortcomings identified in that critique.

"Cooperating phenomenologists then should be able to decide whether there is always quasi-loudness in auditory imagery, even if they could not claim apodicticity."

He makes a strong assertion that there is indeed imaginary loudness (quasi-loudness).
Schmicking, Daniel. 2005. Is there imaginary loudness? Reconsidering phenomenological method. Phenomenology and the Cognitive Sciences 4: 169–182.

A 2014 study with 50 Amazon Mechanical Turk (AMT) workers compared perceived and imagined loudness (quasi-loudness).

1) play back a voice recorded at reference level: "Amazon, Amazon..."

2) compare levels of a "mystery" test sound to that of the voice

Perceived: loudness of the mystery test sound vs. the sum of loudness judgements made relative to the pre-recorded voice
Judgements were scored -1 (quieter), 0 (same), or 1 (louder)
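The scoring described above can be sketched as a per-level sum of the -1 / 0 / 1 judgements. This is a minimal illustration only: the test-sound levels in dB and the response data below are hypothetical, and the actual analysis may have differed.

```python
from collections import defaultdict


def sum_judgements(responses):
    """Sum -1 / 0 / 1 judgements for each test-sound level.

    responses: iterable of (level_db, judgement) pairs, where judgement is
    -1 (quieter), 0 (same) or 1 (louder) relative to the reference voice.
    """
    totals = defaultdict(int)
    for level_db, judgement in responses:
        if judgement not in (-1, 0, 1):
            raise ValueError(f"unexpected judgement: {judgement!r}")
        totals[level_db] += judgement
    # Return levels in ascending order for easy plotting.
    return dict(sorted(totals.items()))


# Hypothetical responses: (test-sound level in dB re. the voice, judgement).
responses = [(-12, -1), (-12, -1), (0, 0), (0, 1), (12, 1), (12, 1)]
totals = sum_judgements(responses)
```

Plotting the totals against the level of each mystery test sound gives one curve per condition, which is the shape of comparison the slide describes.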

1) attend to your inner voice reading: "mechanical turk, mechanical turk..."

2) remember mystery test sound levels and compare to your inner voice

Perceived vs. Imagined: Compared to the perceived scale, the imaginary (quasi-) loudness scale is compressed and shifted.
(44 subjects kept, 6 rejected due to incomplete or inconsistent responses)

1) again, read: "mechanical turk, mechanical turk..."
2) imagine a mystery test sound at the same level
3) play the mystery test sounds in the browser and select the one closest to the imagined

Imagined and Perceived: locates "center" of imagined sound with respect to perceived

Future composition project: synthesize "inner hearing"
to play with imagined scales mapped onto perceived (for loudness, spatial, etc.)

On an individual level, I'm certain that the greater fraction of my own auditory experience is, in fact, internal. Modes of hearing and listening vary continually -- perceived, imagined, unconscious, conscious, reflex-triggering, enacting -- and it's those which take place on the proscenium of the imagination which are the least studied.

Surveys will attempt to characterize mental sound objects, object "chunking" and temporal flow with a particular goal of examining the "retention / protention" qualities named by Husserl. In these studies, browser-presented sounds and music, and inner voice will again figure in the designs – possibly engaging mental counting as a way of accessing and "time-stamping" events and objects in the "near now."

"Interaction synchrony" survey, starting with tapping to a beat.
2017 web audio examples

https://ccrma.stanford.edu/~cc/mturk17/

Straight metronome (blue)

Simple, adaptive synthetic tapper (red)
Human tapper (red)

adaptive synthetic tapper coupled with human tapper (above)
metronomes (below)

Panels: 0 msec, 6 msec, and 40 msec RTT delay.
synthetic tapper (blue) coupled with human tapper (red)

synthetic tapper (blue) coupled with human tapper (red), 78 msec delay

Two adaptive algorithms together form a recursion
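To make the recursion concrete, here is a toy sketch of two tappers that each adjust their next tap toward the partner's last tap as heard after a one-way delay. It assumes a simple linear phase-correction rule with a hypothetical gain `alpha`; it is not the algorithm used in the tapper project or the model in the cited thesis, and all names and parameter values are illustrative.

```python
def simulate_coupled_tappers(delay_s, n_taps=16, tempo_bpm=94.0,
                             alpha=0.5, offset_b_s=0.0):
    """Tap times (seconds) for two tappers that each correct toward the other.

    Each tapper hears the partner's previous tap delay_s late and shifts its
    next tap by alpha times the perceived asynchrony (assumed rule).
    """
    period_s = 60.0 / tempo_bpm
    taps_a = [0.0]
    taps_b = [offset_b_s]
    for _ in range(n_taps - 1):
        # What each tapper hears of the other's most recent tap.
        heard_by_a = taps_b[-1] + delay_s
        heard_by_b = taps_a[-1] + delay_s
        # A's output feeds B's correction and vice versa: the recursion.
        next_a = taps_a[-1] + period_s + alpha * (heard_by_a - taps_a[-1])
        next_b = taps_b[-1] + period_s + alpha * (heard_by_b - taps_b[-1])
        taps_a.append(next_a)
        taps_b.append(next_b)
    return taps_a, taps_b


def mean_tempo_bpm(taps_s):
    """Average tempo over a whole tap sequence."""
    return 60.0 * (len(taps_s) - 1) / (taps_s[-1] - taps_s[0])


no_delay_a, _ = simulate_coupled_tappers(delay_s=0.0)
long_delay_a, _ = simulate_coupled_tappers(delay_s=0.078)
tempo_no_delay = mean_tempo_bpm(no_delay_a)
tempo_long_delay = mean_tempo_bpm(long_delay_a)
```

Under this rule each tapper keeps waiting for a partner who always sounds late, so every interval stretches by `alpha * delay_s` and the pair settles at a slower tempo than the cue. That is a property of this toy rule only and should not be read as a fit to the human data.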

What is "now" if human time isn't clock time?

Effects of Auditory-Feedback Delays and Musical Roles on Coordinated Timing Asymmetries in Piano Duet Performance
Auriel Washburn, Matthew Wright & Takako Fujioka
CogSci Conf 2017

tempo curves vs. delay

About my Web Audio coding:

The 2017 tapper project uses JavaScript async / await and
Faust DSP code generation (asm.js)

It used ScriptProcessorNode (which is now deprecated)

A new version would use AudioWorklet for tighter timing.

With thanks to

Stéphane Letz, Greg Niemeyer, Juan Pablo Cáceres, Hongchan Choi, Steve Chafe, Rob Hamilton

John Granzow, June Holtz, Pauline Oliveros, Andrea Halpern, David Huron, Chryssie Nanou, Jonathan Berger, Jieun Oh

Andy Stuhl, Music 220 classes

Auriel Washburn, Matthew Wright, Takako Fujioka

deck.js Caleb Troughton

and plenty of others!

this slide deck available at