Center for Computer Research in Music and Acoustics (CCRMA)
Stanford University

two reviews of
100 Untitled Works in Mill Aluminum by Donald Judd (1982-1986)

“Virtuosity is craft shaped by the imperatives of meaning.”

Jed Perl, Oh, Cool
The New Republic, Sep, 2003

“Amid Judd’s aluminum boxes, I started to entertain the possibility that the meaning of his art wasn’t something that resided just beyond my grasp, but something that lay in the grasping itself.”

Leslie Jamison, The Minimalist Who Wasn't
The Atlantic, Oct, 2020

Jamison continues, “Arguably the beating pulse of the entire compound is an installation—housed in two converted artillery sheds—that comprises 100 boxes made of mill aluminum, whose gleaming surfaces reflect the Texas sky in all its shifting moods. As I stood among them, the glinting lines and angles of the boxes conveyed the precision of their construction and their subtle variations. Some had open walls; some were entirely closed; some were sliced in half by partitions. But the effect of the entire installation was more sweeping, far less tightly controlled, almost dizzying.”

“These aluminum boxes weren’t just boxes. They held the weather itself: clouds swollen with rain, or a horizon painted by the burlesque of sunset. They were cubes made of sky; their faces carved the light into radiant squares. ... Judd’s installations revealed ways of carving up the world that could hold its infinitude rather than stifling it. That’s what these boxes felt like, slices of infinitude, as if light were a creature, and this was one of its natural habitats.”

“The boxes were more dynamic than they appeared, expanding and contracting with changes in the temperature—almost as if they were alive, only in a way we couldn’t see, could barely even recognize. Their sublimity lay on the other side of all my attempts to summon them with language—these habitats of light, cubes of sky, sustained by quiet, metallic respiration. “To see is to forget the name of the thing one sees,” the poet Paul Valéry once said, and those boxes made me forget their names. They brought my sight to life. They asked me to see absence in terms of presence.”

Returning to Jed Perl's article, Oh, Cool, “That machine-made art can hold us with its lyric force is clear to me, certainly from the hundred aluminum boxes that are Donald Judd's central achievement at the Chinati Foundation in Marfa, Texas...”

For the sake of exploration, I am going to pose a question: can lyric force or virtuosity ever become “virtual”? This prospect could be an engineering feat of grand challenge proportions, or it could be a music-making fantasy of the kind that drives composers to experiment with algorithm design. Or it could ultimately be something that resides primarily in our ears, as I will explain.

Years ago one of my teachers, Roger Reynolds, pointed me to a hyper version of machine development: David Brin's novel The Practice Effect.

A 2014 Gizmodo piece, 10 Great Novels That Will Make You More Passionate About Science, singles out The Practice Effect. The imagined setting is one, writes Charlie Jane Anders, in which “...scientists succeed in creating a device that manipulates space and time — and they're able to use it to travel to another planet, which is very similar to Earth. Except on this other planet, the second law of thermodynamics works differently: Objects don't get worn out, and in fact get stronger the longer they're used.”

Let's have some sound examples. Here are three which I'll play again later on.

A catchy melody, synthesized in 1984, which involved neither human musicians nor moving parts.

(30 sec)

A Disklavier solo, recorded in 2006, which is actually a duo with an algorithm.

(15 sec)

A stream of audio samples at 16 kHz, the result of feeding raw audio from human performances into a neural network in 2016.

(35 sec)

I've been fortunate to have teachers who encouraged my interest in music and science and their intersections. Roger Reynolds' book Mind Models came out in 1975. What I present here started out as a contribution to a festschrift compiled just before Mind Models was updated with a second edition in 2005. I've taken this opportunity to look back on what I wrote then and to illustrate the original points by citing recent projects. It feels slightly like time travel to inhabit thoughts of 17 years ago using my present voice. But here we go.

Autonomous virtuosity.

“Maybe he could turn the Practice Effect to his advantage, though he suspected it would work quite differently for a sophisticated instrument than for an ax or a sled. The very idea was too fresh and disconcerting for the scientist in him to dwell on yet.”

David Brin, The Practice Effect (1984)
(chapter, “The Best Way to Carnegie Hall”)

I've never been able to shake the image of evolution in Brin's world, where progress happens much faster and is guided by human need, while I myself progress at a much slower pace with my much less robust algorithms for automatic performance and composition. Perhaps music is one place where Brin's world is naturally the case.

Let's posit: autonomous = detachable,
virtuosity = “craft shaped by the imperatives of meaning”

Deft use of an instrument or the voice develops through practice. The hands and ears of a strong player acquire the ability to skillfully sculpt sound into musical patterns with uncanny temporal precision and acoustical shading, a growth which Frank Wilson, neurologist and hand specialist, says “illustrates the de facto evolution of intelligence taking place right under our noses.”

Frank Wilson, The Hand: How Its Use Shapes the Brain, Language, and Human Culture (1999)

Clara Rockmore, theremin, recorded in 1976

(3 min)

...and again, that deep neural network.

(35 sec)

These milestones are forty years apart. The Nov, 2016 WaveNet example (remixed with apologies) is the result of an unsupervised deep neural network that was essentially allowed to babble musically. Autonomous? -- Yes. Virtuosic? -- That's the question which gets to the point of this presentation. It is a most impressive demonstration of machine learning applied in music synthesis and, at least to this listener's ears, it's a harbinger of new possibilities.

Recorded performances of romantic piano music played in a concert hall provided the training data. That's what went in, and what came out was something recognizably romantic-, human-, piano-, concert-hall-like. What are its limits? Is it evolving? Is it virtuosic in and of itself and/or is it a tool for human use?


Limits? -- duration. Evolving? -- yes, as a technology. See P. Verma and C. Chafe, “A Generative Model For Raw Audio Using Transformer Architectures,” DAFX 2020.

Computer models in music increasingly supplant everything from acoustical instruments to sound engineers. Autonomous agents can play well-defined (but mechanical) parts in narrowly-defined musical styles or construct those parts. They can incorporate modules for analytic listening which monitor the musical environment and their contribution to it.

Over time the sophistication of the agents' constituent algorithms inexorably grows. The agents recognized as the first virtuosi will exhibit traits and qualities by which artists are likewise recognized. “Craft” in music equates with chops, that quality which builds with practice, whether in a performer, composer, luthier or sound engineer, and which makes the musician. Our presumptive virtuosi have their chops intact.

Robotic vehicles, so beautifully exemplified by the Mars rovers, are also “crafty” in their own right, designed in engineering domains where robustness is the equivalent of musical chops. These craft are able to explore semi-autonomously in incredibly remote places. Their independence has evolved over decades as on-board decision making gains in sophistication. The supervisory aspects transmitted to rovers can move to higher and more abstract levels as algorithms running automatically in the system's “brain” become more robust.

This week, from NASA: “When we as human beings look at moving images of the ground, such as those taken by Ingenuity’s navigation camera, we instantly have a pretty good understanding of what we’re looking at. We see rocks and ripples, shadows and texture, and the ups and downs of the terrain are relatively obvious. Ingenuity, however, doesn’t have human perception and understanding of what it’s looking at. It sees the world in terms of individual, anonymous features – essentially dots that move around with time – and it tries to interpret the movement of those dots.”

“To make that job easier, we gave Ingenuity’s navigation algorithm some help: We told it that those features are all located on flat ground. That freed the algorithm from trying to work out variations in terrain height, and enabled it to concentrate on interpreting the movement of the features by the helicopter’s movements alone. But complications arise if we then try to fly over terrain that isn’t really flat.”

Monday, 5 July 2021, “Flight 9 Was a Nail-Biter...”

Compare Wilson's description from The Hand, “This evolution does not represent a change in biologic intelligence, but the establishment of a culturally defined and valued form of intelligent behavior through early and intense educational manipulation and subsequent rewarding of musicianship, both with special incentives for success and severe penalties for failure.”

Wilson is interested in the plasticity of the brain and in the idea that a musician's maturation is an expression of cultural progress.

“Their musical development is subject to circumstances so unusual and extreme that professional musicians are actually evolving as we watch. It is a virtual new species, however, because the information controlling the new musical intelligence and skill is imbedded in musical institutions and has no effect on the genetic composition of living individuals.”

This is a two-way street, since the set of norms and institutions is plastic too, a result of so many individuals' gifts back to culture. Music produces virtuosi in continuous streams. The sequences of teachers, and students who become teachers, form braided, merging and diverging schools, worldwide. Teachers cross tens of generations when charting, for example, the gharana of sarod or tabla on the Indian subcontinent.

Sakhavat Hussein Khan (1936) and the Lucknow-Shahjahanpur Gharana Archive, 2019 trailer plus Irfan Khan

(2 min)

Such histories emerge from deep time and are continuously evolving. Passed on from the teacher are both craft and a way of communicating meaning. Added by each individual is new meaning to be folded into the musical style. The folding-in is at the crux of virtuosity. Again, I would emphasize that this is a two-way street.

In 2014 Reza Shajarian visited our center to experiment with Hagia Sophia acoustics via live convolution reverb.

(2 1/2 min)
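For readers curious about what a convolution reverb does, here is a minimal offline sketch. It is not the system used in 2014, which is not described here: the function names, the toy impulse response and the `wet_mix` parameter are all assumptions made for illustration. The idea is that every sample of the dry signal launches a scaled copy of the room's impulse response, and the copies are summed.

```python
def convolve(dry, impulse_response):
    """Direct-form convolution: each input sample launches a scaled
    copy of the impulse response, and all the copies are summed."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for n, x in enumerate(dry):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def add_reverb(dry, impulse_response, wet_mix=0.5):
    """Blend the dry signal with its reverberated ("wet") version.
    wet_mix is a hypothetical control: 0.0 is all dry, 1.0 is all wet."""
    wet = convolve(dry, impulse_response)
    # Pad the dry signal with silence so the reverb tail can ring out.
    padded = list(dry) + [0.0] * (len(wet) - len(dry))
    return [(1.0 - wet_mix) * d + wet_mix * w for d, w in zip(padded, wet)]


# Toy data: a single click played into a three-sample decaying "room".
click = [1.0, 0.0, 0.0, 0.0]
toy_ir = [1.0, 0.5, 0.25]
result = add_reverb(click, toy_ir)
```

A live version would have to process audio in short blocks to keep latency low, typically with FFT-based partitioned convolution, and would use a measured impulse response of the actual space rather than three made-up numbers.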

There are also rovers in Brin's story which arrive ahead of humans and scout his world in much the same way as we now have a presence on Mars. The Practice Effect works its improvements and we get a glimpse of sharpened agents whose algorithmic abilities transcend “robustness.”

The initial point-of-view of the story's human protagonist clouds his appreciation of the phenomenon such that his partner scout seems to be going about its normal business. As events unfold, he learns that his limited expectation kept him from seeing the robot's enhanced capabilities. He discovers that it possesses extremely advanced decision making and an ability to communicate useful (meaningful) insights about the strange planet they are both destined to explore.

Point-of-view also constrains music listeners and by extension the ability of musical “institutions” to fold in newness. If autonomous agents developed virtuosity, we wouldn't know it unless our point-of-view permits us to. In other words, as a robust chops-rich agent-performer acquires “imperatives of meaning,” unless we are in a receptive state, that communication will be ignored. It is kind of like saying that we can't play our role in a Turing test unless we're listening to start with. Thinking back on this, institutions can be clouded in their appreciation of such phenomena when they evolve in human form, too.

Adapting our point-of-view so that we can hear liberally should be easier than answering the next question, which concerns the nature of the added substance and how it gets there. For players, there exist essences that drive their best playing. Some call it spirit, others zones of being, or resonances. No engineering spec to be found there, and our agent factory is going to be on standby for a long time before it completes a purchase order for quantities of spiritual essence. One hint about this added “input” comes from the body.

James Thompson plays Hendrix's Voodoo Chile

(2 min)

Wilson asks, “What might happen if the body no longer were to define, limit, or even help `calibrate' the brain's continuing experiment to expand its reach?” Breaking free of the body is not without reference to it. We could ask computer-mediated instruments to afford the same possibilities, so that designs for human/computer interfaces both couple with and transcend the body's constraints.

The musical “brain” is a larger concept, encompassing somatic extensions to the limbs and even the musical instruments themselves. Exogenous improvements have shaped the histories of musical meaning. Virtuosic instruments (“gesture amplifiers”) have evolved along with the skill of their performers.

Designs in musical synthesis along physical lines have already shown a dual nature of being grounded in known territory but free to escape. Virtuosic agents would be “practiced” by exercise in a world of synthetic “praxis” while their authors are listening and waiting for those moments of flight.

Automatically Improvised Hot Jazz, IRCAM (1984)

My violin physical model performer system was hooked up to the output of a melody improvisation generator. Using a set of rules designed by Philippe Gautron and André Hodeir, the algorithm created a violin line to fit a set of chord changes that were given as input. Pay no attention to the guitar's complete lack of expression.
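To give a flavor of what rule-based improvisation over chord changes can look like, here is a toy sketch. The rules are invented for illustration and are not the Gautron and Hodeir rule set, which is not reproduced here; the chord table, the function name `improvise` and all parameters are likewise assumptions. Strong beats land on the nearest chord tone, and weak beats move by step.

```python
import random

# Pitch classes (0 = C) of a few dominant seventh chords. Illustrative only.
CHORD_TONES = {
    "C7": [0, 4, 7, 10],
    "F7": [5, 9, 0, 3],
    "G7": [7, 11, 2, 5],
}


def improvise(changes, beats_per_chord=4, start=60, seed=0):
    """Return a list of MIDI note numbers, one per beat.

    Toy rule 1: on strong beats (even-numbered beats of each chord),
                jump to the nearest chord tone.
    Toy rule 2: on weak beats, move by a half or whole step.
    """
    rng = random.Random(seed)
    line = []
    pitch = start
    for chord in changes:
        tones = CHORD_TONES[chord]
        for beat in range(beats_per_chord):
            if beat % 2 == 0:
                # Thirteen consecutive pitches cover every pitch class,
                # so at least one candidate is always a chord tone.
                candidates = [p for p in range(pitch - 6, pitch + 7)
                              if p % 12 in tones]
                pitch = min(candidates, key=lambda p: (abs(p - pitch), p))
            else:
                pitch += rng.choice([-2, -1, 1, 2])
            line.append(pitch)
    return line


changes = ["C7", "F7", "G7", "C7"]
line = improvise(changes)
```

The resulting note list would still need a performer, which is the role the violin physical model played in the 1984 example.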

Doug James lab, Stanford Computer Science

(1 min)

study, Generative Adversarial Network (GAN)

Noah Berrie, “as a medium of sorts for computer-processed human expression”
(15-Jul, 2021)

(30 sec)

Klaus Kinski (Herzog's Aguirre, the Wrath of God, 1972 -- 15 sec)

Leslie Jamison, “Perhaps the sense of yearning I felt whenever I looked at Judd’s art wasn’t a sign that I was failing to encounter it. Instead of expressing Judd’s “particular feeling at the time,” these boxes made room for another kind of feeling instead—the energizing vertigo of figuring out how to approach beauty without the comfortable framework of a story line, of allowing it to speak to me subcutaneously, beneath the figurative skins of sense and symbolism.”

Roberto Morales, disklavier piano “duo” with live improv algorithm

excerpt from Morales, Mitchell & Chafe Trio, second set (7-Oct, 2006)

(1 1/2 min)

I'd like to close by paraphrasing a line from Brin in his chapter “Ballon d'Essai.”

“What couldn't our world of sound and music creators accomplish, simply by using the Practice Effect on a sophisticated little machine like that?” Maybe autonomous designs will simply be detachable.

Thank You!

this deck is available at

the original 2004 text is at