The Precipice of the Uncanny Valley

Applying Genetic Algorithms to Timbre Mimicking

What is timbre mimicking?

Timbre mimicking is the development of algorithms for imitating the timbre of existing sounds. It's often used to recreate known sounds, like those of acoustic instruments, using new synthesis techniques.

The algorithm

I use genetic algorithms to mimic the timbre of a target sound using granular synthesis with a source sound. The algorithm breaks the source sound into short grains. Then, it rearranges them randomly in order to generate a population of organisms — estimate solutions that approximate the target sound. The organisms are evaluated for how well they mimic the target sound, or how high their fitness is. Organisms that perform well are kept, mated, and mutated over time. After many generations, the population evolves to have a high fitness, and the organisms mimic the target timbre very closely.

Read more about genetic algorithms.
See the algorithm's implementation in ChucK.

The aesthetic

I am framing the results of this algorithm in the aesthetic of the uncanny valley. The uncanny valley relates the degree to which something is human or machine to our emotional response to it. In general, we respond more positively the more human something becomes, but objects that are not quite human evoke a sharp negative reaction instead.

Using genetic algorithms, this project aims to thoughtfully create new timbres that exist along a continuous timbre space between humanity and machinery. The uncanny valley aesthetic grounds the algorithm in this purpose. It provides direction for design choices I make about the inner workings of the algorithm and about the kinds of inputs I feed it. Reciprocally, the algorithm allows me to demonstrate the existence of the uncanny valley in timbre!

Read more about the uncanny valley.

Sound Examples

Here are a few examples of the output of the genetic algorithm. In each example, a different source sound is used to mimic this target violin sound:

This demo uses a recording of a human voice:

This demo uses a complex synthesized note:

This demo mimics the target with itself:

This demo uses a longer source sound with multiple pitches. In general, the more grains the algorithm has to choose from, and the more variation there is in the grains, the longer the algorithm takes to arrive at a plausible solution. This example still manages to get pretty far in just 1000 generations:

About genetic algorithms.
About the uncanny valley.