Evolutionary Music

From CCRMA Wiki
Revision as of 23:44, 26 April 2021 by Gdrtodd (Talk | contribs)

Week 2 Update

This week was mostly about settling on a feasible and well-defined project. I'm interested in both evolutionary models and birdsong, so I've been leaning towards something that can combine both. The tough thing when it comes to designing an evolutionary model of music production is coming up with a fitness function. A genetic algorithm needs access to some way to map a genotype to a fitness value, and it's not clear how this can be done when a genotype is a piece of music. What makes good music? Between two snippets, how can you decide which is better?
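To make the genotype-to-fitness mapping concrete, here is a minimal genetic-algorithm sketch. Everything in it is invented for illustration: the genotype is just a list of floats, and the "fitness" is distance to a made-up target vector (in the real project, the fitness function is exactly the open question).

```python
import random

# Toy sketch: a genotype is a list of floats (e.g. synthesizer parameters),
# and fitness is any function mapping a genotype to a score. The "target"
# here is made up purely for illustration.
TARGET = [0.2, 0.8, 0.5, 0.1]

def fitness(genotype):
    # Higher is better: negative squared distance to the hypothetical target.
    return -sum((g - t) ** 2 for g, t in zip(genotype, TARGET))

def mutate(genotype, rate=0.1):
    # Add small Gaussian noise to every gene.
    return [g + random.gauss(0, rate) for g in genotype]

def evolve(pop_size=50, generations=100):
    population = [[random.random() for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half as parents, refill with mutated children.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        population = parents + [mutate(random.choice(parents)) for _ in parents]
    return max(population, key=fitness)

best = evolve()
```

The hard part the paragraph above points at is precisely that for music, nothing like `TARGET` exists — which is what motivates borrowing a discriminator.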

Well, one thing you can do is ask how similar the piece of music is to pieces of music you know are good! There are more quantitative metrics of similarity than there are of quality. But rather than measure similarity directly, you could also get at it by asking how easy it is to tell the generated piece from one of the good pieces. This is the logic behind GANs, or generative adversarial networks. A GAN consists of two systems: a generator and a discriminator. The generator's job is to produce new examples that look like they came from a training dataset. For instance, you might train a GAN on a bunch of Picasso paintings and ask the generator to produce more. The discriminator's job is to take one of the generator's outputs alongside an example from the original dataset and try to determine which is which. The two systems push each other to improve, and the goal is for the discriminator's performance to eventually fall to random chance. At that point, there's no way to tell (from the computer's perspective, at least) the generated outputs from the original dataset. The canonical GAN works on images, but Stanford's very own Chris Donahue helped develop WaveGAN, which works over raw waveforms.
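The push-and-pull between the two systems can be illustrated with a deliberately tiny, non-neural stand-in (real GANs use neural networks trained by gradient descent; every piece of this is invented for illustration). The "real" data is noise around a fixed mean, the discriminator scores samples by closeness to its running estimate of that mean, and a one-parameter generator hill-climbs to fool it:

```python
import random

# Toy stand-in for the adversarial loop. "Real" data is Gaussian noise around
# REAL_MEAN; the generator's only parameter is mu, the center of its output.
REAL_MEAN = 10.0

def real_sample():
    return random.gauss(REAL_MEAN, 1.0)

class Discriminator:
    def __init__(self):
        self.estimate = 0.0

    def update(self, x):
        # Nudge the running estimate toward each real sample seen.
        self.estimate += 0.1 * (x - self.estimate)

    def score(self, x):
        # Higher = "looks more real" (closer to the estimated real mean).
        return 1.0 / (1.0 + (x - self.estimate) ** 2)

def avg_score(disc, mu, n=5):
    # Average discriminator score over a few generated samples.
    return sum(disc.score(random.gauss(mu, 1.0)) for _ in range(n)) / n

def train(steps=3000):
    disc, mu = Discriminator(), 0.0
    for _ in range(steps):
        disc.update(real_sample())  # discriminator step
        # Generator step: keep a perturbed mu if it fools the
        # discriminator better on average (crude hill-climbing).
        candidate = mu + random.gauss(0, 0.5)
        if avg_score(disc, candidate) > avg_score(disc, mu):
            mu = candidate
    return mu
```

After training, the generator's output distribution has drifted toward the real data, at which point the discriminator's closeness score can no longer separate the two — the toy analogue of the discriminator falling to chance.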

I'm interested in seeing if I can get WaveGAN to work for birdsong. I'm also interested in seeing whether evolving the weights of WaveGAN's generator (using the discriminator's error rate as the fitness function) can achieve results comparable to the canonical learning algorithm, but given the finickiness of GAN training, that might need to be a project for another day.
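That evolutionary variant could be sketched as follows. Everything here is a made-up stand-in, not anything from the WaveGAN codebase: a real WaveGAN discriminator is a trained convolutional network over waveforms, and the "generator" below is just weights shaping samples directly.

```python
import random

# Hypothetical sketch: use how well the generator fools a discriminator as a
# GA fitness. `toy_discriminator` is invented: it scores a sample in (0, 1],
# higher meaning "sounds more real", treating samples centered near 0 as real.
def toy_discriminator(sample):
    mean = sum(sample) / len(sample)
    return 1.0 / (1.0 + abs(mean))

def generate(weights, noise=0.01):
    # Invented "generator": the weights directly shape the output samples.
    return [w + random.gauss(0, noise) for w in weights]

def gan_fitness(weights, trials=20):
    # Fitness = average discriminator score on generated output: the better
    # the generated samples fool the discriminator, the fitter the weights.
    scores = [toy_discriminator(generate(weights)) for _ in range(trials)]
    return sum(scores) / len(scores)
```

The appeal is that `gan_fitness` could then be plugged into exactly the same mutate-and-select loop as any other genotype, with no gradients required.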

In the next week I'm hoping to lock down a dataset of birdsong (there seem to be multiple options!) and start digging into the WaveGAN paper. I haven't ever worked with GANs before, so I'm looking forward to it!

Week 3 Update

Just starting off the week with a bunch of links related to birdsong that I'm following.

- A paper on an evolutionary model of birdsong
- A paper modeling syringeal vibrations in songbirds
- An old MUSIC220A song on birdsong
- A pitch tracing application (for cleaning up birdsong field recordings)
- Bill Schottstaedt's page on music generation (including birdsong)
- A Csound synthesizer for birdsong
- Csound python examples
- Matlab implementation of birdsong simulation
- Snd homepage
- Synthetic bird sounds dataset

As should be clear from the mountain of links immediately above, I spent the majority of Week 3 doing research. After reading through the WaveGAN paper, I realized that it wouldn't be the best fit for the project. WaveGAN outputs actual waveforms, so in order to remain computationally feasible it's capped at 1 second of audio, which isn't enough to simulate some of the more interesting bird calls. Not to mention, the authors already tested their algorithm on birdsong!

So I started looking for some kind of synthesizer that would (ideally) have a small number of parameters. This took an unfortunately large amount of time. I kept bouncing back and forth between a paper I'd found on evolving birdsong that lacked any documentation except for a pointer to an outdated Csound synthesizer, a Matlab implementation using differential equations that I couldn't really parse, and the work of CCRMA's own Bill Schottstaedt from a number of years ago. It took some doing to get Scheme successfully installed on my laptop, but ultimately I got Bill's configuration to work! Now comes the hard part: digging into the code to understand which parameters do what, and how each species of bird could be rendered as some kind of computational genotype.

Also of note, my dad pointed me towards a paper on latent variable evolution which seems super relevant to my project and was written by my soon-to-be-mentor at NYU. It's a small world!

Week 4 Update

This week was spent familiarizing myself with Scheme (in general) and Bill Schottstaedt's code (in particular). Unfortunately there isn't much to show or write about here, but I do want to give a shout-out to Chris Chafe for walking me through some code line by line!

In terms of how to render the birdsong as a genotype, I'm again running into a bit of a snag. Bill's bird synthesizers, while amazingly high-fidelity, are extremely heterogeneous. Very little code is reused from one synthesizer to the next, making it very difficult to decompose the code enough to use genetic programming. So at the moment I'm leaning towards using a simple genetic algorithm to output parameters that replace the defaults in a given synthesizer. But this runs into another problem! The set of parameters that makes a good fit depends heavily on the specific synthesizer it's fed into: a set of parameters that sounds great as a loon could sound awful as a robin. This makes evaluating the fitness of any given genotype very challenging. To combat this, I'm considering first using a clustering approach to group the 150-or-so bird recordings into a few manageable buckets. That way, the genetic algorithm can hopefully be reasonably certain that its parameters will sound good for any synthesizer in the cluster. But to be honest, I have no idea if it will work!
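A minimal sketch of that clustering step, using plain k-means over per-recording feature vectors. The features themselves are invented here (say, normalized mean pitch and call duration); extracting meaningful features from the recordings is the part I'd still need to work out.

```python
import random

# Sketch: bucket recordings into clusters so a GA can evolve one parameter
# set per cluster. Each point is a feature vector for one recording; the
# features are hypothetical placeholders.
def kmeans(points, k=3, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest center (squared distance).
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centers[i])))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [[sum(dim) / len(c) for dim in zip(*c)] if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters
```

Then the hope is that one evolved parameter set per cluster (loon-like birds in one bucket, robin-like in another) sidesteps the loon-vs-robin mismatch.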