AI and Music

There’s a fundamental question that we need to ask ourselves: what is the relationship between AI and our species? Should AI serve us, or the other way around? Or should we exist to serve each other? There is no way for us to know what we want from AI music systems without first answering this question.

I hold that any scenario where we serve AI without any reciprocity is unacceptable; the moment this becomes reality is the moment our species loses its agency and defers the shaping of its future to machine-logic gods that are external to our wants and desires. Thus, I believe that AI should exist to help humans strengthen their own agency, whether by serving us directly or through a symbiotic relationship with real reciprocity.

Given this, to me, the true value of the AI music systems developed recently lies in the degree and granularity of control that they give their users, and in the speed at which they can empower a user to fully realize their vision, no matter their level of expertise in making music.

Numerous forms of media, such as film and games, and the music industry too, are developing to the point where large studios are composed of multiple teams of people specializing in different aspects of a work of art. To me, this is not an ideal situation for the people behind the vision. A vision can be watered down by communication overhead, as naturally happens when large teams of people work toward a common goal.

This is what excites me about language-conditioned deep learning systems like MusicLM: natural language is a modality whose rich complexity can allow artists to more fully express what they want out of a tool. When the sliders you have access to in your DAW aren’t giving you the sound that you want, the language model will be able to read your text description and generate what you ask for.

Do I think that we’re there yet? No. At least, not fully. Although MusicLM deserves praise for a lot, such as its ability to generate coherent beat patterns and sequences and its high audio fidelity, it does not do a good job of generating singing. However, I have no doubt that multiple teams of AI researchers will work on improving these aspects.

There is the question of how these systems will impact artists who work on commission for others. Do I think that these AI systems will replace them? Not really. When the camera was invented, people still wanted paintings. There will still be demand for artists. Even if these systems did replace them, though, that would be a capitalism problem, not an AI problem.

In short, what I want from AI music systems is for them to bring us a step closer to a world where everyone is an auteur. I want these systems to let everybody build out their own dream instead of building someone else’s.