Critical Response: What do you (really) want from AI music generation?

The rise of powerful generative AI models such as ChatGPT and DALL-E has forced a re-evaluation of, and deeper introspection on, what it means to be human and what we want from AI systems. The recent release of MusicLM, an experimental generative AI music model, promises shifts within the music field. While this model may represent an exciting milestone for some, the looming question that must be considered is the purpose of such a system. In the MusicLM paper, the authors state that part of the broader impact of their system is to “extend the set of tools that assist humans with creative music tasks.” This raises two questions: where is the boundary between AI-generated music and human-generated music? In what scenarios do we want, and not want, AI-generated music?

AI Artwork
Generated with Midjourney using the prompt “a generative AI music model creating music.”

In some situations, we may value the humanness of a piece of music. Some works must be lived in order to be told. Many songs are motivated by human emotions of love and loss, joy and heartbreak. Eric Clapton’s song “Tears in Heaven” is an outpouring of his grief. Vivaldi captures the feelings of each season in his “Four Seasons” concertos. In the future, if an AI were able to generate an equally emotional song, what would be the point? Music connects us in a shared emotional state, but what does it mean if an AI generated that music? Where is the value in such music? However, there are also situations in which we may not care about the story or meaning behind a piece of music. This could be the elevator music in a hotel or the background music for a YouTube video. In these scenarios, would we care whether the music is AI-generated? Where do we draw the line between when it is meaningful to have human-generated music and when AI-generated music will do?

Beyond these questions of value and meaning lies the creative aspect of AI-generated music. The MusicLM authors are careful to point out that they analyze how much of the generated content is memorized. While humans creating music may be influenced by other works, they are also influenced by their lives and past experiences. These AI models, however, are trained only on data from large amounts of existing works, which calls into question the originality and creativity of the generated music. There are limitations to these systems. Even given all the data in the world, I’m not sure that an AI model could have created a work like Queen’s “Bohemian Rhapsody.” Perhaps the question to ask is not whether AI can be creative, but whether we should expect it to be creative. If we view AI-generated music simply as generative or remixed music, I think the expectations change dramatically. If we take away the comparison to human-generated music, then AI music occupies its own space. Listening to the samples generated by MusicLM, I find some of them, such as the accordion heavy metal, quite interesting and fun. Others sound repetitive and boring. The ones that attempt lyrical singing are a little strange and call into question the source of the voice. However, there are also techniques, such as the humming and whistling transformation, that are unique. Humans and AI both have their limitations. Just as it is difficult for a human to do all that AI can, it is difficult for AI to do all that a human can. What we want is not a replacement of human-generated music but the evolution of a new kind of music.

The evolution and advancement of AI technology is inevitable. Whether something is a good idea is often treated as secondary to whether it can be done. The current progress of generative AI models is startling, yet this area is still in its infancy and will only grow more advanced. There are so many open questions that it is crucial that we, as humans, continuously question and reflect on these ideas of want, need, and purpose in AI systems.