Critical Response #1
What do you (really) want from AI music generation?

by Marise van Zyl

Description of this reading response:
For this critical reading response, we had to reflect on MusicLM (or a similar system) and discuss what we would truly want from an AI system that could make music. MusicLM is a recently announced (Feb 2023) prompt-based model that generates musical audio from text. At the time of writing, MusicLM had not yet been released, but examples, a paper, and a dataset can be found here.

In true AI fashion, my response will be fashioned as a conversation between myself and ChatGPT (who better to ask about AI than an AI itself?).

Marise (M): ChatGPT, what is MusicLM?
ChatGPT (CG): "MusicLM is a machine learning model developed by OpenAI that generates music in a specific style or genre. It uses deep learning techniques to generate musical sequences that are coherent, harmonious, and expressive, while also capturing the essence of a particular musical genre."

M: So why would someone use MusicLM?
CG: "MusicLM can be used for a variety of purposes, including:
Music composition: MusicLM can be used as a tool for composing music in a specific genre, style, or mood.
Music generation for media: MusicLM can generate background music for movies, games, and other forms of media.
Music education: MusicLM can be used to teach music theory, composition, and arrangement by demonstrating how different styles of music are constructed.
Music research: MusicLM can be used to study and analyze various aspects of music, including harmony, rhythm, and melody.
Overall, MusicLM is a valuable tool for anyone interested in creating, understanding, or studying music."

M: Does MusicLM make better music than humans?
CG: "It is subjective to determine whether MusicLM generates "better" music than humans. Music is an art form that is highly subjective and personal, and what is considered good or better music can vary greatly from person to person.
MusicLM generates music that is harmonious and coherent, and it can effectively capture the essence of a particular musical genre or style. However, it lacks the emotional and creative expression that is unique to human musicians. While MusicLM can generate musical sequences that are technically sound, they lack the soul and personal touch that are essential to truly great music.
In the end, the quality of music generated by MusicLM depends on the input and parameters given to the model, and it is ultimately up to individual listeners to decide whether they prefer music generated by MusicLM or by human musicians."

M: That sounds interesting. Does that mean MusicLM will put musicians out of business?
CG: "No, MusicLM is not likely to put musicians out of business. While MusicLM is a powerful tool for generating music, it is not a replacement for human musicians. Music generated by MusicLM can be used as a starting point or as inspiration for music creation, but it lacks the emotional and creative expression that is unique to human musicians.
Music is a complex art form that involves much more than just technical skills, and it is the human touch that makes music truly special. People will always appreciate the creativity, passion, and emotional connection that live musicians bring to their performances. In fact, MusicLM has the potential to complement and enhance the work of musicians by providing them with new tools for composition, inspiration, and expression."

M: So, you're saying that this wonderful model that can make any music you want won't put human musicians out of business. What do humans have that MusicLM does not?
CG: "MusicLM is a powerful machine learning model, but it lacks certain qualities that are unique to human musicians:
Emotion: Music generated by MusicLM may be technically sound, but it lacks the emotional expression that is the hallmark of human musicians. Human musicians bring their own unique feelings, experiences, and perspectives to their music, which gives their music depth, meaning, and soul.

Creativity: Human musicians have the ability to create music that is original and unique. MusicLM can generate music that is harmonious and coherent, but it lacks the ability to think outside the box and come up with truly new and innovative musical ideas.

Improvisation: Human musicians have the ability to improvise and create music in real-time, responding to their musical environment and the audience. MusicLM can generate music based on a set of parameters, but it cannot respond to real-time stimuli or make creative decisions on the fly.

Physical interaction: Human musicians have a physical connection to their instruments and the music they are creating, which allows them to express themselves through the physical act of playing music. MusicLM generates music through mathematical algorithms, and it lacks the physical interaction and connection that are integral to human music making.

In short, while MusicLM is a powerful tool for generating music, it lacks the emotional, creative, and physical aspects of human musicians that make music a truly unique and special art form."

M: Final question, how can we use MusicLM creatively in the music-making process?
CG: "MusicLM can be used in a variety of creative ways in the music-making process:

Inspiration: MusicLM can be used as a source of inspiration for music creation. It can generate musical sequences that can serve as a starting point for musicians to build upon and develop into their own unique creations.

Composition: MusicLM can be used as a tool for composing music in a specific genre, style, or mood. This can be particularly useful for those who are less familiar with music theory or are looking to experiment with new musical ideas.

Collaboration: MusicLM can be used in collaboration with human musicians to generate musical sequences that can then be performed, recorded, and incorporated into musical compositions.

Music production: MusicLM can be used to generate musical elements, such as background music, sound effects, or ambient soundscapes, that can be used in music production and media creation.

Music education: MusicLM can be used in music education to demonstrate the structure and composition of different musical styles, which can help students gain a deeper understanding of music theory and composition.

In short, MusicLM has the potential to be a valuable tool in the music-making process, providing musicians and producers with new avenues for creativity, experimentation, and collaboration."

Going into this reading response, I was just going to write about MusicLM from my own perspective. I listened to the examples and noted that the acoustic instruments sounded pretty terrible. I was going to comment on audio quality and how, although this is a fun way to think about genre and other classification-type things, it's not really a musically sensitive way to create something. The orchestral instruments sound horrible and the weird quasi-linguistic lyrics are haunting at best. MusicLM learned a pattern and is spitting out what it knows. After my conversation (?) with ChatGPT, however, I have a whole new view of it. I think even AI itself can see what MusicLM is missing in terms of musicality. It opens up many doors for innovation and creation, yet is in itself not the end goal. It is not the end product, but the spark that lights the flame.

I do agree with ChatGPT that real musicians won't be completely replaced. At least not yet. I do think that previous genres and current-ish popular music could be rather easily replicated, at least to some extent. It is when it comes to innovation and creating new trends that humans still have the upper hand. ChatGPT itself says that it does not have a lot of information on 2021 and later. AI is trained on already existing data and thus can't really be expected to create novel ideas or sounds. It might combine old ones in interesting ways or allow human interaction to guide its 'innovation', but humans are still the ones setting the trends. In performance, our human propensity for making mistakes or doing something unexpected is also unique to our non-computer brains. There is a certain amount of unpredictability in human performance, and this is what makes it exciting. There is an energy which a machine can't reproduce. Sure, a computer DJ at a club or a computer violin as a soloist with an orchestra might do a much better job at performing accurately, but we want to see a real human there. We want to be there and share the experience with them.

I think MusicLM is great. It's a super fun tool that allows us to explore sounds that might only have been in our heads before and can lead to uniquely human compositions through the help of AI. I do, however, think there is a moral/ethical element to consider. Just as the rise of ChatGPT has caused concern amongst academic staff about students using it for writing assignments, so will musicians, teachers, and others start wondering if a piece was made using MusicLM. I think the main factor should be transparency. It should be clear what the machine did and what the human did. There are obvious patterns in life, and whether a machine or a human copies them is largely irrelevant. It is what comes after the pattern recreation that counts. How does one personalize the written piece or the music?

As a side note, and I'm sure this has already been done, but I want to see a collab between MusicLM and ChatGPT.


Ge, YikAI, ChAI, Celeste, ChatGPT, and Bruno