Rhythm and Algorithms: Exploring the Role of AI in the Music Industry

Jamye Molina
May 12, 2023

There has long been a debate about the intersection of music and technology. In such an artistic field, there may always be two sides to the coin.

And with the release of GPT-4, AI has reached a new level of sophistication: its multimodal capabilities have even more potential to revolutionize the industry.

I have a feeling this one is going to be quite controversial. Let's jump in.

Creating Lyrics

When ChatGPT first came out, many TikTok and YouTube creators tried to make song lyrics with it, and they got funny responses using prompts like:

"Write a Drake verse about how he doesn't like beans and chili."

Since then, several platforms built specifically to generate full songs have come out, such as Jarvis-Lyrics, which now uses GPT-4. You can input things like genre, title, and artist to tailor the result to your preferences.

Is it successful? Hard to tell since the testimonials on the Jarvis Lyrics website are generated by AI.

However, Malaysian YouTubers Steady Gang asked ChatGPT to write a song about Taiwanese pop artist Jay Chou, and the resulting parody shot to the 3rd spot on YouTube's "Top Songs."

"Write a song with Jay Chou as the topic."

“Can it be funnier?”

“Can you make it rhyme, hello?!”

Still, I think the success in both cases came from the hype of the AI itself rather than its lyrical intelligence.

Algorithmic Composition

For now, ChatGPT can only generate text output, so you'd need a musician or some other platform to actually play the music.

But Twitter user Jeffrey Emanuel tested its ability to create and modify chords with prompts like:

"Continue this guitar pattern:"

and

“Modify the following progression to be more like Bach:”

ChatGPT provided text-based chord charts that can then be played by a musician.

Does it work well? Musician Ezra Sandzer-Bell pointed out that while it was interesting to see the AI's ability to replicate the chord chart it was given, a skilled composer would know the difference between ChatGPT's musical progression and Bach's.

Text to MIDI to DAW

But can AI actually play the music? To better understand how AI can be used for music composition, let's take a look at the basis of digitizing music: MIDI, or Musical Instrument Digital Interface.

Why is MIDI important for AI? Well, for starters, the algorithms can be trained on large datasets of MIDI files (information about note pitches, velocities, timing, and other musical parameters) to learn patterns and structures in music. Plus, it can be used in combination with other types of data such as audio recordings and lyrics to create more sophisticated music-based AI systems.

Additionally, MIDI data can be used as input to machine learning models for tasks such as music transcription (converting audio recordings to MIDI data), genre classification, and music recommendation.
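To make that a bit more concrete, here is a minimal sketch of how MIDI-style note data might be represented and flattened into a text-like sequence that a model could learn from. The `Note` class, the `to_tokens` helper, and the `T`/`P`/`V`/`D` token format are all hypothetical, made up for illustration; real systems use their own encodings.

```python
from dataclasses import dataclass

@dataclass
class Note:
    pitch: int       # MIDI note number, 0-127 (60 = middle C)
    velocity: int    # how hard the note is struck, 1-127
    start: float     # onset time in beats
    duration: float  # length in beats

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_name(pitch: int) -> str:
    """Convert a MIDI note number to a name like 'E4' (using the 60 = C4 convention)."""
    return f"{NOTE_NAMES[pitch % 12]}{pitch // 12 - 1}"

def to_tokens(notes: list[Note]) -> list[str]:
    """Flatten notes into a token sequence ordered by onset time (hypothetical format)."""
    tokens = []
    for n in sorted(notes, key=lambda n: (n.start, n.pitch)):
        tokens += [f"T{n.start:g}", f"P{n.pitch}", f"V{n.velocity}", f"D{n.duration:g}"]
    return tokens

# A three-note melody outlining an E minor chord
melody = [Note(64, 90, 0.0, 1.0), Note(67, 80, 1.0, 0.5), Note(71, 85, 1.5, 0.5)]
print([pitch_name(n.pitch) for n in melody])
print(to_tokens(melody))
```

Once music looks like this, it is just a sequence of symbols, which is the kind of input language models are already good at handling.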

So, platforms like AudioCipher and Google's MusicLM are to music producers what generative image services like DALL-E 2 and Midjourney are to graphic designers. AudioCipher is a text-to-MIDI plugin, while MusicLM generates audio directly from a text description.

Once the MIDI data is transferred over to a DAW or Digital Audio Workstation, it can be edited, arranged, and mixed. For example, users can adjust the timing of individual notes, change the velocity of specific MIDI events, transpose notes to different keys or scales, and apply a variety of effects and processing to the MIDI data.
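As a rough illustration of what those edits mean under the hood, the sketch below applies three of them (transposing, changing velocity, and tightening timing) to a few notes stored as plain Python dictionaries. The function names and the dictionary layout are assumptions for this example only, not the API of any particular DAW.

```python
def transpose(notes, semitones):
    """Shift every pitch by a number of semitones, clamped to the MIDI range 0-127."""
    return [{**n, "pitch": max(0, min(127, n["pitch"] + semitones))} for n in notes]

def scale_velocity(notes, factor):
    """Make every note louder or softer, keeping velocity within 1-127."""
    return [{**n, "velocity": max(1, min(127, round(n["velocity"] * factor)))} for n in notes]

def quantize(notes, grid=0.25):
    """Snap each onset to the nearest grid step in beats (0.25 = a sixteenth note in 4/4)."""
    return [{**n, "start": round(n["start"] / grid) * grid} for n in notes]

# Three slightly off-grid notes, as if played by hand
notes = [
    {"pitch": 64, "velocity": 90, "start": 0.02},
    {"pitch": 67, "velocity": 80, "start": 0.98},
    {"pitch": 71, "velocity": 100, "start": 1.55},
]

edited = quantize(scale_velocity(transpose(notes, 3), 0.8))
for n in edited:
    print(n)
```

Each function returns a new list instead of changing the original, so you can always go back to the unedited take.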

So, essentially, by combining the power of several AI-powered tools, you can create music by converting text to MIDI and then importing the MIDI data into a digital audio workstation (DAW).

AI DAW

A new AI DAW called WavTool just came out using GPT-4 technology.

It features an AI chat assistant called Conductor that understands music theory and audio engineering concepts and can do everything from creating MIDI to writing beats to adjusting audio levels and sound devices, with the right prompts.

So while ChatGPT can produce chord progressions and melody ideas, its capabilities are restricted to generating text. In contrast, WavTool uses commands from GPT-4 to create musical instructions that can be executed by the DAW.

This text-to-music feature is something completely new and, so far, unique to WavTool.

Another noteworthy aspect of WavTool is that you can ask the AI why it made particular decisions. That helps you give feedback on how to improve its output and refine the project with ongoing prompts.

This makes working on song segments feel similar to collaborating with another person.

Musician Ezra Sandzer-Bell shared his prompt sequence for creating his first beats in a YouTube video that walks you through the entire process.

His prompt started with:

“Can you make me a simple Triad chord progression in E Minor?”

The result at this point was a one-bar loop repeating itself with basically no rhythm. But the key to success here, which Ezra quickly found out, is being persistent. You’re not going to get a fully composed song from one simple prompt. But if you stick with it and communicate what you want to change, add, or tweak, you can end up with a pretty good final product.

So he continued with follow-ups like:

“Can I have more variety in my chord progression?”; “Make each chord one bar long”; “I need a lofi melody”; “Add reverb and delay since we're doing lo-fi”; “EQ out some of the high end for the melody.”

One thing to point out here is that coming up with the prompts that make this process successful still requires quite a bit of knowledge about music production, and, might I say, talent!
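If "triad chord progression in E minor" sounds like jargon, here is a small sketch of the theory behind that first prompt. It builds the E natural minor scale and stacks every other scale note to form triads. The progression used (i, VI, III, VII) is just a common example picked for illustration, not the one WavTool produced, and the function names are made up.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]  # semitones above the root

def minor_scale(root):
    """Return the seven MIDI pitches of the natural minor scale starting at root."""
    return [root + step for step in NATURAL_MINOR_STEPS]

def triad(scale, degree):
    """Stack the 1st, 3rd, and 5th scale notes above `degree` (1-based), wrapping up an octave."""
    chord = []
    for offset in (0, 2, 4):
        idx = degree - 1 + offset
        chord.append(scale[idx % 7] + 12 * (idx // 7))
    return chord

E3 = 52  # MIDI note number for E3 (60 = C4 convention)
scale = minor_scale(E3)

# Hypothetical progression: i - VI - III - VII (Em, C, G, D)
for degree in [1, 6, 3, 7]:
    chord = triad(scale, degree)
    print(degree, [NOTE_NAMES[p % 12] for p in chord], chord)
```

Knowing even this much theory makes it easier to tell the assistant what to change, which is the point Ezra's prompt sequence illustrates.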

Creating AI Music Videos

Linkin Park recently used AI to create the music video for their song “Lost.”

To try something similar, you can find a seed image or generate your own on a platform like Lexica or Stable Diffusion. Once you have your seed image, you can use Kaiber.AI (paid) to animate it and generate a synchronized video to go with your music.

First, upload an image, then trim the audio if desired, and type the prompt into Kaiber. Sharp Startup recommends using a prompt that involves motion words. His example:

"Astronaut dancing and jumping in the street."

Then click continue and type in the style of art that you want your video to have, like:

"Vintage 90s anime."

You can then adjust video settings such as zoom, click generate, and wait for the video to finish.

Other possibilities

In addition to the discussion centered around ChatGPT's role in music creation, it is worth acknowledging its potential to help emerging artists with tasks like marketing, booking gigs, managing schedules, etc.

Emerging artists are like small businesses, and ChatGPT can offer valuable insights into which marketing strategies are most effective, especially for self-managed artists.

For example, YouTuber Jimmy Make Music used prompts like:

“What’s the best social media for music makers?”

or

“How can I get a hip-hop song on a Spotify playlist”

Tatiana Cirisano from MIDiA Research shared a few useful prompt suggestions for this use case as well:

"Generate creative ideas for a virtual listening party for a hip-hop artist’s new album."

"As a music artist, what are some creative ways to encourage fan-generated content?"

"Over the next 5 days, I want to finish writing a new song, generate a marketing plan to release it and add 50 TikTok followers. Give me a schedule for reaching my goals."

The Groove Cartel also points out the potential of ChatGPT in music training: analyzing performances and offering feedback, as well as creating personalized practice routines.

Plus, now that GPT-4 can interpret images, it becomes possible to generate captions for music videos, analyze album art for marketing and branding ideas, or even analyze concert photos for insights into audience behavior.

Conclusion

So, it's not too far-fetched that GPT-4, or any other large language model, can generate musical compositions. Language models manipulate and produce sequences of text, and music can also be represented as a sequence of symbols that can be processed in a similar way.

However, whether or not it can create top-notch music is, and probably will be for the near future, a highly debatable topic.

And there are some areas where it may be a bit less controversial. For example, as Ryan Long discussed on Joe Rogan's podcast on ChatGPT Music, generative AI could be very useful in automating the creation of stock music, or short jingles intended for commercial use on, let's say, a TV show or movie.

But, even in such an artistic industry, musicians and music producers may benefit from staying ahead of the AI curve and learning how it can enhance their workflows. After all, jobs in the music industry, and in any industry for that matter, may trend toward AI prompt engineering and management!