Everyone's suing AI over text and pics. But music? You ain't seen nothing yet
When record labels go bananas over brief samples, good luck generating tracks built from today's culture
Comment Generative AI models are best known for knocking out text and pictures, though they're also getting some way with audio. Music is particularly tricky, arguably: as humans, we can be relatively forgiving with machine-imagined imagery and some forms of writing, but perhaps not so much with audio. People can be very picky about the sounds they like listening to.
That's not the only difficulty facing AI-made music: there's also copyright law, which artificial intelligence in general is starting to run into more and more across all forms of media.
Huge amounts of data are required to train these systems to reproduce common patterns and behaviors. Startups and Big Tech alike have scraped huge swathes of the internet, raiding news publishers, web forums, books, picture-sharing sites, and more for content. Yet they're more careful when it comes to using music. It's not hard to see why.
Record labels are fiercely litigious. In October last year, a bunch of music publishers led by Universal Music Group (UMG) sued AI upstart Anthropic, accusing it of stealing lyrics. And that's only the words – we all know what happens when samples, or what sounds like samples, are used in tracks without permission. Lawsuits are filed and royalties demanded.
If you're making music, and basing it off other people's work, you need to get that copyright cleared. And we imagine AI makers that feed today's music into their models during training will have to go through this too, somehow.
Imagine the trouble ML developers will land themselves in if they scrape copyrighted music without permission and create chart-topping hits that contain familiar elements, much as most AI-made content can be traced back to some portion of its training data. AI can now create award-winning art, so we guess music is next.
The track Heart On My Sleeve, which was generated using AI and copied the voices and musical styles of rapper Drake and Canadian musician The Weeknd, was made by a mysterious producer known as Ghostwriter and went viral. UMG promptly stepped up again, demanding it be removed from streaming platforms. It's clear neural networks can create convincing pop music, but as with art and writing, if the output is too close to the original training data, copyright claims will fly and users may hesitate to use the technology for fear of litigation.
Some AI developers, wary of legal battles with record labels, may even decide to train their models on music they themselves have created or commissioned, or have permission to use, and it will be interesting to see how those neural networks' output compares to that of networks trained on a wider set of audio that may or may not have been harvested lawfully.
Generally speaking, though, AI makers believe training their models on copyrighted material is fair use. They also argue the output of large language models is transformative, meaning it adds something new and isn't a direct copy of, or substitute for, original works. It's safe to say not everyone is convinced by those points.
Powerful models capable of creating coherent content are increasingly being accused of plagiarizing intellectual property. A lawsuit filed by the New York Times claimed OpenAI's ChatGPT can, among other things, recall passages of news articles verbatim, giving people an easy way to bypass the title's paywall.
Similarly, illustrators and artists have shared images generated by Midjourney that replicate movie stills, as shown below:
I consider this a smoking gun for Midjourney's flagrant copyright infringement. A six-word prompt can replicate a Dune still nearly 1:1 every time. These aren't variations, it's the same prompt run repeatedly.
Try it yourself. Merry Christmas Midjourney. pic.twitter.com/2wpeTwxS0Q
— Reid Southen (@Rahll) December 24, 2023
It's likely record labels won't have to prove copyright infringement as explicitly as other publishers have done for text and images. Musicians and their labels have sued each other for less blatant rip-offs, after all; a similar chord progression or guitar riff, or a brief sample, is enough to launch a court case. So where does this leave AI music generation?
The threat of lawsuits means that those working to build models capable of generating music must have deep pockets to fend off music publishers, or compensate artists for explicit permission to use their work. Google, for example, has negotiated licensing agreements with a select group of singers and rappers to train its Lyria AI model.
This introduces other issues. Is it fair that copyright laws impede small startups from competing against Big Tech? How can musicians and developers, large and small, work together to advance AI ethically? And if synthetic music does take off one day, will it be commercially viable, considering that copyrighting AI content is a legal gray area that has yet to be resolved? ®