AI is often seen as cold, calculating machinery, devoid of any warmth or humanity. One way to make it more relatable and human-like could be to encourage it to take part in human activities, like making music.
Making tunes with AI is one of the geekier ways to compose, and it has been around since the 1980s. It remains a thriving area of research with dedicated academic conferences, and with the recent boom in machine learning, the quality of music created by AI seems to be getting better too.
Researchers from the University of Toronto in Canada have trained recurrent neural networks to make an all-singing, all-dancing AI. A paper submitted to ICLR 2017, an academic machine-learning conference, shows that artificially intelligent software can not only process data, it can create art, too.
Machines don’t have the wild, unlimited creative streak humans do, however. For them, the process of making music is analytical rather than emotional: they don’t understand music, but they can do maths.
In music theory, a scale is a sequence of notes that follows a particular pattern of intervals, and the notes change depending on the starting note. Feeding the neural network a particular scale gives the system a series of notes it can choose from to make a melody.
“The input is the choice of scale (for example, C major), as well as what we call a 'profile.' A profile is a 'general feel' of the songs,” the research team explained to The Register.
Once the starting note is fixed, only a few notes can be played according to the scale. Playing over different scales creates particular sounds that are easy on the ear, such as jazz or blues.
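To see how a scale pins down the available notes, here is a minimal sketch. The note names and the whole-step/half-step pattern of a major scale are standard music theory; the function itself is illustrative and not taken from the paper.

```python
# How a scale constrains note choice: twelve chromatic notes, but a
# fixed interval pattern from the starting note selects only seven.

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole/half-step pattern of a major scale

def major_scale(root):
    """Return the seven notes of the major scale starting at `root`."""
    i = NOTES.index(root)
    scale = []
    for step in MAJOR_STEPS:
        scale.append(NOTES[i])
        i = (i + step) % 12
    return scale

print(major_scale("C"))  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
```

Change the root and the same pattern yields a different set of permitted notes, which is exactly the "input scale" the researchers describe feeding to their system.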
Although the neural network has been encoded with the rules of music theory, it can’t make something out of nothing. It has to learn by example, so the team analysed the chords in 100 hours of pop music to pick up common patterns of notes and melodies.
The neural network then generates music by weighing up the probabilities of which note should come next, according to the scale it’s working in. Although there are 12 notes in an octave, usually only six or seven of them belong to a given scale.
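The idea of weighing up probabilities over the notes of a scale can be sketched as weighted sampling. The probabilities below are made up for illustration; in the real system they would come from the trained recurrent network, not a hand-written table.

```python
import random

# Toy probability-weighted note choice over the C major scale.
C_MAJOR = ["C", "D", "E", "F", "G", "A", "B"]

def next_note(probs, rng=random):
    """Sample the next note, weighted by per-note probabilities."""
    weights = [probs[n] for n in C_MAJOR]
    return rng.choices(C_MAJOR, weights=weights, k=1)[0]

# Hypothetical probabilities a model might assign after seeing a melody.
probs = {"C": 0.30, "D": 0.10, "E": 0.25, "F": 0.05,
         "G": 0.20, "A": 0.05, "B": 0.05}

melody = [next_note(probs) for _ in range(8)]
print(melody)
```

Every sampled note stays inside the scale, which is why the output sounds "in key" even though the choices are random.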
First, the input scale is chosen, and the initial layer of the neural network decides what key it should play the music in. The possible range of notes is already known from the scale, and the system chooses the combination by learning the patterns in pop music.
The second layer decides how long each key should be pressed for. Unlike jazz, which is trickier to play and more unpredictable, pop music has a repetitive structure that is easier to analyse and reproduce.
A third layer picks the chords to go along with the melody, and the fourth is for drums. All layers work simultaneously to give an output combination of notes at specific timings to create a song that sounds pretty convincing.
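The four-layer setup described above can be sketched structurally as follows. Each "layer" here is a placeholder function returning fixed values; in the real system each one is a recurrent neural network trained on pop music, and the function names are assumptions for illustration.

```python
# Structural sketch of the four-layer generation idea: melody notes,
# durations, chords, and drums, combined into timed events.

def key_layer(scale):
    # Layer 1: pick melody notes from the input scale (placeholder).
    return ["C", "E", "G", "E"]

def duration_layer(melody):
    # Layer 2: how long each note is held, in beats (placeholder).
    return [1, 1, 2, 4]

def chord_layer(melody):
    # Layer 3: chords to accompany the melody (placeholder).
    return ["Cmaj", "Cmaj", "Gmaj", "Cmaj"]

def drum_layer(n_beats):
    # Layer 4: one drum hit per beat (placeholder).
    return ["kick", "snare"] * (n_beats // 2)

def generate_song(scale):
    melody = key_layer(scale)
    durations = duration_layer(melody)
    chords = chord_layer(melody)
    drums = drum_layer(sum(durations))
    # Combine the layers into (note, duration, chord) events.
    return list(zip(melody, durations, chords))

print(generate_song(["C", "D", "E", "F", "G", "A", "B"]))
```

The point of the sketch is the division of labour: each layer answers one question per time step, and the combined output is the song.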
Hang Chu (a PhD student) and Raquel Urtasun and Sanja Fidler (both associate professors) all work at the University of Toronto as researchers in computer vision, but became intrigued to see if the underlying principles of good pop music could be captured in algorithms.
“Plus, wouldn’t it be fun to have an AI-Pop music channel on Pandora or Spotify?” they asked.
Humans like to boogie when they hear good music, so AI should too, right? Well, now they can. To go along with AI music, the team created something called “neural dancing and karaoke” with a computer stickman.
The stickman learns common dance moves by watching an hour of footage from the game Just Dance. Another layer is added on top of the music layers in the neural network and generates a single move for each beat.
With so little training data, the stickman is pretty limited and dances like a dad at a wedding. It only seems able to flap its arms around whilst rotating its body at various angles. The lyrics aren’t a strong point either: it doesn’t generate anything deep or meaningful, and occasionally produces sentences that don’t make sense.
The team collected only 50 hours’ worth of lyrics from the internet and paired them with an hour of music from Just Dance, so the system’s vocabulary is small: it only registered words that appear more than four times.
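The vocabulary cut-off described above is a common trick in text modelling: count every word in the corpus and keep only the frequent ones. A minimal sketch, with a stand-in corpus rather than the team’s actual data:

```python
from collections import Counter

# Keep only words seen more than four times in the lyric corpus.
MIN_COUNT = 5  # "appears more than four times"

def build_vocab(lyric_lines):
    """Count words across all lines and keep the frequent ones."""
    counts = Counter(word for line in lyric_lines for word in line.split())
    return {word for word, n in counts.items() if n >= MIN_COUNT}

corpus = ["la la la love", "love love love love", "obscure word"]
print(build_vocab(corpus))  # {'love'}
```

Rare words are dropped entirely, which keeps the model small but also explains why the system’s lyrical range is narrow.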
The same vocabulary is applied to the karaoke too. The neural network sings songs and generates lyrics describing an input image. It sings over a limited range, in a strange robot-like voice.
Although some of the features, like the dancing, are pretty crude for now, there’s plenty of room for improvement if something like an AI karaoke bot is desirable.
“Our AI pop composer has a few adjustable parameters which would allow users to specify the kind of music to be composed in a very intuitive way. So given any mood you are in, you could just simply tell your AI-Karaoke machine what kind of songs to generate and play,” the researchers said.
Fine-tuning computers to do human things is a big interest for many researchers working in AI – sometimes purely out of curiosity, sometimes as a way to benchmark progress. If a machine can perform a specific task as well as or better than humans, it’s seen as impressive.
Music is just one of those challenges, and it seems like a popular choice. Google Brain’s Magenta project has open-sourced much of its code on its GitHub profile, and just announced a new algorithm that uses reinforcement learning to train systems to predict notes in music.
In France, a song called Mr Shadow, composed by Sony’s AI composer, Flow Machines, with the help of Benoît Carré, a pop musician, frequently plays on FIP, a popular radio station.
Mr Shadow is just over three minutes long, with lyrics that make more sense, and it sounds like it could have been made by a human. Fiammetta Ghedini, a spokesperson for Sony’s Computer Science Laboratory in Paris, told The Register that it was “natural” for AI to explore music.
“Many human activities are already performed by AI and many more will be performed in the future," said Ghedini. "It’s just natural to explore what AI can do with music. It is also natural from a technical point of view, as the algorithms we use are very well adapted to generating sequences.”
Although AI is moving into music, it’s not going to take people’s jobs away, Ghedini said. She compares it to photography. “Photography did not mean that art as a human activity was dead, but just a tool that opened up new creative possibilities, both within photography and art.”
Sony also released Daddy's Car, another song composed by AI in the style of The Beatles.
Although the songs are just imitations, one day it may be possible to ask an AI to create songs influenced by old bands. The music could sound remarkably similar, and might even be sung in the lead singer’s voice, breathing new life into old music. It would give people a new way to be creative and deepen the relationship between human and machine.
You can listen to AI pop music and watch the dancing stickman in action here. ®