Oomm-tsss, oomm-tsss, Oomm-tsss, oomm-tsss... it's an AI beatbox

Press record, make some noise into your mic, press play, voila – all in your browser

Vid AI can now beatbox for you for hours on end using your voice, if you're into that kind of thing.

Nao Tokui, a visiting associate professor at Kyushu University in Japan and CEO of Qosmo, an AI and music startup, has developed a neural-network-based system that takes about 20 seconds of any sound and produces a custom drum kit from it, then automatically sequences rhythms using those utterances and noises.

Any snippet of audio can be used as input, from your own voice to improvised percussion. In a video demo of the JavaScript-based code, Tokui gently slaps his cheek, and flicks a plastic bottle. The sounds are recorded by his computer’s microphone, and fed into the software to generate a rhythm from the audio.

Whatever's recorded by the code is automagically split and assigned to the instruments that make up the virtual drum kit, such as the kick drum, snare, hi-hat, and tom-toms. After all this, the model strings together combinations of the kit's components into a sequence to produce a loop that you can bop your head to.

The project, dubbed Neural Beatbox, is made up of two parts, Tokui explained. The first part uses a convolutional neural network to study spectrograms of the recorded sound and break it up into individual instruments. The second part uses a recurrent neural network to generate the rhythms. It’s based on a pre-trained model called DrumsRNN built by Google's Magenta team.
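To make the first stage concrete, here is a minimal Python sketch of the split-and-assign idea. It is not Tokui's code, which is JavaScript and uses a trained convolutional network on spectrograms. As a stand-in, this version splits a recording wherever the loudness rises above a threshold, then ranks the pieces by zero-crossing rate (a crude proxy for brightness) and hands the dullest to the kick and the brightest to the hi-hat. The function names, the frame size, the threshold, and the three-slot kit are all assumptions made for illustration.

```python
import math


def split_on_onsets(samples, frame=256, threshold=0.1):
    """Cut a mono signal into segments wherever frame loudness (RMS)
    rises above `threshold` and later falls back below it."""
    segments = []
    start = None
    for i in range(0, len(samples), frame):
        chunk = samples[i:i + frame]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        if rms >= threshold and start is None:
            start = i                          # a sound begins here
        elif rms < threshold and start is not None:
            segments.append(samples[start:i])  # the sound has ended
            start = None
    if start is not None:                      # recording ended mid-sound
        segments.append(samples[start:])
    return segments


def zero_crossing_rate(segment):
    """Fraction of adjacent sample pairs that change sign."""
    crossings = sum(
        1 for a, b in zip(segment, segment[1:]) if (a < 0) != (b < 0)
    )
    return crossings / max(len(segment) - 1, 1)


def assign_to_kit(segments, slots=("kick", "snare", "hi-hat")):
    """Heuristic stand-in for the CNN classifier: rank segments from
    dull to bright and spread the kit slots across that ranking."""
    if not segments:
        raise ValueError("no sounds detected in the recording")
    ranked = sorted(segments, key=zero_crossing_rate)
    kit = {}
    for idx, slot in enumerate(slots):
        pos = round(idx * (len(ranked) - 1) / max(len(slots) - 1, 1))
        kit[slot] = ranked[pos]
    return kit
```

A learned classifier can pick up on far more than brightness, which is presumably why the real system looks at spectrograms rather than a single number per sound.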

“Once the drum kit is ready, the RNN model starts generating new rhythm patterns based on a-half-bar-long 'seed' pattern,” Tokui told The Register this week.

"For this part I copied a large chunk of source code from Tero Parviainen's Neural Drum Machine. It starts with a very simple seed pattern containing only one kick drum and one hi-hat, and every two bars, the RNN model generates the next two bars using slightly more complex seed patterns."

The model isn’t super complicated compared to more recent projects. It all runs in Google's Chrome browser, so you can try it out right here if you have Chrome installed.
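The seed-and-continue loop Tokui describes can be sketched in a few lines of Python. This is an illustration only: the real project calls Magenta's pre-trained RNN from JavaScript, whereas `placeholder_model` below is a hypothetical stand-in that repeats the seed and sprinkles in random extra hits. The 16-steps-per-bar grid, the seed length, the hit positions, and the choice to feed each two-bar output back in as the next seed are all assumptions.

```python
import random

STEPS_PER_BAR = 16  # assumption: a 16th-note grid
INSTRUMENTS = ("kick", "snare", "hi-hat", "tom")


def make_seed(length=STEPS_PER_BAR // 2):
    """A very simple seed: one kick and one hi-hat, nothing else."""
    seed = [set() for _ in range(length)]
    seed[0].add("kick")
    seed[length // 2].add("hi-hat")
    return seed


def placeholder_model(seed, n_steps, rng, density=0.15):
    """Stand-in for the RNN (not Magenta's model): loop the seed over
    n_steps and add each instrument at random with chance `density`."""
    out = []
    for i in range(n_steps):
        hits = set(seed[i % len(seed)])
        for inst in INSTRUMENTS:
            if rng.random() < density:
                hits.add(inst)
        out.append(hits)
    return out


def generate(rounds, rng):
    """Every two bars, produce the next two bars from the current seed,
    then reuse that output as a slightly busier seed for the next round."""
    seed = make_seed()
    song = []
    for _ in range(rounds):
        two_bars = placeholder_model(seed, 2 * STEPS_PER_BAR, rng)
        song.extend(two_bars)
        seed = two_bars
    return song


if __name__ == "__main__":
    for step, hits in enumerate(generate(2, random.Random())):
        print(step, sorted(hits))
```

Because each round starts from the previous output and only adds hits, the pattern gets denser over time. That is one guess at what "slightly more complex seed patterns" could mean in practice, not a description of how the browser demo does it.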


People have different opinions and tastes in music. Some people hate the idea of AI making music, and believe that everything that comes out of a neural network is nonsense, while others think it’s an interesting area to pursue and doesn’t sound too bad.

“I feel a bit frustrated with the current research direction of AI music generation system,” Tokui told El Reg.

"The outputs of those researchers tend to sound very homogenous, even though there has been massive progress in methodologies in the last decade. I believe it is partially due to the fact that they all use MIDI synthesizers. It is understandable because their goal is not to make interesting or novel music itself, but to propose novel algorithms to generate convincing music.

“Technically, there is nothing really new in this project, and the output of this system will never sound like Bach or The Beatles. Here I deliberately took a different approach. My intention here is to make interesting, weird, and eccentric novel rhythms for music that a human composer might not think of.” ®
