Are you a bad dancer? Do you want to pirouette like a ballet pro? Don’t worry, you can fake it until you make it – with the help of artificial intelligence.
Some clever cookies at the University of California, Berkeley, in the US have developed machine-learning software that uses generative adversarial networks (GANs) to copy the moves of professional performers and map them onto anyone’s face and body. Thus, you can make anyone jig, cut a rug, or twirl like a pro, based on the movements of a stranger.
“Given a video of a source person and another of a target person, our goal is to generate a new video of the target person enacting the same motions as the source,” the researchers wrote in a paper that appeared on arXiv this week.
Here’s a video of the results, and they’re pretty good in this vulture's opinion...
Before the boogieing can be transferred from the skilled source to the clumsy target, there is an intermediary step. The researchers use a trained neural network to convert the source body's poses and movements into animated stick figures matching those gestures. This step is required because it isn't possible to perform a direct one-to-one mapping from the source to the target. Instead it's easier to simplify the source body into lines, and then use those to manipulate the target's body into new poses.
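For the curious, here is a minimal Python sketch of the stick-figure idea. It is not the researchers' code. It assumes a pose detector has already produced 2D keypoints for a frame; the joint names, the `SKELETON` list, and the `render_stick_figure` helper are all made up for illustration.

```python
# Illustrative only: joint names and skeleton layout are assumptions,
# not the keypoint format used in the paper.
SKELETON = [
    ("head", "neck"),
    ("neck", "l_hand"),
    ("neck", "r_hand"),
    ("neck", "hip"),
    ("hip", "l_foot"),
    ("hip", "r_foot"),
]


def draw_line(canvas, p0, p1):
    """Mark pixels along the segment p0 -> p1 using linear interpolation."""
    (x0, y0), (x1, y1) = p0, p1
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        x = round(x0 + (x1 - x0) * i / steps)
        y = round(y0 + (y1 - y0) * i / steps)
        # Ignore anything that falls outside the frame.
        if 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
            canvas[y][x] = 1


def render_stick_figure(keypoints, width, height):
    """Turn a dict of joint name -> (x, y) into a binary stick-figure image."""
    canvas = [[0] * width for _ in range(height)]
    for a, b in SKELETON:
        # A limb is drawn only if both of its joints were detected.
        if a in keypoints and b in keypoints:
            draw_line(canvas, keypoints[a], keypoints[b])
    return canvas


if __name__ == "__main__":
    pose = {
        "head": (4, 0), "neck": (4, 2), "hip": (4, 5),
        "l_hand": (1, 3), "r_hand": (7, 3),
        "l_foot": (2, 8), "r_foot": (6, 8),
    }
    for row in render_stick_figure(pose, 9, 9):
        print("".join("#" if px else "." for px in row))
```

Note that a limb simply disappears when one of its joints is missing, which hints at why a bad pose detection upstream can spoil the final image.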
Armed with frames of the stick figures, and a video of the target, the model can begin meddling with the target's appearance to make them move like the stick figure, using an image-to-image translation model. The target subject was filmed for about 20 minutes at 120 frames per second to collect training data, we're told.
The image-to-image model is a generative adversarial network made up of a generator and a discriminator. It takes pairs of images – a video frame of the stick figure and the corresponding frame of the target – as inputs. The generator produces an image of the target in the same position as the stick figure, and the discriminator tries to determine whether the image is real or fake. Over time, as the two battle each other, the generator comes up with images realistic enough to fool the discriminator.
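The tug-of-war between generator and discriminator can be sketched with the textbook GAN losses. This is an assumption-laden illustration in plain Python: the paper's actual objective may well differ, and the function names below are hypothetical. Here `d_real` and `d_fake` stand for the discriminator's estimated probability that a real frame and a generated frame, respectively, are genuine.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for a single probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def discriminator_loss(d_real, d_fake):
    """Low when real frames score near 1 and generated frames score near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake):
    """Low when the discriminator is fooled into scoring a fake near 1."""
    return bce(d_fake, 1.0)


if __name__ == "__main__":
    # A discriminator that is not fooled: the generator's loss is high.
    print(discriminator_loss(0.9, 0.1), generator_loss(0.1))
    # A discriminator that is fooled: the generator's loss is low.
    print(discriminator_loss(0.9, 0.9), generator_loss(0.9))
```

In training, the two losses are minimized in alternation, so each network's improvement makes life harder for the other.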
These images are then combined into a video that appears to show the target is a good dancer. There is also a separate training system to deal with the position of the face of the target to make the videos look as realistic as possible. It’s no good if the body looks believable but the face is blurry.
Here the researchers use another GAN to place the target’s face using the head of the stick figures.
The results are pretty good for a first attempt, but not perfect. Sometimes small details such as the hands or feet can be a little blurry. Sometimes the arms appear smudged. An observant viewer should be able to tell that something's not quite right, and that some of the material has been manipulated. It’s also a little unnerving that the expressions on the dancers' faces are pretty deadpan throughout the videos.
Errors are more likely to occur if there are “missing or incorrect keypoint locations from pose detection,” the paper explained. The stick figures can’t quite capture all the different positions, especially if the dance is particularly fast or has fancy footwork, as in ballet.
There are fears this sort of technology can be abused to create fake videos or images that trick people into believing stuff that never really happened. Remember when people on Reddit started sticking the faces of actresses and models onto the bodies of smut stars? The doctored content, dubbed deepfakes, was eventually banned from the message boards. However, it highlighted the potential dangers of using AI to create and spread false information.
We can imagine someone weaponizing this latest software to make someone appear to perform an illegal act, such as snorting coke, throwing a punch at a stranger, and so on.
The developers of the dancing AI – Caroline Chan, Shiry Ginosar, Tinghui Zhou, and Alexei A. Efros – were not available for immediate comment. We'll let you know if they get back to us with answers to our questions. ®
Updated to add
"We have absolutely thought about this before this project. Fake content creation is a huge concern these days and our group is actively working in parallel on multiple projects geared towards detecting fake video and still image content. As a community it is important to us to both advance the state of the art in content creation and be able to separate fake from real content with high confidence," Shiry Ginosar, co-author of the paper and a PhD student at UC Berkeley, told The Register.
"One thing to note is that the movie industry has been perfecting techniques for manipulating reality with special effects. One of the earliest successful attempts that I can remember is Forrest Gump. Since then many movies have used various techniques for video manipulation and content creation. Until recently, these movies required a huge amount of manual editing and effort. Generative models like our own can enable content creators to achieve similar results with less effort and budget. Thankfully, since these new approaches essentially work by learning a model of what real data looks like, they are also very good at detecting fake content that was manipulated in any way or created from thin air," she added. ®