Generative adversarial networks (GANs) are a brilliant idea: pit two neural networks against each other to get a machine to generate completely new, realistic-looking images. But in practice they are notoriously difficult to train and deploy, as one engineer told El Reg.
Jason Antic, a deep learning engineer, runs a tiny two-person operation to develop AI tools that touch up old black-and-white photos with a fresh splash of colour. He then licenses out the machine learning software to genealogy companies like MyHeritage, where users might want to share restored photos among their family members online. All commercial code is based on a free, open-source version that Antic helped build called DeOldify.
He experimented with GANs for the DeOldify tool at first. To teach the model what colours to apply to specific regions of an image, he built up a training dataset with millions of photos. One half of the data simulated old images, where the colour had been stripped away and pockmarks of noise were added. The other half was made up of the original, full-colour photos.
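As a rough illustration of how such a before-and-after pair might be built, here is a minimal Python sketch. It is not DeOldify's code: the function names, the salt-and-pepper style of noise and the 5 per cent noise rate are all assumptions made for the example, since Antic did not describe how the degradation was done. The greyscale conversion uses the standard Rec. 601 luminance weights.

```python
import random

def to_grey(pixel):
    """Collapse an (r, g, b) pixel to one luminance value (Rec. 601 weights)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def simulate_old_photo(image, noise_fraction=0.05, seed=0):
    """Return a grey, noise-speckled copy of `image` (rows of (r, g, b) tuples).

    The noise model is an assumption for illustration: each pixel is
    replaced by pure black or pure white with probability `noise_fraction`.
    """
    rng = random.Random(seed)
    degraded = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < noise_fraction:
                new_row.append(rng.choice((0, 255)))  # a "pockmark" of noise
            else:
                new_row.append(to_grey(pixel))        # colour stripped away
        degraded.append(new_row)
    return degraded

def make_training_pair(image, **kwargs):
    """Pair the simulated old photo (input) with the original (target)."""
    return simulate_old_photo(image, **kwargs), image
```

With the noise switched off, a one-row image of a pure red and a pure green pixel, `[[(255, 0, 0), (0, 255, 0)]]`, degrades to `[[76, 150]]`, while the untouched colour original is kept as the target.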
Shown the before and after stages, the GAN should learn how to sharpen and colour people's photos. But it didn't always work out. Sometimes body parts that should be flesh-coloured would come out looking a purplish grey, something Antic called "zombie skin".
Generator vs Discriminator
GANs are notoriously unstable and tricky to train, Antic explained. The goal is to get the generator to create fake images that are convincing enough to trick the discriminator into thinking that they're real. In DeOldify's case, the generator has to craft a coloured photo that's realistic-looking enough for the discriminator to believe it's a genuine photo like the ones it's seen in the training data.
At first, the generator is terrible and the discriminator can easily tell it has done a bad job. Over time, however, with more training, the generator manages to fool the discriminator. Exactly how much training is needed is difficult to judge, and developers often don't know when to stop training their models without going through a process of trial and error.
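The tug-of-war can be written down as a pair of loss functions. The sketch below uses the textbook cross-entropy formulation of the GAN objective, with the common "non-saturating" generator loss; it is an assumption that this resembles anything DeOldify used, as the losses in Antic's models were not described. Here `p_real` and `p_fake` are hypothetical names for the probability the discriminator assigns to a genuine photo and to a generated one being real.

```python
import math

def discriminator_loss(p_real, p_fake):
    """Low when real photos are scored near 1 and generated ones near 0."""
    return -(math.log(p_real) + math.log(1.0 - p_fake))

def generator_loss(p_fake):
    """Low when the discriminator scores a generated photo near 1."""
    return -math.log(p_fake)

# Early in training the discriminator is rarely fooled...
early = generator_loss(0.1)
# ...later the generator's fakes are scored as a coin toss.
later = generator_loss(0.5)
```

The two losses pull in opposite directions: anything that lowers `generator_loss` by pushing `p_fake` up raises `discriminator_loss`, which is why neither number settles into the steady decline seen when training a single network.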
Antic told us: "The big difference with GAN-based models and other types of machine learning algorithms is [that with the others] the stopping point is a lot clearer. You'll typically look at where the loss function is at a minimum and where performance plateaus.
"But with GANs, it's so difficult to tell because it randomly gets worse and then better again and you can't really figure out how or why very easily. There is no really good indication of when you've trained it sufficiently long enough; the best way of assessing your model is to just look at the images themselves."
Trying to fix the model by twiddling around with the parameters is another challenge. Changing one property of one neural network has unpredictable effects on the other. "It's a bit like chasing a phantom or playing a game of whack-a-mole," Antic said.
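The contrast Antic draws on stopping points can be made concrete with the kind of rule often used for conventional models: stop once the loss has gone a few epochs without improving, and keep the best epoch. The Python sketch below is a generic illustration, not anything from DeOldify; the function name, the `patience` parameter and both loss curves are made up for the example.

```python
def early_stopping_epoch(losses, patience=3):
    """Return the best epoch seen before `patience` epochs pass without improvement."""
    best_epoch = 0
    best_loss = float("inf")
    for epoch, loss in enumerate(losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # the loss has plateaued, so stop training here
    return best_epoch

# A well-behaved curve: falls, then flattens out.
smooth = [1.0, 0.6, 0.4, 0.35, 0.36, 0.37, 0.38]
# A GAN-like curve: gets worse, then better again.
erratic = [1.0, 0.7, 0.9, 1.2, 1.1, 0.5, 0.8]

print(early_stopping_epoch(smooth))   # 3
print(early_stopping_epoch(erratic))  # 1
```

On the smooth curve the rule stops at the minimum. On the erratic one it gives up at epoch 1 and never sees the better value at epoch 5, which is roughly why a plateau test is little help when a model "randomly gets worse and then better again".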
All sorts of things can go wrong, and you either have to keep fiddling around with the model or just retrain it altogether. Unfortunately, this process consumes more time and computational resources, and it isn't something that all companies can afford. Antic said he has a workstation with four GPUs that sucks up so much electricity that he has to be careful not to blow a fuse at home.
The commercial code Antic provides for MyHeritage doesn't use GANs because they're too inconsistent and unpredictable. Instead, he uses other machine-learning algorithms that are simpler to control and provide better results.
Antic still believes that GANs are the future, however. "They replace what is essentially hardcoded by engineers, and that is a good step in the right direction. There are definitely ways to get them to be more reliable, ways that will keep you from going insane. But they’re just not ready right now."
He advised engineers thinking of implementing GANs: "Don't chase the trends, chase the results." ®