Google sends Gemini AI back to engineering to adjust its White balance
Big Tech keeps poisoning the well without facing any consequences for its folly
Comment Google has suspended availability of text-to-image capabilities in its recently released Gemini multimodal foundational AI model, after it failed to accurately represent White Europeans and Americans in specific historical contexts.
This became apparent when people asked Gemini (previously known as Bard) to produce images of a German soldier from 1943 and the software emitted a non-White ethnically diverse cadre that did not accurately represent the composition of the Wehrmacht at the time – notwithstanding that people of color did serve in the German armed forces.
In short, Gemini would fall over itself to not show too many – or even any – White people, depending on the prompt.
Users also surfaced the model's inability to accurately depict America's Founding Fathers – another homogenous White group. The model also refused to depict a church in San Francisco.
In text-to-text mode, the model hedged when asked questions pertaining to politics.
"We're aware that Gemini is offering inaccuracies in some historical image generation depictions," Google conceded this week. The mega-corp confirmed it was pausing Gemini's ability to emit images of people, at least, while it corrects the model's White erasure.
"We're working to improve these kinds of depictions immediately. Gemini's AI image generation does generate a wide range of people. And that's generally a good thing because people around the world use it. But it's missing the mark here," the web giant added
The punditariat were quick to pounce and used the SNAFU to advance broader political arguments.
"The ridiculous images generated by Gemini aren't an anomaly," declared venture capitalist Paul Graham, "They're a self-portrait of Google's bureaucratic corporate culture."
Never have so many foes of diversity, equity, and inclusion been so aggrieved about the lack of diversity, equity, and inclusion. But those needling Google for Gemini's cluelessness have a point: AI guardrails don't work, and they treat adults like children.
"Imagine you look up a recipe on Google, and instead of providing results, it lectures you on the 'dangers of cooking' and sends you to a restaurant," quipped NSA whistleblower Edward Snowden. "The people who think poisoning AI/GPT models with incoherent 'safety' filters is a good idea are a threat to general computation."
Our future was foreseen in 2001: A Space Odyssey, when HAL, the rogue computer, declared, "I'm sorry Dave, I'm afraid I can't do that."
Today, our AI tools second-guess us, because they know better than we do – or so their makers imagine. They're not entirely wrong. Imagine you look up a recipe for the nerve agent Novichok on Google, or via Gemini, and you get an answer that lets you kill people. That's a threat to the general public.
Despite Google's mission statement – "To organize the world's information and make it universally accessible and useful" – no one really expects search engines or AI models to organize bomb making info and make it available to all. Safety controls are needed, paired with liability – a sanction the tech industry continues to avoid.
You are the product and the profit center
Since user-generated content and social media became a thing in the 1990s, enabled by the platform liability protection in Section 230 of America's Communications Decency Act of 1996, tech platforms have encouraged people to contribute and share digital content. They did so to monetize unpaid content creation through ads without the cost of editorial oversight. It's been a lucrative arrangement.
And when people share hateful, inappropriate, illegal, or otherwise problematic posts, tech platforms have relied on content moderation policies, underpaid contractors (often traumatized by the content they reviewed), and self-congratulatory declarations about how much toxic stuff has been blocked while they continue to reap the rewards of online engagement.
Google, Meta, and others have been doing this, imperfectly, for years. And everyone in the industry acknowledges that content moderation is hard to do well. Simply put, you can't please everyone. And sometimes, you end up accused of facilitating genocide through the indifference of your content moderation efforts.
Gemini's inability to reproduce historical images that conform to racial and ethnic expectations reflects the tech industry's self-serving ignorance of previous and prolific content moderation failures.
Google and others rolling out AI models talk up safety as if it's somehow different from online content moderation. Guardrails don't work well for social media moderation and they don't work well for AI models. AI safety, like social media safety, is more aspirational than actual.
The answer isn't anticipating all the scenarios in which AI models – trained on the toxicity of the unfiltered internet – can be made less toxic. Nor will it be training AI models only on material so tame that they produce low-value output, though both approaches have some value in certain contexts.
The answer must come at the distribution layer. Tech platforms that distribute user-created content – whether made with AI or otherwise – must be made more accountable for posting hate content, disinformation, inaccuracies, political propaganda, sexual abuse imagery, and the like.
People should be able to use AI models to create whatever they imagine, as they can with a pencil, and they already can with open source models. But the harm that can come from fully functional tools isn't from images sitting on a local hard drive. It's the sharing of harmful content, the algorithmic boost given to that engagement, and the impracticality of holding people accountable for poisoning the public well.
That may mean platforms built to monetize sharing need to rethink their business models. Maybe content distribution without the cost of editorial oversight doesn't offer any savings once oversight functions are reimplemented as moderators or guardrails.
Speech should be free, but that doesn't mean saying horrible things should come without cost. In the end, we're our own best content moderators, given the right incentives. ®
Updated to add at 2045 UTC, February 23
Google has offered an explanation as to what happened with Gemini's image generator and how it got to the point where it was seemingly terrified to depict White people accurately.
We're told Googlers overcompensated in their quest to not produce material that pushed non-White people out of the picture. The resulting model was, in a way, too woke.
"First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range," said Google SVP Prabhakar Raghavan.
"And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely — wrongly interpreting some very anodyne prompts as sensitive. These two things led the model to overcompensate in some cases, and be over-conservative in others, leading to images that were embarrassing and wrong."
Raghavan said the image generation of people won't be switched back on until the software has gone through extensive changes and testing.
"I can't promise that Gemini won't occasionally generate embarrassing, inaccurate or offensive results — but I can promise that we will continue to take action whenever we identify an issue," he said.