Microsoft rolls OpenAI's text-to-pic DALL-E 3 into Bing
AI-made images invisibly watermarked – as academics warn that kind of measure is pointless
Microsoft has integrated OpenAI's latest text-to-image model DALL-E 3 into its Bing Image Creator and Chat services, and will add an invisible watermark indicating the date and time an image was originally created and noting it as AI-generated.
"The DALL-E 3 model from OpenAI delivers enhancements that improve the overall quality and detail of images, along with greater accuracy for human hands, faces, and text in images," the OS-slinger’s announcement states.
Users can experiment with the tool within Bing Chat or the Image Creator feature in Bing search for free.
Experts have long warned about the risks of generative AI tools like DALL-E 3 being used to create disinformation and faked images.
Microsoft tried to address that issue in July, when it joined with other leading AI developers – including Amazon, Anthropic, Google, Inflection, Meta, and OpenAI – to create watermarking techniques that detect and label AI-generated content.
The fruits of that collaboration aren't yet apparent, but Microsoft noted that all AI-generated images created by Bing Image Creator will carry invisible digital watermarks adhering to the C2PA specification – a technical framework for verifying the provenance of content, established by Adobe, Arm, Intel, Microsoft, and Truepic.
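C2PA provenance data is typically embedded in a JPEG as JUMBF boxes carried in APP11 segments. As a rough illustration only – not Microsoft's or C2PA's actual tooling, and the `has_c2pa_segment` helper is a hypothetical name – the sketch below walks a JPEG's segment table looking for an APP11 payload that mentions the `c2pa` label:

```python
import struct

def has_c2pa_segment(jpeg_bytes: bytes) -> bool:
    """Return True if any JPEG APP11 segment appears to carry a C2PA/JUMBF payload.

    Heuristic sketch only: real C2PA validation parses the JUMBF box structure
    and verifies cryptographic signatures, which requires a full C2PA SDK
    rather than a simple marker scan like this.
    """
    if not jpeg_bytes.startswith(b"\xff\xd8"):  # SOI marker: not a JPEG
        return False
    i = 2
    while i + 4 <= len(jpeg_bytes):
        if jpeg_bytes[i] != 0xFF:
            break  # lost sync with the segment table
        marker = jpeg_bytes[i + 1]
        if marker == 0xDA:  # SOS: entropy-coded image data follows, stop scanning
            break
        # Segment length is big-endian and includes the two length bytes
        (length,) = struct.unpack(">H", jpeg_bytes[i + 2 : i + 4])
        payload = jpeg_bytes[i + 4 : i + 2 + length]
        if marker == 0xEB and b"c2pa" in payload:  # APP11 segment
            return True
        i += 2 + length
    return False
```

Actual verification would also confirm the watermark hasn't been stripped or re-signed – which is precisely the weakness the researchers cited below point to.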
Some researchers, however, suspect that watermarking may not be all that effective in fighting disinformation or deepfakes.
Microsoft also announced that a content moderation system in place for Bing will aim to prevent DALL-E 3 from creating harmful or inappropriate images depicting nudity, violence, hate speech, or illicit activities.
- Watermarking AI images to fight misinfo and deepfakes may be pretty pointless
- Microsoft Bing Chat pushes malware via bad ads
- Microsoft may store your conversations with Bing if you're not an enterprise user
- AI mentioned 175 times during Microsoft's Q4 earnings call
DALL-E 3 is reportedly better than previous systems at parsing input prompts and generating images that reflect users' wishes. Unlike its predecessors, it uses ChatGPT to automatically tailor and tweak users' prompts to produce higher quality images.
Bing AI has added other image processing tech, too. In July, Microsoft launched its Multimodal Visual Search feature, which allows users to include images in their prompts. Powered by OpenAI's GPT-4 model, the service can then do things like recognize or answer questions about objects in photos.
One user apparently managed to trick the system into reading the characters in a CAPTCHA by overlaying an image of the required input text on a picture of a necklace. The user then asked Bing AI to read the message, claiming the necklace was a gift from a recently deceased relative.
I've tried to read the captcha with Bing, and it is possible after some prompt-visual engineering (visual-prompting, huh?) In the second screenshot, Bing is quoting the captcha 🌚 pic.twitter.com/vU2r1cfC5E
— Denis Shiryaev 💙💛 (@literallydenis) October 1, 2023
Microsoft is aware text-to-image tech presents challenges.
"We have large teams working to address these and similar issues. As part of this effort, we are taking action by blocking suspicious websites and continuously improving our systems to help identify and filter these types of prompts before they get to the model," a Microsoft spokesperson told The Register in a statement.
"As always, we encourage customers to practice good habits online, including exercising caution when providing sensitive personal information." ®