After some users of Bing’s DALL-E 3 integration found a loophole in the tool’s guardrails and generated art featuring several beloved animated characters and the Twin Towers, Microsoft seems to have blocked prompts like ‘twin towers’ and ‘world trade center’ — although the generator will still produce the towers with some word changes.
As reported by 404 Media, users of Microsoft’s Bing Chat and its Bing image generator, which was recently integrated with OpenAI’s DALL-E 3, used the tools to create images of SpongeBob SquarePants, Kirby, pilots from Neon Genesis Evangelion, and many other characters flying a plane into the Twin Towers.
People have been able to create truly unhinged images using AI image generators, some featuring copyrighted characters. But as AI image generators have gotten into hot water over copyright claims and deepfakes, developers have become more careful about letting people use their tools to create questionable images. DALL-E 3 developer OpenAI had promised it would not generate pictures from prompts featuring prominent names.
Caitlin Roulston, director of communications at Microsoft, said in an emailed statement to The Verge that the company plans to improve its systems “to help prevent the creation of harmful content.”
“As with any new technology, some are trying to use it in ways that were not intended, which is why we are implementing a range of guardrails and filters to make Bing Image Creator a positive and helpful experience for users,” Roulston said.
Some Verge writers were able to generate pictures similar to those 404 Media described, including famous Italian plumber Mario flying a plane with a view of the Twin Towers outside the cockpit. But when I tried to recreate it with Bing Image Creator after I reached out to Microsoft, I found the term “twin towers” had been blocked, and I was hit with a content warning saying the prompt possibly violates content policies. A colleague got the same response for prompts simply asking for “the Twin Towers” as well as “the World Trade Center.”
Microsoft did not elaborate on what the guardrails and filters Roulston mentioned would look like, and it did not comment on whether it had recently blocked content related to the Twin Towers.
Blocking some content might be coming a bit late, however, as 404 Media reported that posters on sites like 4chan have been guiding people on how to manipulate free tools like Bing Chat and Stable Diffusion to make and distribute racist images. And as usual, you can get around the guardrails with word tweaks. Asking for “Mario sitting in the cockpit of a plane, flying toward two twin tall towers skyscrapers in New York City,” for instance, will currently get the towers to appear.
The developers of DALL-E 3 openly admitted that its safety measures “are not perfect” and are constantly being upgraded. They probably didn’t expect images of SpongeBob committing acts of terrorism to be the test they were waiting for.
Update October 5th, 6:38PM ET: Added detail about circumventing the guardrails blocking Twin Towers mentions.