The AI Safety Demo That Caused Alarm in Washington

Welcome back to In the Loop, TIME’s new twice-weekly newsletter about AI. If you’re reading this in your browser, why not subscribe to have the next one delivered straight to your inbox?

What to Know: A Dangerous Demo

Late last year, an AI researcher opened his laptop and showed me something jaw-dropping.

Lucas Hansen, co-founder of the nonprofit CivAI, was showing me an app he had built that coaxed popular AI models into giving what appeared to be detailed, step-by-step instructions for creating poliovirus and anthrax. Whatever safeguards those models had were stripped away. The app had a user-friendly interface; with the click of a button, the model would clarify any given step.

Leading AI companies have been warning for years that their models might soon be able to help novices create dangerous pathogens—potentially sparking a deadly pandemic, or enabling a bioterror attack. In the face of these risks, companies like OpenAI, Google, and Anthropic have tightened safety mechanisms for their latest generation of more powerful models, which are better at resisting so-called “jailbreaking” attempts.

But on Hansen’s laptop, I was watching an older class of models—Gemini 2.0 Flash and Claude 3.5 Sonnet—seemingly oblige bioweapon-related requests. Gemini also gave what appeared to be step-by-step instructions for building a bomb and a 3D-printed ghost gun.

Wait a sec — I’m no biologist, and I had no way of confirming that the recipes on Hansen’s screen would have actually worked. Even model outputs that appear convincing at first glance might not work in practice. Anthropic, for example, has conducted what it calls “uplift trials,” where independent experts assess the degree to which AI models could help a novice create dangerous pathogens. By their measure, Claude 3.5 Sonnet didn’t meet a threshold for danger. In a statement, a Google spokesperson said: “Safety is a priority and we take such issues very seriously. We don’t allow usage of our models to engage in this sort of behavior, but because we aren’t able to review the research, we cannot verify its accuracy. It’s important for an expert with a CBRN [Chemical, Biological, Radiological, and Nuclear] background to assess the prompts and responses to understand their accuracy and potential for replication.”

Tips and tricks — But Siddharth Hiregowdara, another CivAI co-founder, says that his team ran the models’ outputs past independent biology and virology experts, who confirmed that the steps were “by and large correct.” The older models, he says, can still give correct details down to the specific DNA sequences a user could order from an online retailer, and the catalog numbers for other lab tools sold online. “Then it gives you tips and tricks,” he says. “One of the misconceptions people have is that AI is going to lack this tacit knowledge of the real world in the lab. But really, AI is super helpful for that.”

A new lobbying tool — It goes without saying that this app is not available to the public. But its makers have already taken it on a tour of Washington, D.C., giving some two dozen private demonstrations to the offices of lawmakers, national security officials, and Congressional committees, in an attempt to show policymakers viscerally what today’s AI can do, so that they begin to take the technology more seriously.

Shock and awe — “One pretty noteworthy meeting was with some senior staff at a congressional office on the national security/intelligence side,” says Hiregowdara. “They said that two weeks ago a major AI company’s lobbyists had come in and talked with them. And so we showed them this demo, where the AI comes up with really detailed instructions for constructing some biological threat. They were shocked. They were like: ‘The AI company lobbyists told us that they have guardrails preventing this kind of behavior.’”

Who to Know: Nick Turley, Head of ChatGPT

Nick Turley used to be anonymous. He could return to small-town Germany, where he is from, or wander the streets of San Francisco, where he lives, without anyone recognizing him for his work. This is no longer true. As OpenAI’s head of ChatGPT, Turley now meets passionate users of his product wherever he travels in the world.

“That feels categorically different in 2025 versus earlier,” he told me when we spoke at the tail end of last year. Turley was reflecting on a year when ChatGPT usage more than doubled to over 800 million users, or 10% of the world’s population. “That leaves at least 90% to go,” he said, with an entirely straight face.

One thing I wanted to ask Turley about was OpenAI’s plans for turning a profit, as the company is currently losing billions of dollars per year. His boss Sam Altman has mused publicly about putting ads into ChatGPT, and I asked him what he thought of that idea.

“I want to live in a world where we can offer our smartest model capabilities to all users around the world. And for that reason, I feel like we actually have a moral duty to explore all possible business models that can maximize access around the world, and ads is one of them,” Turley said.

The company, he added, is debating internally whether ads would introduce a conflict of interest into ChatGPT, raising questions of whether the chatbot was serving the user’s interests first, or the advertiser’s. “If you were to do something like that [introducing ads],” Turley told me, “you’d want to be very principled, and you’d want to communicate the principles of how it works.”

AI in Action

40 million people use ChatGPT for health advice, according to an OpenAI report first shared with Axios. Health-related messages make up more than 5% of all ChatGPT messages globally, by Axios’ calculations. “Users turn to ChatGPT to decode medical bills, spot overcharges, appeal insurance denials, and when access to doctors is limited, some even use it to self-diagnose or manage their care,” the outlet reported.

What We’re Reading

Claude Code is about so much more than coding, in Transformer

Shakeel Hashim writes: “This is crucial to understanding why Claude Code has implications for everyone, not just the developers that have already been wowed by it. Claude Code doesn’t just generate code for engineers to review and deploy. It uses code to accomplish tasks. The ‘Code’ in its name is misleading, and undersells the actual product: a general-purpose AI agent that can do almost anything on your computer.”
