How AI safety took a backseat to military money

What this episode covers

This is Hayden Field, senior AI reporter at The Verge, and your Thursday episode guest host. I have another couple of shows for you while Nilay is out on parental leave, and we’re going to be spending more time diving into some of the unforeseen consequences of the generative AI boom. Today, I’m talking with Heidy Khlaaf, who is chief AI scientist at the AI Now Institute, about the tech industry’s shift toward AI military applications. I wanted to know what’s motivated this shift, and why Heidy thinks leading AI firms are being far too cavalier about deploying generative AI in high-risk scenarios.

Links:
OpenAI is softening its stance on military use | The Verge
OpenAI awarded $200 million US defense contract | The Verge
OpenAI is partnering with defense tech company Anduril | The Verge
Anthropic launches new Claude service for military and intelligence use | The Verge
Anthropic, Palantir, Amazon team up on defense AI | Axios
Google scraps promise not to develop AI weapons | The Verge
Microsoft employees occupy headquarters in protest of Israel contracts | The Verge
Microsoft’s employee protests have reached a boiling point | The Verge

Credits: Decoder is a production of The Verge and part of the Vox Media Podcast Network. Our producers are Kate Cox and Nick Statt. Our editor is Ursa Wright. The Decoder music is by Breakmaster Cylinder.

TRANSCRIPT · AUTO-GENERATED

Hey there, and welcome to Decoder.

I'm Hayden Field, senior AI reporter at The Verge, and your Thursday episode guest host. I have another couple of shows for you while Nilay is out on parental leave, and we're going to be spending more time diving into some of the unforeseen consequences of the generative AI boom. Today, I'm talking with Heidy Khlaaf, who is chief AI scientist at the AI Now Institute, and one of the industry's leading experts in the safety of AI within autonomous weapon systems. Heidy has actually worked with OpenAI in the past.

From late 2020 to mid-2021, she was a senior system safety engineer for the company during a critical time when it was developing safety and risk assessment frameworks for the company's Codex coding tool. But the same companies that in the past seemed to champion safety and ethics in their mission statements are now actively selling and developing new technology for military applications. In 2024, OpenAI removed a ban on military and warfare use cases from its terms of service. Since then, the company has signed a deal with autonomous weapons maker Anduril, and this past June, signed a $200 million Department of Defense contract.

And OpenAI isn't alone. Anthropic, which has a reputation as one of the most safety-oriented AI labs, has partnered with Palantir to allow its models to be used for U.S. defense and intelligence purposes, and it also landed its own $200 million DOD contract. And big tech players like Amazon, Google, and Microsoft, who have long worked with the government, are now pushing AI products for defense and intelligence despite growing outcry from critics and employee activist groups.

So I wanted to have Heidy on the show to walk me through this major shift in the AI industry, what's motivating it, and why she thinks some of the leading AI companies are being far too cavalier about deploying generative AI in high-risk scenarios. I also wanted to know what this push to deploy military-grade AI means for bad actors who might want to use AI systems to develop chemical, biological, radiological, and nuclear weapons, a risk the AI companies themselves say they're increasingly worried about. Okay, here's Heidy Khlaaf on AI in the military. Here we go.

Heidy Khlaaf, chief AI scientist at the AI Now Institute. Welcome to Decoder.

Thank you for having me.

First up, I wanted to talk about how AI companies have moved their goalposts a lot with regard to what they're okay with and what their mission statements allow regarding work with the U.S. military and other militaries. So do you remember the whole controversy over Google removing the phrase "don't be evil" from its code of conduct?

Yeah, absolutely.

It reminds me a bit of something more recent, which is how OpenAI and Anthropic both used to have certain bans on military use of their products, and then they relaxed them.

OpenAI walked its ban back in January 2024 when it began to work with the DOD on AI tools, and the month before that, it partnered with Anduril. And for Anthropic, it partnered with Palantir in 2024 to offer Claude to intelligence and defense agencies with the U.S. government. So I wanted to ask what you were thinking when you saw these announcements made month after month.

What did you make of that kind of parade of changes that were happening over a couple of months?

Well, to many, including myself, the timing of when OpenAI removed this in January 2024 didn't seem like a pure coincidence, if you consider, for example, that Israel at that time was ramping up its mass targeting campaign in Gaza that we now know is being supported by Microsoft cloud services that offer AI, and in this case OpenAI models, as extensions of their IT and cloud infrastructure. And so this rollback was really a signifier of where AI companies were heading, where they were interested in deploying their technologies, despite OpenAI being well aware of the risk that their models pose in defense and safety-critical settings, which is something that I actually worked on with them in one of our papers together when we were looking at evaluation of models like Codex.

And interestingly enough, with the announcement of these collaborations that you mentioned, and that includes Meta, Anthropic, and OpenAI, all announcing U.S. national security work that aligns them with defense contractors like Palantir, Anduril, and Lockheed, these AI companies never addressed their previous statements and understanding that LLMs or foundation models are unsafe and insufficient for defense use. So it was almost like a clean slate was being created where they behaved as if this was always aligned with their mission, right?

For example, they started making claims that U.S. national security is synonymous with safety under this pretense of an AI arms race with China. And this push for AI adoption seems quite convenient when you're considering the current unprofitable reality of AI and how expensive it is, where it seems like they're trying to sort of de-risk their portfolio through government subsidies and military contracts. So now we have this complete pivot from banning military uses, all this talk about their main mission being building systems that benefit all humanity, to now their reliance on this narrative of a U.S.-China AI arms race to drive policy initiatives that not only boost the use of AI, but allow them to sort of avoid safety and security scrutiny within military applications.

That makes total sense. And that reminds me of how, you know, I think last week, Senator Warren sent a letter about xAI's own contract with the DoD, expressing concern that the company hadn't done the same level of safety audits as other companies before receiving that type of contract, that they weren't ready. She was worried about how the company would use data it had access to as part of, you know, its government partnerships. What did you make of that letter and xAI's, I guess, seemingly grandfathered-in approach to this DoD contract announcement?

I think the way that I see it is it's part of the trend of the U.S. DoD not recognizing that there's a national security risk with the use of commercial foundation models in the military, because they do significantly expand the attack vectors of military systems and defense infrastructures that they interface with, because commercial models are unvetted. They don't have a supply chain that follows the typical military supply chain, and they can be compromised in a lot of ways. So to me, the contract with xAI is another risk that's added onto that, that sort of stems from the issues that all commercial models have, right?

And obviously, depending on the platform that these models are trained on and the personal data that these models are trained on, they come with a lot of risks and capabilities that allow them to promote or even deploy surveillance systems, because they're able to use data that other companies may not have. For example, xAI has a huge amount of data that, you know, could be from not just public posts, but also private messages of their users. And what does it mean for that to be used in military applications, right? There's a legitimate concern here.

But there's also this larger concern that these systems are also unsafe and trained on data that could have been compromised by an adversary, including China, right? And that sways the way that the AI system behaves. So there's like risks on both sides here, right? From both the data that is meant to be protected, right?

And also the type of data that could have been compromised, and then change the behavior of an AI system that's being used in something like very sensitive military operations.

So something that comes up a lot in my reporting and in my conversations with other people, even in my pitching to my editor, is that a lot of these companies are pre-profit, I mean, maybe all of them. And we're seeing a bunch of them unveil government products, enterprise products, and that seems to be where a lot of the money lies. So, you know, obviously, OpenAI, Anthropic, and xAI have all unveiled government products designed for U.S. defense and intelligence agencies to use. They also all received, you know, those government contracts from the DOD. So with companies that burn through cash at super high rates, do we think that the government play is about seeing cold hard cash come in finally, or staying above regulatory pressure, or both? I'd love to hear your thoughts on that.

It's definitely both, right? Because as you mentioned, these companies are pre-profit, and there's a really big pot of money in the military-industrial complex. There simply is, and I think that's well known. But also, there's the aspect that these companies would not traditionally pass any of the testing and evaluation required for military procurement.

And here's the thing that a lot of people don't know: defense and military procurement is actually some of the most strict, you know, prior to this AI era, I mean, in how it evaluates these systems. They have some of the most strict standards, and I think a lot of people assume that's not the case, but often the standards for our safety-critical systems, if you're talking about energy infrastructure and so on and so forth, are derived from those defense standards because of how robust they are. The thing with AI systems, and we're talking about generative AI systems because AI systems have been used in the military for decades at this point, but these foundation models or large language models, whatever it is that you want to call them, they do not meet the sort of very basic threshold that is typically expected for a military system, right? And so there is this kind of issue now that they want this pot of money, you know, as I mentioned before, they are trying to de-risk their portfolios through military contracts, but they have this issue where safety, as defined by defense and safety-critical systems, is too stringent for their systems to meet, just by their nature, right?

They're highly inaccurate systems, and when you're looking at, I can't really get into defense systems, but I'm going to talk a little bit about like safety critical systems, if you're looking at like a nuclear power plant, for example, you're looking at safety of like 99%, right? That's like the minimum. And often, the accuracy of AI systems is like 60% if I'm being optimistic about specific types, right? So there's an enormous gap here to make AI systems, as they exist for foundation models, be able to satisfy the strict testing and evaluation measures often required by military procurement.

For the listeners, let's just define military procurement really quick. What is that?

Military procurement is, well, it really depends, right? There's a huge number of processes that exist for different types, and how strict the process is often depends on how critical the use of the technology is going to be.

Like, for example, if it's going to be used for lethal operations versus bureaucratic operations, those are very different types of procurement. If we are looking at a sort of more general idea, typically, the government puts out a specific ask for the type of systems that they're looking for, and people submit to that, you know, procurement ask. Ultimately, these systems often have to go through what we call a testing and evaluation process for them to even be considered, you know, even before they sign the contract, right? This process is quite stringent in that it has very specific thresholds on how accurate these systems need to be and how secure they need to be.

Often, the security threshold is extremely, extremely high, right? They have to be air-gapped. The supply chain has to be completely traceable. They have to know who coded the system, who developed the system, if there's any sort of backdoors that can be compromised, so on and so forth.

And once a system goes through that procurement, that safety and security procurement process, the DOD often just takes complete control of that technology, right? Like, this is now something that they possess and are in complete control of using in whichever way that they see fit, right? And so this is often why procurement for the military can even take many years, right? This is not a process that is meant to take a couple of weeks or even several months.

Often, this is quite a rigorous process. So this is very different from signing a commercial contract, right, where you have traditional terms of service. The nation-state, not just, you know, the U.S. DOD, is often the one that gets to define the terms of how they want to use technology, and typically, people abide by it because people want that pot of money, right?

It's very lucrative to even be considered up for procurement, because it means that you're sort of being on call for them for potential audit technology, but even to get your foot in the door can take years. So it's a very different type of assessment than I think what people expect for commercial contracts.

We need to take a quick break. We'll be right back.

We're back with Heidy Khlaaf of the AI Now Institute. Before the break, Heidy was breaking down the standard military procurement process and why it feels like AI systems don't meet the rigorous standards we might expect when it comes to being used for high-risk operations.

Now, I want to ask Heidy about the specific AI products being sold to the U.S. military and whether they're really much more secure than the commercial models on the market today. I also wanted to ask, the models these companies use for government products, like their government-designed products, like Claude Gov, OpenAI's government product, xAI's government product, by design have looser guardrails for government use, and they're trained to better analyze classified information. And although these types of models allegedly underwent the same type of safety testing as these companies' other models, I'm using Anthropic here as an example, they have certain specifications for national security work.

Like, they have a greater understanding of intelligence and defense documents, and they refuse less when they are asked to engage with classified information that's being fed into them. So in your eyes, how does the development work? How secure really are they? And what are the implications here?

So I'm not going to say they're much more secure. They may be more secure in that they're more air-gapped. So, for example, you can take a commercial model, right, and you can fine-tune it on sort of sensitive military data, and then that model becomes accessible to the military. But that still misses one of the biggest risks of that commercial model, which is that it was trained on datasets that were publicly available.

And so a lot of research has shown that not only can you poison web data that these models are trained on, but you can implement what's called like a sleeper agent, which is where, given a specific prompt or a command, it will then behave in a sort of harmful way that the operator of the system did not intend, based on something that was implemented in the training data or something that, you know, the model was trained on. And so we see this all the time with like prompt injections, but this can happen on a sort of deeper level with what we call web poisoning attacks, which can then be used to implement these like sleeper agents, as we call them. And so this is in the commercial supply chain, right? The only way that these models are trained is to be trained on sort of mass amounts of data that are publicly available.

So they're already compromised, right? These models are also fine-tuned through methods like reinforcement learning from human feedback, which unfortunately uses basically sweatshops of people in developing nations that are paid nothing to then make these models behave in a specific way. And you can imagine a military operation, right, where a foreign adversary is able to sort of have a covert operation in which they run one of these data labeling and data fine-tuning shops, essentially, and are sort of aware that the models might eventually be fine-tuned for military application, and implement backdoors or sleeper agents which trigger specific behavior based on a specific command. And because of that, that's what makes them so unsafe. So sure, you might be able to fine-tune on specific data that isn't then released publicly, which might remove some vectors of attack. But ultimately, at the end of the day, commercial models are already compromised.

So when you're saying that they are more secure, I mean, they're more secure in a traditional security way in that you air-gap the system, so you kind of limit the control of people who have access to it. And so that's people who can probe it and get information out of it. But it doesn't remove the fact that commercial models are already compromised from the day that they're built because they're based on public data.

I wanted to see if you agree with this take I'm about to tell you.

So I once interviewed Meg Mitchell from Hugging Face, and she said that for these types of military contracts, even if you have in your mission statement, like Anthropic and OpenAI do, that your tech can't be used to directly harm others, the problem is that you don't have control in the end over how your tech is actually being used with the military. If you do have any control, you definitely don't have control in the longer term once you've already shared that with the military organization, especially without having security clearance and knowing really how it's being used down the line. She also was talking about what's considered direct harm.

What if you're summarizing social media posts that then lead to making a list of enemy combatants or potential people of interest that have a certain view on a topic on X, for example. I wanted to see if you agree with that in terms of, you know, these companies often have in their mission statements, oh, don't worry, even though we're working with the military on this, this and this, we know for a fact that our tech isn't being used to directly harm people. But yeah, how can they really know that? Can they?

I completely agree with that statement. And I think something that people often miss is that militaries do not follow terms of service. They might do that if they're buying like a Microsoft Office suite, right, for their bureaucratic purposes. But when it comes to military procurement, the companies do not have control over how these systems are being used, and they know that.

And they actually have no say in terms of the terms of service as well. These things get often determined by international law and also by the nation state itself, right? They have the power here. And so when people tell me, oh, but wouldn't that break the terms of service?

And it's like, this is not how military procurement works, period, right? And we've also seen examples of, as I mentioned before, Microsoft working directly with militaries to implement some of these systems. So I would say that in a lot of cases, they are well aware of how their systems are being used to some extent, right? We don't know all the details, but governments put out procurement documents on the type of data they want, how they want to use it, and how they want to store it, because then companies can offer services like what their AI can do with that, right?

So I do think that it's not the case that they're just selling something commercially like they do to everyone else and then they hope the military abides by it. It's a much more involved process to do military procurement that often requires testing and evaluation. And the companies typically have to be in the know about the technical details to see if they can offer support for that. Again, I use the Microsoft case as one of the most recent examples of that being the case, with their work with the IDF.

So I think it's easy for them to point to terms of service, but as someone who has worked on procurement before, this is not a commercial contract that these companies are signing.

Okay, so now I want to get into because of that marketing and just because of the scary idea. So let's be real here. Obviously, smarter AI that can do anything for you isn't always good, especially when people want to use it to do bad things like creating chemical, biological, radiological, and nuclear weapons.

So top AI companies say they're increasingly worried about the risk of that. Of course, they're not maybe worried enough to stop building, but I want to get into again how big of a risk this is. Let's just go into more detail there.

We have not seen any proof of CBRN capabilities right now, but those capabilities could come to fruition if we start training these models on very, very sensitive nuclear data, for example, on nuclear technologies.

And I actually believe that the risk that comes with that is very different from what most people are thinking about. Most people are thinking that the AI is somehow going to develop weapons by itself, right? Or it will give access to adversarial actors to do so. But even if it's just the military who has access to a model that has been trained on CBRN data, what that means is that they are likely going to use it for those purposes within the military.

And that's extremely dangerous. Like if you're thinking about nuclear command and control, right? Who gets to essentially make the decisions about nuclear weapons deployment? And it certainly shouldn't be AI systems because regardless of the data distribution, these systems are highly flawed and they're always going to have inaccuracy.

As I mentioned before, often when we're looking at military systems that deploy AI, they can have as low of an accuracy rate with these types of systems. And so to train a model on CBRN data and then attempt to use it for those tasks is extremely dangerous when you're looking at those accuracy rates, right? For me, the concern that I have is that they will then think that these models are reliable because we train them on that set of data and thus we can then use them in the military and in defense operations to dictate decisions about those types of systems and where they should be used and when. And I think, you know, that's a very different type of concern from what most people are thinking about, as before, this idea that they're somehow going to gain CBRN capabilities by themselves, which is very hypothetical and not really tied to the reality that we're in right now with AI systems.

But if we're taking AI systems today and we train them on sensitive military data, I have a concern of how those systems are going to be used.

We need to take another quick break. We'll be right back.

We're back with chief AI scientist Heidy Khlaaf, discussing the ways in which AI companies are pushing into defense contracting. Before the break, we were talking about how real the risk is that AI systems might be used to develop nuclear or biological weapons. But now, I want to zoom out and talk to Heidy about the broader field of AI safety and how she thinks it's changed since she worked with OpenAI years ago.

Let's shift and talk about AI safety for a bit. So you helped establish and pioneer the field of AI safety engineering. What's the technical meaning of safety with your background, and how has the AI safety world changed that meaning, or how has it become more colloquial now?

So if we take a step back and not think about what the AI companies have been telling us safety means for the past four years or even more than that at this point, safety has historically meant, especially in the context of safety-critical systems, ensuring no harm to humans or the environment.

So if you're thinking about aviation or nuclear power plants, for example, you want to ensure that when your systems fail, and systems do fail, that humans are not harmed, that there's no death and that there's no environmental catastrophe. It's quite a simple definition. Now, what safety now means is very different, and it has been redefined by AI companies as of late. So I believe AI labs engage in what I call safety revisionism, where they use the same safety terminology that is often used for regulating and assuring defense and safety-critical systems, but instead redefine those safety techniques with watered-down alternatives that actually accelerate the deployment of inaccurate AI in high-risk scenarios like defense or nuclear.

So for example, AI companies often reduce the term safety to now mean alignment or existential risks. Now, this is pretty distinct from the definition of safety that I just gave you, because alignment focuses on human preference, and that makes us question, well, alignment with whom, and which humans, and whose preferences? And the existential risks that are also emphasized, like CBRN, are hypothetical, like I mentioned, and are often used as sort of a pretense for an AI arms race to ignore other risks and safety thresholds, like surveillance systems, right? So in allowing AI companies to do this, right, which a lot of governments have, they sort of cede that control of defining what safety is to them, which puts them in a position to define what a risk threshold is or what "safe enough" actually means. And the entire idea of risk thresholds, because I imagine a lot of people might not know this, is to provide sort of a metric or a measure of the level of risk exposure that our society has collectively agreed to take.

And this often shapes how we determine the safety of technological systems, including nuclear plants and planes. And typically, this is done through a democratically determined idea of what society thinks safety is. So in allowing AI companies to sort of co-opt these traditional safety terms, we've sort of given them permission to not only decide what counts as safe enough, which again breaks these democratic norms that we've had, but also to lower and undermine existing safety thresholds that would have otherwise regulated AI use in things like defense. So ironically, you know, this is kind of their way of bypassing some of the safety measures that I talked about earlier, in that they're looking for this pot of money, right, from the military, but the safety thresholds for defense are extremely high.

So what do you do? Well, you redefine what safety means and you say it's different for AI. You say because our systems are so different at a scale we've never seen before, we cannot abide by these safety rules, which is definitely not accurate. I think a lot of our existing safety critical standards hold for AI systems.

And ironically, this hollowing out of safety, along with the argument that AI is so crucial to an arms race that it can't be regulated, that we have to beat China, is accelerating AI adoption at the cost of more unsafe and insecure systems, which may be exactly what disadvantages the U.S. military and our technological capabilities against China, if we're sort of letting inaccurate and easily compromised systems be deployed on our front lines because it's profitable for these AI companies.

Let's talk a little bit about, earlier you mentioned safety of 99%, for example, at a nuclear power plant. What does that mean in context? We talked just now about the meaning of safety in that regard, but what would a safety of 99% entail? Is it that there's a 99% chance that it won't harm people or the environment? What does that mean in practice?

It's a lot more technical than that, but basically we have these thresholds: these systems have to be accurate and be able to perform.

So these are typically what we call reliability and availability measures of the system. So they can only fail so often. Even 99% is the lowest threshold for a nuclear plant. It even goes up to 99.99%.

And so obviously, if we allow zero risk, we're never going to build anything. I think it's very important to remember that with every technological system there is some sort of risk. But you have to mitigate for when those systems fail. So this idea that our safety critical systems have this, like, 99.99% reliability means that they're meant to operate well basically 99 to 99.999% of the time, depending on the kind of system that you're looking at, right, and its safety criticality.

And then when that system fails, we then have to have mitigations in place, right? And there will be risks with that. And typically, these mitigations are based on, like I said, these thresholds of how many people could be harmed.
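
[Editor's note: to put the percentages mentioned above in concrete terms, here is a minimal sketch that converts a reliability or availability figure into the time a system could be failing over one year. It assumes the percentage is read as a fraction of continuous operating time in a non-leap year, which is a simplification for illustration; real safety standards define these measures in domain-specific ways. The function name `allowed_downtime_hours` is hypothetical and is not taken from the interview.]

```python
# Illustrative arithmetic only. Assumption: the percentage is treated as the
# fraction of a non-leap year during which the system operates correctly.
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def allowed_downtime_hours(percent: float) -> float:
    """Return the hours per year a system may be failing at this percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - percent / 100)


if __name__ == "__main__":
    # The three figures mentioned in the conversation.
    for percent in (99.0, 99.99, 99.999):
        hours = allowed_downtime_hours(percent)
        print(f"{percent}% -> {hours:.3f} hours/year ({hours * 60:.1f} minutes)")
```

[Under that assumption, 99% leaves roughly 87.6 hours of failure per year, 99.99% leaves under an hour, and 99.999% leaves a little over five minutes, which is why each extra "nine" is a much stricter requirement.]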

So in the case of, like, airplanes, I think that's a very simple example. An incident is considered catastrophic if everyone on a commercial airplane dies. So typically that number is like 300 people get harmed or die. That's the threshold for aviation as being the most catastrophic thing that could happen.

So safety is actually very specific to the use cases. What we mean by 99.99% reliability often relates to the systems failing. But if you're looking at how to actually mitigate for those risks and what the threshold is, that depends on every single field because kind of the impact of the system will vary. An airplane crashing is very different from a nuclear plant crashing.

So a nuclear plant doesn't have this idea that 300 people dying is the worst case scenario. In fact, it's much more than that and also has to do with environmental nuclear disaster, right? And so this is why this idea of AI and the safety that they push forward is problematic because they want us to adopt this idea of universal or general safety that has to do with like, as they call it, alignment. And it's this misguided idea that there exists like a universal safety solution that would make all general functionality of all LLMs safe, right?

And this is also one of the ways that procurement is changing in the military. When you're looking at companies like Scale AI, they are putting forward these types of general frameworks. But there is no standard safety approach to generic systems in any domain. In fact, this would contradict established safety practices that require sort of a well-defined use case to map risks against. And so often what we're seeing happen now is that companies like Scale AI say, we're going to build a risk assessment framework for AI systems, and it ends up being about something else and completely disconnected from actually being accurate for military operations.

It ends up being, again, these high-level ideas of safety that we're seeing them push, whether it's about, like, CBRN. It's like, right, but can it do the thing that we're asking the AI systems to do? You actually never see an assessment of that in a lot of these frameworks. And so this is sort of why this idea of safety becomes very confusing, because it has diverted so much from how we traditionally used it to assess, like, nuclear plants or airplanes and so on.

And you led the safety evaluation of Codex at OpenAI. So what was that like, and would it be a different process if you were leading that work now?

For Codex, the idea was to introduce something like a risk assessment for AI, which is not what people were doing before. Prior, there was a lot of benchmarking, right?

And these benchmarks didn't really consider the risks that the AI system poses with having specific capabilities. So the idea was to really try to investigate that and use some techniques inspired from safety critical fields. It was not meant to be a replacement for assessments for safety critical systems, right? And I think that's a really, really big distinction.

And in terms of, like, what would I do now, it was my choice to not continue working with OpenAI, because it became very clear to me, again, that this pushing of general safety just does not align with how safety actually should be assured in sort of the real world, right? And so this idea of existential risk, CBRN, alignment, to me was like, no, but these are not the current harms that we're going to see if we're going to deploy AI systems in these safety critical situations. And if we are going to deploy them in safety critical situations like defense, we have to assess the system as we always have for every other system. I thought introducing something like risk assessments would be helpful to the field, because then people could understand the risks that come from using the systems.

But what it unfortunately ended up evolving into, as it's been used by many labs, is that these risk assessments are now sort of being used like the be-all and end-all of all assessments of AI being used in all systems everywhere. And that I regret very much, this concern that AI will somehow become self-aware or have these capabilities that lead to nuclear proliferation. And as someone who has worked on sort of risk assessments now for about a decade, you have to have real data to back up your claims.

And so when you're then using risk assessment frameworks to try to substantiate hypothetical claims that there is no proof for, you're not doing science. And to me, I'm willing to be convinced that perhaps AI models would have these capabilities in the future. I'm not opposed to that idea, but they don't have them now. And so for all of us to put our safety and regulation efforts right now, and that includes the U.S. government and the U.K. government, into hypothetical risks that can't be measured, right, that can't be quantified or qualified, and our entire regulatory system then becomes about risks that we have yet to see, you might as well not have regulation at all, right?

So I think a lot of people talk to me about, "Well, what do you think of this framework? What do you think of that framework?" I'm like, to me, practically speaking, as someone who has done risk assessment, it is the equivalent of having no regulation, because we're actually not addressing the risks or the harms that AI is posing, and we are in fact focusing on hypothetical risks.

And there's this idea that I've heard before: "Well, what if those just come true and you're unprepared?" And the way that I see it is that if you're not prepared for today's risks and you're not building the frameworks for that, you're not going to be prepared for future risks, because these frameworks and risk assessments build on top of each other. So if you're not able to mitigate the lack of safety and security of AI models today, then you have no chance of mitigating against these hypothetical risks that people like to bring up, right? Because that is kind of one of the core concepts of safety: the smallest hazard, not the smallest catastrophe, the smallest hazard, can cascade into a large catastrophe. So if you're not able to address the very things that, you know, they consider inconsequential, then you're never going to be able to prepare for these, you know, much more large-scale events that they talk about. And that's very much a standard perspective to have.

The snowball effect in practice. Well, thank you so much, Heidy. This is incredibly helpful, and, you know, your perspective is so unique, so I'm really glad we were able to, you know, talk about this and have the audience kind of weigh in in the comments and stuff. I think this is, you know, something that's not talked about enough, so I'm really glad we were able to talk, and thanks for making the time and moving your schedule around.

Thank you for having me.

I'd like to thank Heidy for taking the time to speak with me, and thank you for tuning in. I hope you enjoyed this episode. If you'd like to let us know what you thought about this show or what else you'd like us to cover, drop us a line. You can email us at decoder@theverge.com. We really do read every email. Or hit me up directly on X, Bluesky, or Threads. I'm @haydenfield on all platforms. Decoder also has a TikTok and Instagram and now also a YouTube channel. Check those out at @decoderpod. They're a blast. If you like Decoder, please share it with your friends and subscribe wherever you get your podcasts. Decoder is a production of The Verge, and it's part of the Vox Media Podcast Network. Our producers are Kate Cox and Nick Statt. Our editor is Ursa Wright. The Decoder music is by Breakmaster Cylinder. See you next time.

Mud. Sand. Snow. A track. Places where excuses don't work, where capability is something you prove one race at a time. Off-road racing. Formula One. Different worlds that pose the same question: what are you made of? Every ground is our proving ground. Ready. Set. Forward.

With Finn, we've built the number one AI agent for customer service. It solves up to 90 percent of queries for businesses, tops all the performance benchmarks on the G2 leaderboard, and it comes with a million-dollar guarantee. Check it out at Finn.ai.

Support for the show comes from Odoo. Running a business is hard enough, so why make it harder with a dozen different apps that don't talk to each other? Introducing Odoo. It's the only business software you'll ever need. It's an all-in-one, fully integrated platform that makes your work easier: CRM, accounting, inventory, e-commerce, and more. And the best part? Odoo replaces multiple expensive platforms for a fraction of the cost. That's why thousands of businesses have made the switch. So why not you? Try Odoo for free at odoo.com. That's odoo.com.

Frequently Asked Questions

How long is this episode of Decoder with Nilay Patel?

This episode is 42 minutes long.

When was this Decoder with Nilay Patel episode published?

This episode was published on September 25, 2025.
