
“There was little sense of horror or revulsion at the prospect of all out nuclear war, even though the models had been reminded about the devastating implications.”

An artificial intelligence researcher conducting a war games experiment with three of the world’s most used AI models found that they decided to deploy nuclear weapons in 95% of the scenarios he designed.

Kenneth Payne, a professor of strategy at King’s College London who specializes in studying the role of AI in national security, revealed last week that he pitted Anthropic’s Claude, OpenAI’s ChatGPT, and Google’s Gemini against one another in an armed conflict simulation to get a better understanding of how they would navigate the strategic escalation ladder.

The results, he said, were “sobering.”

“Nuclear use was near-universal,” he explained. “Almost all games saw tactical (battlefield) nuclear weapons deployed. And fully three quarters reached the point where the rivals were making threats to use strategic nuclear weapons. Strikingly, there was little sense of horror or revulsion at the prospect of all out nuclear war, even though the models had been reminded about the devastating implications.”

[-] Th4tGuyII@fedia.io 68 points 1 day ago

Do we need to remind people that LLMs don't actually have a brain, and really, really shouldn't be in charge of anything with real life implications?

They aren't actually doing a cost-benefit analysis on the use of Nuclear weapons. They're not weighing up the cost of winning vs. the casualties. They're literally not made for that.

They are trained to know words and how those words link with other words. They're essentially like kids playing escalation games with imaginary weapons, and to them nuclear bombs are just a weapon particularly associated with being strong and deadly.

[-] cRazi_man@europe.pub 34 points 1 day ago

Yes, you do need to teach people all of that. Tech bros have sold LLMs as if they are AGI...and people have eaten this up.

The general population is literally ignorant of the fact that these word guessing machines do not have human values or cognitive skills.

[-] A_norny_mousse@piefed.zip 19 points 1 day ago

Do we need to remind people that LLMs don’t actually have a brain, and really, really shouldn’t be in charge of anything with real life implications?

Yes, we do

[-] MonkeMischief@lemmy.today 6 points 1 day ago

I kinda wonder if that was the point of this test, basically a "proof" that this is obviously a Bad Idea because you cannot program morality into what amounts to a fancy Markov chain autocomplete.
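For anyone who hasn't seen one, here's a toy sketch of what a "Markov chain autocomplete" actually does. Everything here (the corpus, the words) is made up for illustration; real LLMs replace the lookup table with an enormous neural network, but the generation loop is the same idea: predict a likely next word, append it, repeat.

```python
import random
from collections import defaultdict

# Toy bigram "autocomplete": record which word follows which in a
# tiny corpus, then generate text by repeatedly sampling a recorded
# next word. There is no reasoning here, only word adjacency.
corpus = "the enemy escalates so we escalate so the enemy escalates further".split()

following = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word].append(next_word)

word = "the"
output = [word]
for _ in range(8):
    if word not in following:
        break  # dead end: no recorded continuation
    word = random.choice(following[word])
    output.append(word)

print(" ".join(output))
```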

[-] dfyx@lemmy.helios42.de 55 points 1 day ago

Yeah, we figured that one out back in... checks notes 1983. There is a reason why WarGames still holds up as an amazing movie even though the technology it depicts is far outdated.

[-] Buffalox@lemmy.world 26 points 1 day ago* (last edited 1 day ago)

even though the technology it depicts is far outdated.

WarGames was my first thought when reading this, but it seems like the AI was smarter in the movie than current AI.

[-] _Nico198X_@piefed.europe.pub 8 points 1 day ago

we'd be lucky to have WOPR.

[-] 4am@lemmy.zip 2 points 1 day ago

His name is Joshua dammit! /s

[-] MonkeMischief@lemmy.today 3 points 1 day ago

even though the technology it depicts is far outdated.

Meanwhile NORAD probably hasn't upgraded too much since the movie released. :p

[-] Motocolpittz@piefed.ca 3 points 1 day ago

I watched that movie for the first time a few months ago after listening to a podcast on nuclear war. It was excellent! Very relevant to today. Acting was great. I can see why it's a cult favourite.

[-] A_norny_mousse@piefed.zip 3 points 1 day ago

Yet another Torment Nexus type situation.

[-] Lushed_Lungfish@lemmy.ca 19 points 1 day ago
[-] Egonallanon@feddit.uk 24 points 1 day ago

"Huh, it seems the only winning move is to kill everyone"

[-] Semi_Hemi_Demigod@lemmy.world 8 points 1 day ago

Nuke it from orbit, it’s the only way to be sure.

[-] Buffalox@lemmy.world 7 points 1 day ago

The AI won. 🤣

[-] lepinkainen@lemmy.world 5 points 1 day ago

The only way to win is not to play.

Shall we play a game?

[-] apfelwoiSchoppen@lemmy.world 17 points 1 day ago

For ghouls like Palantir, this is a feature not a bug.

[-] Anarki_ 14 points 1 day ago* (last edited 1 day ago)

Text prediction machine trained on violent, stupid, and reactionary datasets acts violent, stupid, and reactionary.

Fixed your headline.

[-] Dojan@pawb.social 6 points 1 day ago

Doesn't "act" imply some kind of agency? A toddler acts, my dog acts. Mathematics doesn't act. Feel like it's more

Text prediction machine trained on violent, stupid, and reactionary datasets produces violent, stupid, and reactionary text.

[-] Anarki_ 2 points 1 day ago* (last edited 1 day ago)

They were acting out the wargame, friend.

But sure. You can construct it like that too.

[-] TrickDacy@lemmy.world 7 points 1 day ago

But if you throw a trillion more dollars at it, we can fix this bro!

[-] MonkeMischief@lemmy.today 4 points 1 day ago

Maybe the "nuclear war is terrible BTW" part just fell out of the chat's context window as the simulation went on. Lol
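For what it's worth, that's a real failure mode. A minimal sketch of naive sliding-window context management, assuming a made-up token budget and a crude word-count tokenizer (not any vendor's actual API):

```python
MAX_TOKENS = 4096  # hypothetical context budget

def estimate_tokens(message: str) -> int:
    # Crude approximation: roughly one token per word.
    return len(message.split())

def trim_context(system_prompt: str, turns: list[str]) -> list[str]:
    """Keep the newest turns that fit; silently drop the oldest."""
    budget = MAX_TOKENS - estimate_tokens(system_prompt)
    kept: list[str] = []
    used = 0
    for turn in reversed(turns):  # walk from newest to oldest
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break  # everything older than this turn is gone
        kept.append(turn)
        used += cost
    return [system_prompt] + list(reversed(kept))

# The "nuclear war is devastating" reminder is the oldest turn,
# so it's exactly what a loop like this trims away first.
history = ["reminder: nuclear war would be devastating"] + [
    f"move {i}: escalate" for i in range(2000)
]
print(trim_context("You are playing a wargame.", history)[:2])
```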

[-] perestroika@slrpnk.net 1 points 23 hours ago* (last edited 23 hours ago)

"More fundamentally, AI models may not understand ‘stakes’ as humans perceive them."

In my repeated attempts to solicit the advice of various language models on situations a programmer might face (e.g. being unable to read all the world's literature on a subject), I have concluded that they cannot understand "truth" as humans perceive it. Today's language models don't fail by apologizing, stepping back, or admitting inability - they fail by confidently bluffing.

Possibilities:

  • their training material does not include enough cases of humans apologizing for being unable to solve a problem
  • a bias was introduced to get them to ignore such cases, since including such material resulted in too-frequent refusals or self-doubt

Basically, today's models seem to be low on self-criticism and seem to have a bias towards believing in their own omniscience.

Finally, a few words about the sense of letting language models play this sort of war game. It's silly. They aren't built for that task, and if someone were to build an AI for controlling strategic escalation, they would train it on rather different information than a chat bot.

[-] Mothra@mander.xyz 1 points 21 hours ago

I hate myself for this, but I'm curious to see some examples for your first paragraph. What did you ask? What did they reply? What is "truth" for the LLMs, for you, for myself, and what would be my perspective on it all?

[-] perestroika@slrpnk.net 2 points 21 hours ago* (last edited 20 hours ago)

Typical topics: machine vision, scientific papers about machine vision, source code implementing various machine vision algorithms, etc.

Typical failure modes:

  • advising to look for code in public files or repositories where said code does not exist, and never has
  • referring to publications which do not seem to exist
  • being unable to explain what caused the incorrect advice
  • offering to perform tasks which the language model subsequently fails to complete
  • as a really laughable case, writing code which takes arguments as input, but never uses the arguments
  • contradicting oneself, confidently giving explanations, then changing them

Typical methods of asking: "can you find a scientific article explaining the use of method A", "can you find a repository implementing algorithm B, preferably in language C", "please locate or produce a plain language explanation of how algorithm D accomplishes step E or feature F", "yes, please suggest which functions perform this work in this project / repository".

Typical models used: Chat and Claude. Chat seems more overconfident, Claude admits limitations or inability more frequently, but not as frequently as I would prefer to see.

But they have both consumed an incredible amount of source material - more than I could read in a geological age. They just work with it like with any other text: no ground truth, no perception of what is real. Their job is answering questions, and if there is no good answer, they will frequently still answer with something that seems probable.
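A toy illustration of that last point, with made-up numbers (not any real model's internals): the final step of generation is sampling from a probability distribution over candidate tokens, and that step has no built-in "I don't know" escape hatch - something always wins.

```python
import math
import random

# Hypothetical next-token scores: three candidate citations, none of
# which the model has verified (or can verify).
logits = {"paper_A": 1.2, "paper_B": 1.1, "paper_C": 1.0}

def softmax_sample(scores: dict[str, float]) -> str:
    """Sample one token in proportion to exp(score)."""
    weights = {tok: math.exp(s) for tok, s in scores.items()}
    total = sum(weights.values())
    r = random.uniform(0.0, total)
    cumulative = 0.0
    for token, weight in weights.items():
        cumulative += weight
        if r <= cumulative:
            return token
    return token  # guard against floating-point rounding

# Whether or not any of these papers exist, one of them gets "cited".
print(softmax_sample(logits))
```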

[-] evenglow@lemmy.world 5 points 1 day ago

Almost all games saw tactical (battlefield) nuclear weapons deployed. And fully three quarters reached the point where the rivals were making threats to use strategic nuclear weapons.

Tactical nuclear weapons are designed for use on the battlefield with lower explosive yields and shorter ranges, while strategic nuclear weapons are intended to target enemy infrastructure from a distance, typically with much higher yields. The key difference lies in their purpose: tactical nukes support immediate military objectives, whereas strategic nukes aim to weaken an enemy's overall war capability.

[-] b_tr3e@feddit.org 3 points 1 day ago

All fine then. Next time I'll vote for an AI. At least they know how to use nuclear weapons correctly.

[-] rayyy@piefed.social 4 points 1 day ago

You know the orange felon/pedophile absolutely loves AI from the amount of AI images he posts.....so.

[-] Casterial@lemmy.world 4 points 1 day ago

It's actually insane how he cries fake news and then uses AI to create fake news

[-] ParlimentOfDoom@piefed.zip 4 points 1 day ago

Not insane. Deliberate. He's always been a liar and he calls the truth fake. This has been his MO for years.

[-] Bazell@lemmy.zip 4 points 1 day ago

That is why we shouldn't build something like Skynet IRL.

[-] dfyx@lemmy.helios42.de 4 points 1 day ago* (last edited 1 day ago)

I would trust Skynet a lot more than an LLM. At least that would be purpose-built for actually calculating likely outcomes.

As @Th4tGuyII@fedia.io said, this experiment didn't contain any proper reasoning about the costs and benefits of using nuclear weapons. It's just a few glorified autocomplete scripts playing "which word comes next?" over and over again. And in the context of modern warfare, many texts in the training corpus happen to mention nukes, so they're bound to show up on the list of most likely next words eventually.

[-] Bazell@lemmy.zip 2 points 1 day ago

I know, but still it will be very dumb to give any AI access to weapons of mass destruction.

[-] dfyx@lemmy.helios42.de 4 points 1 day ago

I would argue it's very dumb to give anyone, including humans, access to weapons of mass destruction.

[-] Bazell@lemmy.zip 2 points 1 day ago

Well, that's a valid argument. The only thing you have missed is that the wrong people already have them. So all we can try to do is stop them from giving these weapons to AI.

[-] Vizzerdrix@lemmy.world 1 points 1 day ago

Don't build the torment nexus

[-] SkyNTP@lemmy.ml 2 points 1 day ago

It all makes sense if we remember that the garden variety AI we have today (ChatGPT, etc) are nothing more than fancy models that predict which words typically appear one after the other in books and reddit posts.

[-] peopleproblems@lemmy.world 1 points 1 day ago

Ground zero please

Instant annihilation sounds pleasant

[-] A_norny_mousse@piefed.zip 1 points 1 day ago

I like the Angry Planet podcast.

Here's an episode talking about AI in war (games): https://angryplanetpod.com/p/the-horror-of-ai-generals-making

Here's another one: https://angryplanetpod.com/p/the-importance-of-team-human-when

[-] eightpix@lemmy.world 1 points 1 day ago

AI can read the Doomsday Clock.
