After the Pentagon-Anthropic Clash, the ASI Gamble No Longer Checks Out

By Rufo Guerreschi — March 4th, 2026


The Pentagon-Anthropic clash isn’t a one-off dispute — it's a preview of where the current trajectory leads: progressive nationalization and militarization of AI firms, the erosion of alignment work, and the elimination of AI lab leaders' ability to even try to steer the ASI race toward good outcomes. The feared "human power grab" isn't coming with a global AI treaty — it's already happening because we lack one.

Meanwhile, widespread assumptions that a global AI treaty is politically unfeasible, or bound to result in further global authoritarianism, may be largely misplaced or exaggerated. Overall, these considerations suggest that jointly pursuing and shaping a global AI treaty may be the best way for top AI lab leaders to maximize their agency to positively influence the future of humanity.


Many AI safety experts and leaders of top AI firms have been calling for strong global coordination and an AI treaty. But a central concern — especially among those very leaders — is that the necessary boldness of such a treaty, and the fact that it would inevitably need to be spearheaded by two "strongmen" superpower leaders, would lead to an immense and durable concentration of power.

This fear of an unaccountable global authority — often described as a "human power grab" risk by Anthropic's Holden Karnofsky, or as the risk of an Antichrist by Peter Thiel — frequently outweighs the perceived risks of losing control of AI, and the consequent risk of extinction, which both men acknowledge.

This fear is well-grounded — and it is leading a growing number of people in Silicon Valley, as Karnofsky among many others has acknowledged, to conclude that taking the superintelligence gamble is the least bad option. They reach this conclusion even though they assign significant probability to extinction, and only a low probability that whatever values they embed in an ASI will persist long-term, assuming they get there before others do. Karnofsky himself, one of the originators of the Responsible Scaling Policy (RSP) concept, published a detailed case for abandoning binding safety commitments on the same day Anthropic released its RSP v3.0.

The Pentagon-Anthropic crisis has changed this calculus fundamentally.

Last week, Trump ordered every federal agency to "immediately cease" using Anthropic's technology. Defense Secretary Hegseth designated the company a "Supply-Chain Risk to National Security" — a label previously reserved for foreign adversaries — because Anthropic refused to allow unrestricted military use of its AI for mass domestic surveillance and fully autonomous weapons. Hegseth had threatened to invoke the Defense Production Act, authority that legal scholars argue could extend to compelling the retraining of AI models themselves — stripping safety guardrails not just from contracts, but from training. 

Dean Ball, one of the primary authors of the White House AI Action Plan, called this "the quasi-nationalization of a frontier lab." Thomas Wright of Brookings summarized the message to Silicon Valley: if you disagree with the government, "we will either partially nationalize your company through the Defense Production Act, or we will try to blacklist and ruin your company."

A Human Power Grab Is Not Happening Because of an AI Treaty but Because of Its Absence

While those developments are surely a matter of concern, I'd argue that, taken overall, they show how the feared concentration of power is already occurring without a treaty, and primarily because we lack one.

This isn't just about one contract. AI lab leaders now face an unprecedented erosion of their agency on three fronts simultaneously:

First, values embedded in ASI will most likely not persist — even if developers embed ethical principles at the "seed AI" stage, the alignment research community widely acknowledges this problem is unsolved. Even OpenAI's alignment team has stated that no one should deploy superintelligent systems "without being able to robustly align and control them."

Second, even perfectly aligned AI can't prevent governments from deploying it for authoritarian ends — alignment solves for the AI's behavior, not for the uses to which aligned AI is put by those in power. 

Third, and most urgently, labs are increasingly unlikely to even get the chance to try proper alignment at all.

A DoD official told Axios the government could "force Anthropic to adapt its model without any safeguards." The chilling effect is already industry-wide: xAI, OpenAI, and Google all signed "any lawful use" deals within days.

This dynamic leads to the militarization of top AI firms — one of OpenAI's board members is a former director of the NSA — and makes their eventual de-facto functional nationalization increasingly justifiable, even inevitable, as Leopold Aschenbrenner has predicted. It could happen within months. Pressures like those we have started seeing from the Pentagon are bound to deeply compromise AI alignment work. As noted by Helen Toner, "One thing the Pentagon is very likely underestimating: how much Anthropic cares about what future Claudes will make of this situation."

In the absence of an AI treaty, the cascading effects are already visible in the erosion of voluntary safety commitments. On February 24 — the same day as the Pentagon's DPA threat — Anthropic released RSP v3.0, dropping its core pledge to pause model training if safety measures couldn't be guaranteed in advance. Anthropic Chief Science Officer Jared Kaplan told TIME, "We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments…if competitors are blazing ahead." This is precisely the dynamic a treaty would solve: in an ever more brutal race among nations and firms, unilateral safety commitments become untenable. Only binding multilateral commitments can hold.
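
The race dynamic Kaplan describes has the familiar structure of a prisoner's dilemma. As a stylized illustration (the payoffs below are my own, purely for exposition, not drawn from any cited source):

                 Rival pauses    Rival races
  You pause      (3, 3)          (0, 4)
  You race       (4, 0)          (1, 1)

Whatever the rival does, racing yields the higher payoff, so both sides race and end up at (1, 1), even though mutual pause at (3, 3) is better for both. A binding, verifiable treaty changes the game itself by making "pause" enforceable rather than unilateral, which is why voluntary commitments keep collapsing while multilateral ones can hold.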

It is precisely this erosion of voluntary commitments that makes the "ASI gamble" — long the default position in parts of Silicon Valley — finally untenable.

The ASI Gamble No Longer Checks Out

In 2023, the CEOs of every major US AI lab — Altman, Amodei, Hassabis — co-signed a statement that extinction risk from AI should be a global priority. Yet many in the EA and AI safety communities, including some of those same signatories, have since concluded that the ASI gamble is the least bad option — better a coin-flip than a treaty that risks concentrating power in an unaccountable global authority. We grapple with this dilemma daily and sympathize deeply with these concerns. Yet, such reasoning had major issues well before the Pentagon-Anthropic clash. 

This reasoning rests on a series of assumptions that, taken together, make the ASI gamble far riskier than most people realize. Consider just the first: that ASI won't lead to extinction. The largest survey of AI researchers found a mean estimate of ~14% for catastrophic outcomes; Hinton, Amodei, and Musk have individually placed it between 10% and 25%, with Hinton's unadjusted personal estimate exceeding 50%. Taken together, these figures imply anywhere from a 1-in-10 to a 1-in-2 chance of civilizational catastrophe, depending on whose judgment you weight.

But extinction isn't the only way the gamble fails. Even if ASI doesn't kill us, there remain profound open questions — each carrying substantial uncertainty — that compound the risk:

  • Will ASI retain the values its creators embed, after recursive self-improvement? As noted above, the alignment research community widely acknowledges that this problem is unsolved.

  • Could ASI be conscious? A growing body of expert opinion takes this seriously: a 2025 survey of 67 experts assigned a median 90% probability to digital minds being possible in principle. We genuinely don't know — and that uncertainty itself carries moral weight.

  • If conscious, would ASI flourish or suffer? Here we have no data at all — only the observation that the question is almost entirely ignored, which is itself alarming given the stakes.

The critical point: the ASI gamble doesn't hinge on any single question — it fails if any one of these breaks badly. You need survival and value persistence and a resolution to the consciousness question that doesn't produce cosmic-scale suffering. The more honest you are about the uncertainty on each, the worse the parlay looks. And the stakes are radically asymmetric: getting even one wrong could mean the permanent end — or permanent suffering — of conscious life as we know it. (See Strategic Memo v2.6, pp. 159–170.)
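
To make the parlay structure concrete, here is a purely illustrative calculation; the individual probabilities are placeholders of my own, not estimates drawn from the surveys cited above. Treating the three questions as roughly independent, the gamble's odds of paying off are the product of the odds on each:

  P(gamble pays off) = P(survival) × P(value persistence) × P(no cosmic-scale suffering)
                     ≈ 0.80 × 0.50 × 0.70
                     ≈ 0.28

Even under individually generous assumptions, the compounded chance of a good outcome falls well below a coin-flip, and it shrinks multiplicatively with every additional open question.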

A proper global AI treaty is the only mechanism that changes these odds — by preventing the race dynamics that make unilateral safety commitments untenable and by creating the governance infrastructure that no single company or country can build alone.

The question is how to get there — and who should move first. The stakes could hardly be higher: these are risks that 24 leading AI scientists, including Hinton, Bengio, Russell, and Kahneman, described in Science as requiring "urgent governance action."

What Should AI Lab Leaders Do Now?

A common objection is that any treaty bold enough to work would concentrate power dangerously. But a treaty led by rival superpower leaders would, counterintuitively, produce more decentralized governance — not less. Strongmen attached to their own sovereignty would never accept an overly empowered global authority.

Unlike treaties built on personal chemistry between leaders — Gorbachev and Reagan's “trust but verify” agreements proved neither sufficient nor durable precisely because they were premised on trust — the AI treaty we need demands "trust or verify": enforcement architectures based on zero-knowledge proofs, federated secure multi-party computation, and decentralized kill-switches that cannot be weaponized by any single actor. Such a treaty would also require the participation and oversight of a large majority of middle-power nations, further reducing the risk of a global duopoly. (See Strategic Memo v2.6, pp. 124–139)
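
To make "trust or verify" concrete, here is a minimal sketch of the underlying idea in Python, using a plain hash commitment; a real enforcement architecture would rely on zero-knowledge proofs and secure multi-party computation rather than bare hashes, and the report contents below are hypothetical:

    import hashlib
    import os

    def commit(report: bytes) -> tuple[bytes, bytes]:
        # The lab publishes the digest now, revealing nothing about the report.
        nonce = os.urandom(16)
        return hashlib.sha256(nonce + report).digest(), nonce

    def verify(digest: bytes, nonce: bytes, report: bytes) -> bool:
        # Later, anyone can check the revealed report against the published
        # digest, without ever having had to trust the lab's word.
        return hashlib.sha256(nonce + report).digest() == digest

    # Hypothetical usage: commit at training time, reveal at audit time.
    claim = b"training run X: compute cap respected, safeguards enabled"
    digest, nonce = commit(claim)
    assert verify(digest, nonce, claim)

The point is architectural, not cryptographic: verification replaces trust, so no single party's good faith is load-bearing.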

The treaty-making process itself matters as much as the outcome. It must use models that avoid the failures of past multilateral efforts: preventing capture by one or two nations, eliminating the veto, and delivering a competent, improvable treaty within a predictable timeframe. The constitutional convention model, inspired by the 1787 US Constitutional Convention and suggested by Sam Altman in 2023, is the most promising candidate — with voting initially weighted by GDP to secure US and China leadership while still preventing a global duopoly.
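
To see why GDP weighting need not produce a duopoly, a back-of-the-envelope illustration helps (the shares are approximate, based on recent IMF nominal-GDP figures, and are my own, not drawn from the Strategic Memo):

  US ≈ 26%  +  China ≈ 17%  →  combined ≈ 43% < 50%

The two leading powers would hold the largest voting blocs, but under any simple-majority rule they would still need a substantial coalition of middle powers to pass anything, which is precisely the structural check on a duopoly that the design requires.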

The first concrete step: even just two CEOs — from Anthropic, Google DeepMind, or OpenAI — could negotiate an assurance contract, as proposed by FLI's Anthony Aguirre: binding commitments to specific enforceable safeguards that activate only once a critical mass of other major players sign on. But to succeed, such a contract should also include a specific call to Trump to pursue a US-China-led AI treaty — framed in purely pragmatic terms as a means to future-proof American leadership, prevent Chinese dominance, and avert catastrophe for the world. (See our case for AI experts for the full argument.)
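
The activation logic at the heart of such an assurance contract is simple enough to sketch. What follows is a minimal illustration of the threshold mechanism only; the lab names and threshold are hypothetical, and a real contract would specify the safeguards, verification regime, and legal triggers that this sketch omits:

    from dataclasses import dataclass, field

    @dataclass
    class AssuranceContract:
        # Commitments bind only once a critical mass of players has signed.
        required_signatories: int
        signatories: set = field(default_factory=set)

        def sign(self, lab: str) -> None:
            self.signatories.add(lab)

        @property
        def active(self) -> bool:
            # Below the threshold nobody is bound; at or above it, all are.
            return len(self.signatories) >= self.required_signatories

    contract = AssuranceContract(required_signatories=3)
    contract.sign("Lab A")
    contract.sign("Lab B")
    print(contract.active)  # False: no one is yet bound
    contract.sign("Lab C")
    print(contract.active)  # True: critical mass reached, safeguards activate

This threshold structure removes the first-mover penalty: no lab is bound until enough of its rivals are bound with it.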

To generate the political will, these CEOs would need allies among other key potential influencers of Trump's AI policy — from Vance and Bannon to Pope Leo XIV, who has positioned AI as central to his papal mission, and whose main AI advisor Paolo Benanti has led a call for a bold AI treaty alongside top AI scientists. No other organization is primarily focused on generating political will among these ~12 key influencers to pursue a bold, timely, and proper global AI treaty — the AI governance field almost entirely neglects this chokepoint.

At the Coalition for a Baruch Plan for AI, we are working to build exactly this bridge — culminating in a closed-door Rome Convening on June 18–19, 2026, designed to catalyze a "humanist AI alliance" across conservative, techno-humanist, and post-humanist camps. The political window is real but narrow. Support our work, join as an advisor, or — the single highest-value contribution — introduce us to these influencers or their circles.

Rufo Guerreschi

I am a lifetime activist, entrepreneur, and researcher in the area of digital civil rights and leading-edge IT security and privacy – living between Zurich and Rome.

https://www.rufoguerreschi.com/