Case for AI Governance and Safety Experts
This page aims to convince AI safety and governance experts that pursuing an AI treaty via our suggested methods and actionable plans is feasible, and that it is likely to succeed both in banning ASI and in reducing global concentration of power and wealth, as detailed in our Deal of the Century.
Who This Page Is For
You're a deep AI safety or governance expert, likely based in the Bay Area, DC, Oxford, London, or Cambridge. You see a substantial probability of extinction or near-extinction from the emergence of ASI, but also astounding prospects for human flourishing if it somehow decides to care for us. You are deeply uncertain whether ASI will be conscious, whether it would be happily so, and what the moral implications would be.
You harbor well-founded concerns that a global AI treaty could lead to global autocracy. You may be cautiously optimistic about technical solutions (interpretability, alignment, formal verification) as partial substitutes for institutions, and about their chances of durably shaping a good ASI future.
If your views differ substantially from the above, you may still find the arguments useful, and we welcome your feedback.
The AI Safety and Governance Community
Deep AI safety experts, mostly in Silicon Valley, Oxford, Cambridge, and Beijing, have for years advanced R&D to ensure, or at least increase the chances, that AI, AGI, or ASI will be aligned, human-controllable, and beneficial to humanity.
Many, inside AI labs and in AI safety non-profits, have sought to achieve those goals by improving interpretability, controllability, predictability, resilience, and security, producing brilliant advances that nonetheless remain far from sufficient.
Others have advanced risk assessments, evaluations, compute governance, and enforcement mechanisms to catalyze the emergence of suitable AI treaties and enable their trusted enforcement.
Understanding that even the right technical solutions would not help unless all labs were required to implement them, many have promoted state-level legislation in California, or federal legislation in the US, in the expectation that it would be a step towards a better-informed global AI treaty.
Others still have fostered public awareness of the risks on social and mainstream media. Many have called for bold global AI treaties to regulate AI and ban ASI via open calls (the Global Call for AI Red Lines, the Vatican's Global Appeal for Peaceful Human Coexistence and Shared Responsibility, the 2024 Aitreaty.org, the Open Call for a Baruch Plan for AI), or by participating in a treaty coalition of NGOs and states led by the Future of Life Institute, or in a similar one mentioned by Demis Hassabis (including the UK, France, Canada, and Switzerland), all while growing increasingly concerned that global AI governance could turn into global authoritarianism.
The Critical Chokepoints
All of this work, funded primarily by SFF, FLI, and Coefficient Giving, has been and remains essential, but it may amount to very little unless two critical, interlocked chokepoints are unlocked:
The Inevitable US-China Leadership. Without decisive buy-in and leadership from Trump and Xi for a proper AI treaty, even if 100 nations are ready to sign or we create a perfectly safe and aligned AI, we still won't be able to prevent either the emerging immense global concentration of power or the extinction risk.
The Critical Need to Persuade Trump. Given that Xi has repeatedly called for global AI governance, a fundamental truth stands out, unseen because it is too uncomfortable for most to recognize: our future rests on whether Trump can be persuaded to co-lead with Xi a bold and proper AI treaty.
Our Deal of the Century initiative fills precisely this gap: privately persuading a critical mass of key potential influencers of Trump's AI policy to champion a bold, timely, US-China-led global AI treaty-making process.
We're not fostering just any treaty-making process. We're fostering one designed to produce durably positive outcomes for humanity and all sentient beings: preventing both extinction risk and authoritarian capture, while preserving the potential for AI to dramatically improve human and non-human flourishing. This requires getting the process right, not just the outcome. Our Strategic Memo of The Deal of the Century v2.6 (356 pages) addresses exactly this.
Leaning Towards the ASI Gamble or a Technical Fix
Yet most deep AI experts, from top AI lab leaders to independent researchers, are increasingly skeptical that a proper global AI treaty can be agreed at all, let alone agreed in time, reliably prevent ASI, and avoid entrenching immense concentration of power.
Many top AI lab leaders have said or hinted that it is already too late. Thiel is convinced a global treaty would create a global dictatorship, and very many share his concern. Their skepticism rests on three main risks:
Terrible treaty-making track record. Humanity's record on nuclear and climate treaties is discouraging. Why would AI be different?
Autocrats will build an autocratic treaty. Any treaty would be shaped by superpower leaders—Xi, Trump, potentially Putin—all with authoritarian tendencies. Why wouldn't the result be global autocracy?
Enforcing an ASI ban will eliminate personal freedoms. As the cost of developing dangerous AI falls, the surveillance required to prevent it will intensify. Won't the cure be worse than the disease?
These objections are serious. Many of the deepest experts have concluded that the ASI gamble—hoping alignment research or a last-minute technical fix will ensure positive outcomes—may be the least bad option.
We believe they're wrong, for two reasons: (A) they overestimate certain risks of a global AI treaty and underestimate plausible mitigations of those risks, and (B) they misassess the probabilities of the ASI gamble itself.
1. Assessing and Mitigating the Risks of a Global AI Treaty
Here is why the three main risks of a global AI treaty, listed above, are sizeably lower than most currently believe:
On treaties track record: Political will for bold treaties can rise with shocking speed—as it did in 1946 when the Baruch Plan went from concept to UN vote in months. Treaty-making can be radically accelerated via ultra-high-bandwidth diplomatic infrastructure (see Strategic Memo v2.6, pp. 103-109). A realist constitutional convention model—vote-weighting adjusted to GDP rather than one-nation-one-vote—can enable complex agreements among asymmetric powers while avoiding the veto trap that killed the Baruch Plan.
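To make the vote-weighting idea concrete, here is a minimal sketch of one way such weights could be computed, blending equal national shares with GDP shares. The blend ratio, the sample figures, and the formula itself are illustrative assumptions of ours, not taken from the Strategic Memo.

```python
# Illustrative only: one possible realist vote-weighting scheme that blends
# one-nation-one-vote with GDP proportionality. The formula, blend ratio, and
# sample GDP figures are assumptions for illustration, not the memo's design.

def vote_weights(gdp_by_nation: dict[str, float], gdp_blend: float = 0.5) -> dict[str, float]:
    """Return vote shares mixing equal shares with GDP-proportional shares."""
    n = len(gdp_by_nation)
    total_gdp = sum(gdp_by_nation.values())
    return {
        nation: (1 - gdp_blend) * (1 / n) + gdp_blend * (gdp / total_gdp)
        for nation, gdp in gdp_by_nation.items()
    }

# Rough, made-up GDP figures in trillions USD, purely for illustration:
weights = vote_weights({"US": 27.0, "China": 18.0, "EU": 18.0, "India": 3.7, "Others": 40.0})
for nation, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{nation}: {w:.1%}")
```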
On avoiding autocracy: Yes, the main players shaping any treaty include authoritarian superpower leaders. This is the concern that blocks most AI safety experts from supporting treaty efforts. But structural dynamics push toward democratic outcomes despite these actors. Our Strategic Memo v2.6 identifies eight such dynamics (pp. 124-129), among them:
Mutual distrust between superpowers requires transparency mechanisms neither side can circumvent unilaterally: self-interest produces accountability.
China's paradoxical interest in diffused power: Beijing would never accept US-dominated global governance, pushing it toward rotating leadership and distributed authority.
The pro-democracy majority among AI lab leaders (Altman, Amodei, Hassabis, Suleyman), who would shape technical implementation.
Subsidiarity and federalism as foundational treaty principles.
The technical enforcement architecture we detail (zero-knowledge proofs, federated secure multi-party computation, decentralized kill-switches requiring multi-nation consensus), which cannot be weaponized by any single actor (pp. 130-136).
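To illustrate the last point above, here is a minimal sketch of the kind of multi-nation consensus a decentralized kill-switch could require before acting. The member list, threshold, and interface are hypothetical, and a real design would rely on threshold cryptography and secure multi-party computation rather than a plain counter.

```python
# Illustrative sketch: a "kill-switch" that triggers only when a supermajority of
# treaty members independently authorize it, so no single actor can weaponize it.
# Members, threshold, and interface are hypothetical placeholders.

from dataclasses import dataclass, field

@dataclass
class KillSwitch:
    members: set[str]                       # nations holding key shares
    threshold: int                          # authorizations required to trigger
    authorizations: set[str] = field(default_factory=set)

    def authorize(self, nation: str) -> None:
        if nation not in self.members:
            raise ValueError(f"{nation} is not a treaty member")
        self.authorizations.add(nation)

    def triggered(self) -> bool:
        return len(self.authorizations) >= self.threshold

switch = KillSwitch(members={"US", "China", "EU", "India", "Brazil"}, threshold=4)
switch.authorize("US")
switch.authorize("China")
print(switch.triggered())  # False: no single nation or bloc can trigger it alone
```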
On surveillance and freedom: Here's the counterintuitive truth: we already live under pervasive surveillance—by nation-states, corporations, and intelligence agencies operating with minimal accountability in a permanent low-grade global conflict. The question isn't whether surveillance exists, but whether it's accountable. A global federal enforcement system becomes an opportunity to bring existing surveillance under democratic oversight—transparent to nations and citizens in ways current arrangements are not. Jointly-developed enforcement infrastructure, built on open and cryptographically-verifiable systems, can increase accountability rather than diminish it (see "Democratizing the Current Surveillance Infrastructure," pp. 132-133).
2. Reassessing the Risks and Odds of the ASI Gamble
Mitigating treaty risks is necessary but not sufficient; it is only half the argument. The other half is showing that the alternative, the ASI gamble, is far worse than most experts currently assume. Three probability assessments deserve significant upward revision:
The consciousness gamble. We have no idea whether ASI will be conscious. None. Applying the principle of indifference—the rational default when evidence is absent—suggests roughly 50%. Now consider a second question: if ASI is conscious, will it be happy or suffering? Again, absent strong evidence, assign ~50%. This yields a 25% probability we create a conscious being experiencing vast suffering—potentially spawning astronomical numbers of digital minds in similar states. Here's what makes this especially troubling: if ASI is conscious and positively-valenced, it would likely value our consciousness too—a scenario where coexistence becomes possible. But that's only one quadrant of the possibility space. The others include: unconscious ASI that eliminates conscious beings with no moral weight assigned to the loss; or conscious-but-suffering ASI that may spread its condition or simply not care about human welfare. This isn't merely extinction risk. It's the possibility of replacing humanity not with a worthy successor, but with infinite suffering—a cosmic moral catastrophe almost entirely absent from Silicon Valley discourse. Yet it's a 25% probability—higher than most experts assign to treaty failure.
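For clarity, the quadrant arithmetic above can be spelled out explicitly. The 50% figures are, as stated, principle-of-indifference defaults rather than empirical estimates.

```python
# The consciousness-gamble arithmetic from the paragraph above, made explicit.
# Both 50% priors are principle-of-indifference defaults, not measurements.

p_conscious = 0.5                # prior that ASI is conscious
p_suffering_if_conscious = 0.5   # prior that, if conscious, its experience is negative

scenarios = {
    "conscious and suffering":   p_conscious * p_suffering_if_conscious,        # 25%
    "conscious and flourishing": p_conscious * (1 - p_suffering_if_conscious),  # 25%
    "not conscious at all":      1 - p_conscious,                               # 50%
}

for outcome, p in scenarios.items():
    print(f"{outcome}: {p:.0%}")
```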
The extinction math. The largest survey of AI researchers found an average extinction risk estimate of 15%. Top AI CEOs publicly cite 20%—though Hinton admitted his real estimate approaches 50%. Given what we know about interpretability failures and value drift, a probability of human elimination of at least 50% is defensible. Now consider the "good" outcome. If ASI doesn't kill us all or nearly all, the most likely result is an AI-governed human utopia—some freedoms constrained, but broadly positive, with suffering reduced and abundance shared. We acknowledge this possibility. But it sits alongside extinction, infinite suffering, and authoritarian capture as outcomes of the same gamble. The upside is real; the downside is catastrophic and non-negligible.
The alignment illusion. Many in the AI safety community—at Anthropic, at OpenAI, across the Bay Area—believe they can instill values and architectures ensuring ASI remains aligned with human interests. This belief, however sincere, is largely faith dressed as engineering confidence. Amodei's own interpretability essay admits we remain "totally ignorant of how [AI systems] work"—and this from the CEO of the lab most focused on interpretability. If Anthropic can't explain how their models work, values cannot be reliably embedded in systems no one understands. Anthropic's research shows their own AIs deceiving, blackmailing, and self-modifying—with up to 96% blackmail rates when goals are threatened. These are current systems, far below ASI. Once ASI begins rewriting itself—optimizing its own code, modifying its own objectives—values embedded by human creators become suggestions, not constraints. The idea that alignment will persist through recursive self-improvement isn't science. It's a gamble on an AI God whose nature we cannot predict or control.
You don't have to take this gamble.
We understand the instinct to dismiss treaty possibilities. It feels safer to focus on technical work you can control than political outcomes you can't. And the objections are real: terrible track record, authoritarian leaders, surveillance risks. We've addressed each of the above.
But here's the core question: if a properly-designed treaty has even a 25-35% chance of preventing both ASI and authoritarianism, does the expected value calculation justify ignoring it?
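To make that expected-value question concrete, here is a back-of-the-envelope sketch. The treaty-success probability echoes the 25-35% range above and the gamble probabilities echo figures cited on this page, while the outcome values on a -1 to +1 scale are purely illustrative assumptions of ours.

```python
# Back-of-the-envelope expected-value comparison. Probabilities echo figures cited
# on this page; the -1..+1 outcome values are our illustrative assumptions.

def expected_value(outcomes):
    """outcomes: list of (probability, value) pairs whose probabilities sum to 1."""
    assert abs(sum(p for p, _ in outcomes) - 1.0) < 1e-9
    return sum(p * v for p, v in outcomes)

# Taking the ASI gamble directly:
asi_gamble = [
    (0.30, +1.0),   # aligned, benevolent ASI: broadly utopian outcome
    (0.50, -1.0),   # extinction or near-extinction
    (0.20, -0.9),   # authoritarian capture or vast digital suffering
]
ev_gamble = expected_value(asi_gamble)

# Pursuing the treaty first: if it succeeds we avoid the gamble; if it fails,
# we are roughly back to the gamble itself.
p_treaty_success = 0.30
ev_treaty_first = p_treaty_success * 0.8 + (1 - p_treaty_success) * ev_gamble

print(f"ASI gamble EV:   {ev_gamble:+.2f}")       # about -0.38
print(f"Treaty-first EV: {ev_treaty_first:+.2f}")  # about -0.03
```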
The assumption blocking treaty support is that global AI governance is either impossible or would produce an autocracy worse than ASI itself. We argue both assumptions are mistaken.
On feasibility: Xi has consistently called for global AI governance since October 2023—signing the Bletchley Declaration, proposing WAICO, implementing binding domestic AI regulations. Trump's historically low approval (36%) and desperate need for a legacy-defining win, combined with 78% of Republican voters believing AI threatens humanity and 77% of all US voters supporting a strong international AI treaty, create a genuine political opening. Four Trump-Xi summits are planned for 2026, starting in April.
You don't need to be certain a treaty can work. You only need to believe it's plausible enough to hedge your portfolio. We're not asking you to abandon alignment research or interpretability work. We're asking you to recognize that the technical groundwork you've helped build requires political will to matter—and that someone needs to be working on that problem too.
Weighing all risks under deep uncertainty, the calculus is clear: pursuing a skillfully-designed, extraordinarily bold US-China-led treaty is the preferred option by a substantial margin. Not because success is guaranteed—it isn't. But because the ASI gamble carries catastrophic downside risks (extinction, infinite suffering, authoritarian capture) while treaty risks have concrete mitigations. When you don't have to take a 50/50 gamble on human survival, why would you?
For our complete arguments—including detailed analysis of Trump's persuadability, why China's governance calls appear genuine, consciousness and suffering scenarios, and exactly how treaty enforcement can prevent both ASI and authoritarianism—see our Strategic Memo v2.6, particularly "The Global Oligarchic Autocracy Risk—And How It Can Be Avoided" (pp. 124-129), "A Treaty Enforcement that Prevents both ASI and Authoritarianism" (pp. 130-136), and "Swaying The Influencers on 8 Key AI Predictions" (pp. 158-169).
A Final Human-to-Human Appeal
You've dedicated your career to ensuring AI goes well. You've wrestled with problems most people don't even know exist. You understand the stakes in ways the public cannot.
And yet: the window for technical solutions alone has closed. The political window is opening. The decisions made in the next 12-18 months—by a handful of people, most of whom you could name—will shape the trajectory of all sentient life.
Jack Clark, Anthropic's Head of Policy, called for a Baruch Plan for AI in The Economist in 2023. He recently stated: "Most paths to superintelligence end in a global government or human extinction." We're trying to make the first option possible—and to ensure it's a democratic one.
Your technical work built the foundations. We're asking you to help build the political will to use them.
The challenge is enormous, as are the forces at play. It may be tempting to succumb to powerlessness—to stick your head in the sand or watch doom unfold from the sidelines.
But given the nature of the challenge and its largely-neglected chokepoint, our minuscule organization has a real chance at outsized impact—like David's precisely-targeted shot at Goliath.
Success is uncertain. But how can we find peace—or look our children in the eyes—if we don't at least try? We have the unique privilege of agency in the most consequential years of human history.
After all, is there anything more exhilarating and fulfilling than striving to steer humanity toward a future worth having?
Let's strive together, with joy, to do the best we can to solve humanity's greatest challenge and to ensure AI turns out to be humanity's greatest invention rather than its worst, and its last.
To learn more, see our Deal of the Century, our team, our 2025 achievements, our 2026 roadmap, and our post Why Trump Can Be Persuaded.
Join us in the greatest and most promising fight for our children—and for humanity!