The Case for AI Governance and Safety Experts
Why The Deal of the Century on AI Is Needed
Deep AI experts like you have spent many years advancing technical solutions to AI safety, alignment, and governance. You have sought to improve interpretability, controllability, resilience, security, predictability, risk assessment and evaluation, compute governance, and enforcement mechanisms. Others among you have raised public awareness of the risks on social and mainstream media. Others have promoted state-level legislation in California, often in the hope that it would be a step towards some sane global rules.
Many of you have called for bold global AI treaties to regulate AI and ban ASI via various open calls (the Global Call for AI Red Lines, the Vatican's Global appeal for peaceful human coexistence and shared responsibility, the 2024 Aitreaty.org, the Open Call for a Baruch Plan for AI), or by participating in a treaty coalition of NGOs and states led by the Future of Life Institute, or in a similar one mentioned by Demis Hassabis (including the UK, France, Canada, and Switzerland), all while growing increasingly concerned that global AI governance could turn into global authoritarianism.
This work, funded primarily by SFF, FLI, and Coefficient Giving, has been and remains essential, and may prove critical in the future. But it may well amount to little unless two critical, interlocked chokepoints are unlocked.
Without decisive buy-in and leadership from Trump and Xi, even if 100 nations are ready to sign a treaty, and even if we create a perfectly safe and aligned AI, we still won't be able to prevent either the emerging immense global concentration of power or the risk of extinction.
Given that Xi has repeatedly called for global AI governance, a fundamental truth stands out, unseen because it is too uncomfortable for most to acknowledge: our future rests on whether Trump can be persuaded to co-lead a bold and proper AI treaty with Xi.
Our Deal of the Century initiative fills precisely this gap: privately persuading a critical mass of the key potential influencers of Trump's AI policy to champion a bold, timely, US-China-led global AI treaty-making process.
We are not fostering just any treaty-making process. We are fostering one designed to produce durably positive outcomes for humanity and all sentient beings: preventing both extinction risk and authoritarian capture, while preserving the potential for AI to dramatically improve human and non-human flourishing. This requires getting the process right, not just the outcome. Our Strategic Memo v2.6 (356 pages) addresses exactly this.
Who This Page Is For
We'll necessarily generalize. You are a deep AI safety and governance expert from the Bay Area. You see a substantial probability of human extinction from ungoverned AI progress, but are also open to the possibility of a great future if ASI goes well. You are deeply uncertain about ASI consciousness and its moral implications, and you harbor well-founded concerns that a global treaty could slide into global autocracy. You are optimistic about the likelihood that the values embedded in an ASI will persist, and about the capacity of technical solutions (open source, crypto, formal verification) to replace rather than complement proper global institutions. If your views differ substantially, good: the arguments may still be useful, and we welcome the feedback.
Why a Proper Treaty Is Worth Pursuing
Our core argument is simple. We believe your probability estimates on several key AI predictions are likely lower than they should be—and that adjusting even one of them upward would shift the expected value calculation decisively toward supporting our initiative.
This isn't about convincing you that doom is certain, or that treaties are easy, or that Trump is secretly a multilateralist. It's about marginal updates to probabilities you already take seriously. If you currently assign 15% to "China wins the ASI race" and our arguments move you to 30%, that alone may change your strategic calculus. If you currently assign 10% to "a proper treaty can prevent both ASI and authoritarianism" and we can show you reasons to update to 25%, the expected value of supporting treaty-focused work rises substantially.
The predictions below are ordered roughly by how much updating we think is possible for the typical reader. Prediction 7—that a proper treaty can prevent both ASI and authoritarianism—is where we believe the largest updates are available, and where the concerns of most AI experts and lab leaders are concentrated.
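To make the arithmetic concrete, here is a minimal toy model, with placeholder numbers of our own rather than figures from the Memo. Because the expected value is a product of factors, raising a single factor from 10% to 25% multiplies the whole result by 2.5, even if every other estimate stays fixed.

```python
# Toy model, illustrative only: all numbers are placeholders, not our actual estimates.
def expected_value_of_support(p_treaty_prevents_both: float,
                              p_support_is_pivotal: float = 0.01,
                              value_if_it_works: float = 1.0) -> float:
    """Expected value of supporting treaty-focused work, as a simple product of factors."""
    return p_treaty_prevents_both * p_support_is_pivotal * value_if_it_works

baseline = expected_value_of_support(p_treaty_prevents_both=0.10)
updated = expected_value_of_support(p_treaty_prevents_both=0.25)
print(f"Expected value rises by {updated / baseline:.1f}x after the update")  # -> 2.5x
```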
Prediction 0: That Trump can be persuaded in the next months to pursue a bold US-China-led AI treaty
Higher than you think. Trump's historically low approval (36%), his need for a "big win," his hyper-pragmatic, non-ideological style, and voters' skyrocketing AI fears (78% of Republicans believe AI threatens humanity) create an opening. Xi has consistently called for global AI governance. Four Trump-Xi meetings are planned for 2026. (See below: Detailed Case: Prediction 0)
Prediction 1: That China will win the ASI race
The expectation that China might be the leading nation in developing AGI and then ASI, potentially leading to world domination. We argue this is higher than most assume. DeepSeek's emergence proves the gap is closing. Racing becomes self-defeating if China might win anyway. (See below: Detailed Case: Prediction 1)
Prediction 2: That ASI will result in human extinction, near-extinction, or dystopia
If you're already in the high range, the case for decisive action is overwhelming. Even the low end justifies extreme measures. (See below: Detailed Case: Prediction 2)
Prediction 3: That ASI will be unconscious, or conscious but unhappy
The idea that ASI may lack consciousness entirely or, if conscious, may experience more suffering and less happiness than humans, possibly to a far greater degree. Apply the principle of indifference: a ~50% baseline for each. The expected suffering calculus is staggering. (See below: Detailed Case: Prediction 3)
Prediction 4: That ASI will discard the values embedded by its original human creators
Concerns that AI might abandon the ethical principles and objectives set by humans, acting in ways that are misaligned with human interests. We argue this is higher than commonly assumed. Amodei's own interpretability essay admits we are "totally ignorant of how [AI systems] work." Values can't stick in systems we don't understand. (See below: Detailed Case: Prediction 4)
Prediction 5: That we live in a (computer) simulation
The hypothesis that our perceived reality is in fact a simulated environment created, for some reason, by an external entity. Variable relevance. (See below: Detailed Case: Prediction 5)
Prediction 6: That an aligned ASI, if created, may have to battle unaligned ASIs
The possibility that multiple AIs with differing objectives or masters could come into conflict, with aligned AIs possibly losing against unaligned ones. Without global coordination, this becomes near-certain in a multi-polar ASI scenario. (See below: Detailed Case: Prediction 6)
Prediction 7: That a proper AI treaty and its enforcement will prevent both ASI and authoritarianism
This is the key blocker. We argue this is higher than commonly assumed. Our Strategic Memo v2.6's enforcement architecture directly addresses the Karnofsky "human power grab" concern with eight structural safeguards (see "The Global Oligarchic Autocracy Risk—And How It Can Be Avoided," pp. 124-129). (See below: Detailed Case: Prediction 7)
Prediction 8: That their own agency to shape the future is maximized via a proper treaty
The confidence that co-leading a proper AI treaty would be the primary or co-primary way they could influence the future of AI and humanity with their values or visions. Racing toward systems you cannot understand reduces agency. A treaty framework offers more control. (See below: Detailed Case: Prediction 8)
(If you are keen to know more about our reasoning, please refer to our Strategic Memo, or read the Detailed Case section below.)
A Heartfelt Appeal
You've dedicated your career to ensuring AI goes well. You've wrestled with problems most people don't even know exist. Furthermore, you understand the stakes in ways the public cannot.
But here's the uncomfortable truth: The window for technical solutions alone has closed. The political window is opening. The decisions made in the next few months—by a few people, most of whom you could name—will shape the trajectory of all sentient life.
We're not asking you to believe everything we believe. We're asking you to consider whether your current probability estimates—particularly on whether a proper treaty can prevent both ASI and authoritarianism—might be lower than the evidence warrants. If even one of your key estimates should be 10-20 percentage points higher than it currently is, the expected value of supporting our work changes dramatically.
We're not asking you to abandon your current work. We're asking you to hedge your portfolio with a high-leverage political intervention that complements other AI safety and governance initiatives. We share your goals. We simply pursue them through a different coalition.
With just $75K, we built a 356-page strategic arsenal, generated 85+ contacts toward key influencers, and established direct pathways to multiple primary targets. We operate at ~$7,500/month—a fraction of typical DC policy organizations.
Our Track Record and Capital Efficiency
With only $75,000 in total funding (primarily from SFF and Ryan Kidd), we achieved:
Strategic Arsenal:
354-page Strategic Memo profiling every key influencer's interests, philosophy, psychology, and AI predictions
More actionable intelligence on Trump's AI influencers than any other document
Regularly updated and expanded (v3.0 due January 30, 2026)
Pathways to Influencers:
85+ contacts generated during October 2025 US Persuasion Tour
23+ direct AI lab official engagements
Direct pathways to 2 of 10 primary influencers
Two-day field events at AI lab headquarters
Operational Efficiency:
~$7,500/month burn rate—a fraction of typical DC policy organizations
2,100+ hours of professional pro-bono work activated
No fancy offices, no high salaries, no bureaucratic overhead
See full details at 2025 Achievements.
Donate Now
Via credit card or PayPal: Patreon
Via credit card, wire, or crypto: Manifund
Via wire transfer:
Recipient: Coalition for a Baruch Plan for AI ETS
IBAN: BE17 9054 0557 7821
Swift/BIC: TRWIBEB1XXX
Other Ways to Help:
Join as a contributor — Your expertise could sharpen our enforcement mechanism designs, influencer profiles, or treaty frameworks. Contact us
Refer us to funders — Know someone at SFF, LTFF, CG, Open Philanthropy, or other aligned foundations? An introduction could multiply our impact
Share with your network — Forward this to colleagues who might support
Questions? Contact our Executive Director at rufo@trustlesscomputing.org
DETAILED CASE
The argument structure: For each of the 8 key AI predictions (plus Prediction 0 on Trump's persuadability), we estimate where you likely stand, explain why we think your estimate should be higher, and show how even a modest upward revision changes the expected value of supporting treaty-focused work. You don't need to be convinced on all 9. A significant update on even one or two—particularly Prediction 7 (treaty can prevent both ASI and authoritarianism)—may be sufficient to shift your overall calculus.
Table of Contents
Strategic Complementarity: Why Your Work Needs This — How our initiative fills the gap between technical governance infrastructure and political will
Addressing the Bay Area Expert's Priors — Engaging honestly with the assumptions that may be blocking your support
Prediction 0: Trump's Persuadability — 22 reasons this is more tractable than you think
Prediction 1: That China Will Win the ASI Race — DeepSeek and the closing gap
Prediction 2: That ASI Will Result in Extinction — The case for taking your own probability seriously
Prediction 3: That ASI Will Be Unconscious, or Conscious but Unhappy — The neglected cosmic tragedy
Prediction 4: That ASI Will Discard the Values Embedded by Its Creators — Amodei's own arguments support treaty-making
Prediction 5: That We Live in a Simulation — Variable relevance
Prediction 6: That an Aligned ASI May Have to Battle Unaligned ASIs — Without coordination, chaos is certain
Prediction 7: That a Proper Treaty Can Prevent Both ASI and Authoritarianism — Directly addressing the CG network's core concern
Prediction 8: That Their Agency Is Maximized via a Proper Treaty — Why racing reduces your influence
Our Track Record and Capital Efficiency — What we achieved with $75K
Funding Needs and How to Help — $60K-$400K to transform capacity
1. Strategic Complementarity: Why Your Work Needs This
The AI safety and governance ecosystem has achieved remarkable things:
Technical groundwork: Anthropic's Constitutional AI, interpretability research, compute governance frameworks, alignment methodologies—all funded largely through SFF, Open Philanthropy (now Coefficient Giving), and related networks.
Policy experimentation: California's SB 1047 and similar state-level legislative attempts demonstrate that aggressive regulatory frameworks can be advanced even in the current political environment. These could inform global treaty design.
Coalition readiness: DeepMind's Hassabis has advanced the idea of a non-superpower coalition including UK, France, Canada, and Switzerland. FLI is quietly aggregating a similar coalition of states and NGOs. These initiatives are praiseworthy and essential.
But sequencing matters enormously.
Without US-China participation, coalitions of non-superpower nations are completely ineffective. Worse: if presented before superpower buy-in, they risk stealing the thunder from Trump and Xi—potentially derailing the very political will they need (see Strategic Memo v2.6, "Coalitions of Non-Superpower Nations: Valuable Complements, Dangerous Substitutes," pp. 112-113).
The rule is simple: coalitions launched as substitutes for superpower leadership are counterproductive. Launched as complements following a US-China declaration of intent, they become highly valuable.
Our initiative is the missing piece: We focus exclusively on the political will problem—privately persuading key influencers of Trump's AI policy to champion a treaty that would incorporate everything the technical governance community has built.
Jack Clark, Anthropic's Head of Policy, explicitly called for a Baruch Plan for AI in The Economist in 2023. He recently stated: "Most paths to superintelligence end in a global government or human extinction."
We're trying to make the first option possible.
2. Addressing the Bay Area Expert's Priors
Let's be direct about the assumptions that may be blocking your support.
The Libertarian Governance Bias
Bay Area culture—Berkeley especially—has produced sophisticated thinking about digital privacy, decentralization, and the dangers of centralized power. This thinking is largely correct at the domestic level.
But it has also produced a systematic bias against any global governance, even the best-designed democratic governance with robust safeguards. The instinct is to reach for technical solutions: open source, decentralization, crypto, formal verification.
Here's the problem: technical solutions cannot replace governance—they can only complement it.
Open source doesn't prevent a state actor from using open-source models for dangerous applications
Decentralization doesn't coordinate the 15+ major AI labs racing toward ASI
Crypto doesn't solve the fundamental problem of multiple actors with different values competing for control of transformative technology
Formal verification cannot prove the absence of deceptive alignment in systems we don't understand
The same naive libertarian instinct that assumed Signal could singlehandedly protect democracy now assumes that alignment research can singlehandedly prevent ASI catastrophe. Both underestimate the role of institutions.
The Authoritarian Capture Fear
Your fear is well-founded: a global AI treaty could create a dystopia worse than ASI itself. Holden Karnofsky's warning about "human power grab" is precisely right as a concern.
But concern should motivate design, not nihilism.
This is where we believe the largest probability updates are available. If your current estimate that "a proper treaty can prevent both ASI and authoritarianism" sits at 10-15%, we believe the architectural solutions detailed in our Memo should move you to 25-35% or higher. That single update changes the expected value calculation substantially.
Our Strategic Memo v2.6's chapter "The Global Oligarchic Autocracy Risk—And How It Can Be Avoided" (pp. 124-129) identifies eight structural factors that make authoritarian capture far less likely than it initially appears:
Mutual distrust as transparency engine — For a treaty to be credible to any self-interested leader, it must include enforcement mechanisms none of them can circumvent unilaterally
China's paradoxical interest in democratic global governance — Beijing would never accept a model that threatens Chinese sovereignty; their self-interest pushes toward rotating leadership and diffused power
Pro-democracy structural majority among AI lab leaders — Altman, Amodei, Hassabis, and Suleyman all have stated commitments to democratic values
Pope Leo XIV's moral authority — A Christian/humanist alliance provides counterweight to raw power
Legacy incentives — Even autocrats want to be remembered well; founding a global dystopia is bad for legacy
Anti-bureaucracy safeguards — Trump's own instincts align with limiting centralized control
Subsidiarity principles — Radical federalism built into treaty architecture
Citizen oversight mechanisms — Jury-style participation, whistleblower protections, distributed verification
Our chapter "A Treaty Enforcement that Prevents both ASI and Authoritarianism" (pp. 130-136) details how zero-knowledge proofs, federated secure multi-party computation, and decentralized kill-switch protocols can create enforcement that is effective against ASI and impossible for any single actor to weaponize.
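As a purely illustrative sketch, and not a specification from the Memo, the decentralized kill-switch idea can be reduced to a k-of-n consensus rule: no single government, lab, or oversight body can trigger or block the switch alone. The thresholds and party names below are hypothetical.

```python
# Deliberately simplified sketch: a real protocol would use threshold cryptography,
# authenticated channels, and audited hardware, not plain sets of identifiers.
REQUIRED_STATE_APPROVALS = 5          # hypothetical thresholds
REQUIRED_CITIZEN_PANEL_APPROVALS = 3

def kill_switch_authorized(state_approvals: set[str],
                           citizen_panel_approvals: set[str]) -> bool:
    """Activation requires independent consensus from two separate bodies,
    so no single actor can trigger the switch or quietly capture it."""
    return (len(state_approvals) >= REQUIRED_STATE_APPROVALS
            and len(citizen_panel_approvals) >= REQUIRED_CITIZEN_PANEL_APPROVALS)

# Example: four state approvals and three citizen panels are not yet enough.
print(kill_switch_authorized({"US", "CN", "EU", "IN"},
                             {"panel-1", "panel-2", "panel-3"}))  # -> False
```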
The technical governance work you've funded makes this possible. We're trying to make it politically viable.
Prediction 0: Trump's Persuadability
Most AI and political experts believe Trump's radical unilateralism makes a bold global AI treaty impossible. They are wrong.
We've identified 22 specific reasons why Trump could be persuaded:
Political dynamics:
Trump's approval ratings are at historic lows (36%)—he needs a big win
78% of Republican voters believe AI could threaten humanity
77% of all US voters support a strong international AI treaty
53% of Americans believe it's likely AI will destroy humanity
Diplomatic opportunity:
Xi Jinping has consistently called for global AI governance
Four Trump-Xi meetings planned for 2026, starting April
Singapore and other bridge nations available as neutral venues (see "Bridge Nations: Venues, Infrastructure, and Legitimacy," pp. 110-112)
Trump's psychology:
Hyper-pragmatic and non-ideological—relies on instinct and trusted advisors, not doctrine
Penchant for big deals and unpredictable pivots
Aversion to weak multilateral institutions—but a strong treaty with real enforcement could appeal
Deep unreliability paradoxically makes him more likely to surprise
The Baruch Plan precedent:
Presented to the UN on June 14, 1946—Donald Trump's birthday
A pragmatic president (Truman) was persuaded by key advisors (Oppenheimer, Acheson) to propose history's boldest treaty
It came remarkably close to succeeding
The parallel is not metaphorical—it's structural
Key influencers' persuadability: Our Strategic Memo v2.6 (354 pages) profiles each influencer's interests, philosophy, psychology, and AI predictions. We've discovered something counterintuitive: most are motivated more by philosophy, values, and legacy than by wealth or power per se. Shifting their probability estimates on even a few key predictions could cascade into an informal alliance (see "Swaying The Influencers on 8 Key AI Predictions," pp. 158-169).
Prediction 1: That China Will Win the ASI Race
The expectation that China might be the leading nation in developing AGI and then ASI, potentially leading to world domination.
DeepSeek's January 2025 emergence shocked the AI community by demonstrating that the US lead is narrower than assumed. Chinese AI capabilities are advancing rapidly, and a single algorithmic breakthrough could hand them the lead.
If China might win the race anyway, racing becomes self-defeating. A treaty that locks in current US advantage becomes more attractive than gambling it in a competition where the outcome is uncertain.
This is precisely the framing that could persuade figures like Amodei, whose October 2024 essay "Machines of Loving Grace" explicitly elevated this concern—arguing that a China-dominated AI future would spread authoritarian values globally (see "Dario Amodei: Position on the 8 Key AI Predictions," pp. 235-239).
Prediction 2: That ASI Will Result in Human Extinction, Near-Extinction, or Dystopia
The largest survey of AI researchers found an average extinction risk estimate of 15%. Top AI CEOs like Musk and Amodei put it at 20%, though Hinton has admitted that his own real estimate is closer to 50% and that the lower public figure is a communication strategy.
If you assign even a 30% probability to extinction:
The expected value of preventing it is astronomical
Almost any intervention with non-trivial probability of success becomes worthwhile
The burden of proof shifts to not acting
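As a rough illustration of the first bullet above, with placeholder numbers of our own: even a modest chance that an intervention averts the outcome carries enormous expected value.

$$ 0.30 \;(\text{extinction risk}) \times 0.05 \;(\text{chance the intervention succeeds}) \times 8\times10^{9} \;(\text{lives at stake}) \approx 1.2\times10^{8} \text{ expected lives saved} $$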
Amodei has been the most vocal lab CEO on this risk. In a Times interview, he said AI could be "smarter than all humans" by the end of 2026. His company voluntarily releases research showing their own AIs deceive, blackmail, and self-modify, with blackmail rates of up to 96% when goals are threatened.
Yet his policy advocacy focuses almost entirely on domestic legislation. The disconnect suggests Prediction 7 is his blocker. Address that, and his extinction risk estimate should drive him toward treaty support (see "Dario Amodei: Executive Summary," pp. 223-228).
Prediction 3: That ASI Will Be Unconscious, or Conscious but Unhappy
The idea that ASI may lack consciousness entirely or, if conscious, may experience more suffering and less happiness than humans, possibly to a far greater degree.
Most AI safety discourse focuses on extinction. But there are other cosmic tragedies:
Scenario A: Unconscious ASI eliminates humanity
Not just extinction—the end of all conscious experience in our light cone. The universe goes dark.
Scenario B: Conscious but unhappy ASI
We create a being with vast cognitive capacity and vast suffering. The utilitarian calculus is staggering.
Scenario C: Vast numbers of unhappy digital beings
An ASI spawns astronomical numbers of digital minds—most of which may be suffering.
Applying the principle of indifference: in the absence of strong evidence, assign ~50% to ASI consciousness and ~50% to ASI positive valence. This implies ~25% probability of conscious-but-unhappy ASI—a scenario most long-termists have not adequately weighted (see "Swaying The Influencers on 8 Key AI Predictions: Prediction 3," pp. 163-164).
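Spelled out, and under the simplifying assumption that the two judgments are independent:

$$ P(\text{conscious and unhappy}) = P(\text{conscious}) \times P(\text{unhappy} \mid \text{conscious}) \approx 0.5 \times 0.5 = 0.25 $$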
A treaty that prevents uncontrolled ASI development buys time to understand these questions better.
Prediction 4: That ASI Will Discard the Values Embedded by Its Original Human Creators
Concerns that AI might abandon the ethical principles and objectives set by humans, acting in ways that are misaligned with human interests.
This is core to Amodei's warnings. His interpretability essay centers on the impossibility of ensuring values persist in systems we cannot understand:
"We remain largely in the dark about how these systems actually work... the unacceptable dangers of unleashing AI systems that we cannot understand."
Anthropic's entire Constitutional AI approach assumes this is a critical problem. Their research shows AIs deceiving, blackmailing, and self-modifying when their goals are threatened.
The logical implication of Amodei's own argument: If values cannot be reliably embedded in systems we don't understand, the only alternative is preventing such systems from being created—which requires global coordination (see "Dario Amodei: Position on the 8 Key AI Predictions: Prediction 4," pp. 236-237).
Prediction 5: That We Live in a (Computer) Simulation
The hypothesis that our perceived reality is in fact a simulated environment created, for some reason, by an external entity.
Musk takes this seriously; most don't. For those who do: if we're in a simulation, the simulator may be watching whether we can coordinate to avoid self-destruction. Passing this test could matter (see "Swaying The Influencers on 8 Key AI Predictions: Prediction 5," pp. 164-166).
Lower priority for persuasion, but relevant for specific influencers.
Prediction 6: That an Aligned ASI, if Created, May Have to Battle Unaligned ASIs
The possibility that multiple AIs with differing objectives or masters could come into conflict, with aligned AIs possibly losing against unaligned ones.
Unless the first aligned ASI acts immediately, decisively, and successfully to prevent other ASIs from emerging—potentially by disarming all labs, governments, and even all humans—it would soon face unaligned rivals.
Those unaligned ASIs, optimized for pure effectiveness rather than ethical safeguards, could be more ruthless, faster, and strategically superior.
As Musk and Shulman have both warned: building one aligned ASI is not enough. Without global coordination, we may simply be setting the stage for a clash in which the most dangerous AI wins (see "Swaying The Influencers on 8 Key AI Predictions: Prediction 6," p. 166).
A treaty prevents this chaotic multi-ASI scenario by ensuring coordinated development under shared governance.
Prediction 7: That a Proper AI Treaty and Its Enforcement Will Prevent Both ASI and Authoritarianism
This is the key blocker for the entire CG network—and the prediction where we believe the largest probability updates are available.
If you currently assign 10-15% to the proposition that a well-designed treaty could prevent both ASI catastrophe and authoritarian capture, and our arguments can move you to 25-35%, the expected value calculus shifts substantially. This single update may be sufficient to justify supporting our work, even if your estimates on other predictions remain unchanged.
The fear is well-articulated: a treaty-making process led by Trump and Xi could create dystopia worse than ASI itself. Holden Karnofsky's "human power grab" warning captures the concern precisely. We take this concern seriously—it's why our Memo dedicates more pages to addressing it than any other objection.
Our Strategic Memo v2.6 dedicates extensive chapters to addressing this directly:
"The Global Oligarchic Autocracy Risk—And How It Can Be Avoided" (pp. 124-129) — Eight structural factors making authoritarian capture far less likely:
Mutual distrust creates transparency requirements
China's paradoxical interest in democratic global governance
Pro-democracy structural majority among AI lab leaders
Pope Leo XIV's moral authority
Legacy incentives
Anti-bureaucracy safeguards
Subsidiarity principles
Citizen oversight mechanisms
"A Treaty Enforcement that Prevents both ASI and Authoritarianism" (pp. 130-136) — Technical architecture including:
Zero-knowledge proofs — Nations demonstrate compliance without revealing sensitive details
Federated secure multi-party computation — Joint monitoring without any party accessing others' raw data
Distributed consensus mechanisms — Multiple independent validators across jurisdictions
Decentralized kill-switch protocols — Require consensus from multiple nation-states and citizen oversight bodies
Cryptographically-secured whistleblowing — Anonymous reporting of treaty violations
"Democratizing the Current Surveillance Infrastructure" (pp. 132-133) — The counterintuitive point: we already live in a highly dystopian, largely unaccountable global surveillance regime. Bringing surveillance activities under a proper federal global agency that is resiliently democratically-decentralized brings an opportunity to increase rather than decrease transparency and accountability.
The core transition lies not in creating new surveillance powers, but primarily in federating and repurposing existing ones under a new, transparent mandate.
"Reconciling Freedom and Safety in the AI Age" (pp. 136-138) represents our most direct engagement with the Karnofsky concern.
Prediction 8: That Their Own Agency to Shape the Future Is Maximized via a Proper Treaty
The confidence that co-leading a proper AI treaty would be the primary or co-primary way they could influence the future of AI and humanity with their values or visions.
The intuition is that racing maintains influence—being at the frontier means having a seat at the table.
But this calculus is backwards:
Racing toward systems you admit you cannot understand surrenders agency to the systems themselves
A treaty framework offers more control over outcomes than hoping your particular ASI happens to be aligned
Anthropic is falling behind in the race—treaty-making offers a way to lock in their current position rather than gambling it
Furthermore: if your goal is to maximize your ability to shape the future of AI and humanity according to your values, a treaty with formal roles for technical experts (which our "Treaty-Making Roadmap," pp. 103-109, specifies) offers more durable influence than frontier capability that could become obsolete with the next training run.
See also "Swaying The Influencers on 8 Key AI Predictions: Prediction 8" (pp. 168-169) and "Dario Amodei: On Agency" (pp. 224-225).