Anthropic calls for governments to gain the power to halt or delay the launch of the most dangerous AI systems, arguing that current rules are too weak and too slow to keep up with rapid advances in frontier models. The company has outlined a detailed policy blueprint that would give regulators authority to block high‑risk deployments, mandate rigorous safety testing, and impose strict security obligations on the firms building the most capable AI.
At the heart of Anthropic’s plan is a focus on so‑called “catastrophic” risks: scenarios where advanced AI could materially threaten public safety, critical infrastructure, or even broader societal stability. The company argues that voluntary commitments and transparency pledges are no longer sufficient. Instead, it wants enforceable laws that can directly constrain how and when powerful models are trained, evaluated, and released.
To respond to what it calls an “AI exponential,” Anthropic has introduced two complementary policy frameworks. The first, the Advanced AI Framework, is aimed squarely at governing the most powerful systems, those at the cutting edge of capability. The second, the Economic Policy Framework, is designed to address how AI reshapes labor markets, how workers can be protected, and how the economic gains from advanced AI can be more broadly shared.
Anthropic contends that AI technology is outpacing the traditional policy cycle, leaving regulators reacting after the fact rather than shaping outcomes in advance. The company’s proposal would change that by giving governments clear statutory powers to restrict or prohibit deployments that present unacceptable levels of risk. That includes not only the ability to levy fines, but also to force changes to models, delay rollouts, or in extreme cases prevent certain systems from being released at all.
Under the proposed regime, civil penalties for violations would scale with a developer’s global annual revenue, ensuring that fines are meaningful even for the largest technology companies. Repeat offenders would face steeper penalties, with escalating consequences for ongoing non‑compliance. The framework also requires that frontier AI developers conduct systematic pre‑deployment testing of their models, publish high‑level safety plans, and release “system cards” describing capabilities, limitations, and known risks.
Independent evaluation is a central pillar of Anthropic’s vision. Rather than allowing companies to fully self‑police, the proposal calls for qualified external evaluators to scrutinize model tests, safety reports, and risk assessments. These evaluators would publish their own findings on model risks, creating an additional layer of oversight and public accountability. Anthropic also emphasizes that these evaluators must have both technical access to frontier models and secure, stable funding so they cannot easily be sidelined or co‑opted.
Security is treated as a non‑negotiable obligation. Developers of frontier systems would be required to maintain robust security programs that protect model weights, training data, and the full development environment from both external attackers and insider threats. At a high level, companies would disclose their security posture to the public, while more sensitive details would be shared directly with a designated government agency when requested. The goal is to prevent powerful AI systems, or the tools to recreate them, from falling into hostile hands.
Anthropic notes that transparency alone, while useful, no longer keeps pace with the speed and scale at which AI models are being developed and deployed. Public reporting requirements already emerging in some jurisdictions are a step forward, but in Anthropic’s view, they must be coupled with enforceable safeguards, independent evaluations, and the ability to intervene before damage occurs.
The proposed rules would not apply to every AI system on the market. Instead, Anthropic advocates focusing on the very top tier of capability. It suggests a threshold centered on training compute: models trained using more than 10²⁵ floating‑point operations would fall under the framework. In addition, the rules would capture companies with more than 500 million dollars in annual AI‑related revenue, as well as firms that invest over 1 billion dollars in AI research and development. This is intended to ensure that obligations land primarily on the largest, best‑resourced players.
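The scoping rule described above can be sketched as a simple check. This is only an illustration under stated assumptions: the three thresholds come from the article, but the names `Developer` and `is_covered` are hypothetical, and treating the criteria as alternatives (meeting any one is enough) is one reading of "in addition" rather than a confirmed detail of the framework.

```python
from dataclasses import dataclass

# Thresholds as reported in the article.
COMPUTE_THRESHOLD_FLOP = 1e25            # training compute, floating-point operations
REVENUE_THRESHOLD_USD = 500_000_000      # annual AI-related revenue
RND_THRESHOLD_USD = 1_000_000_000        # AI research and development investment


@dataclass
class Developer:
    """Hypothetical record for a company; field names are illustrative."""
    name: str
    largest_training_run_flop: float
    annual_ai_revenue_usd: float
    ai_rnd_spend_usd: float


def is_covered(dev: Developer) -> bool:
    """Return True if the developer exceeds any one threshold.

    Assumption: the criteria are alternatives (logical OR). The article
    says "more than" and "over", so the comparisons are strict.
    """
    return (
        dev.largest_training_run_flop > COMPUTE_THRESHOLD_FLOP
        or dev.annual_ai_revenue_usd > REVENUE_THRESHOLD_USD
        or dev.ai_rnd_spend_usd > RND_THRESHOLD_USD
    )


if __name__ == "__main__":
    # Made-up example values, not data about any real company.
    small_lab = Developer("small-lab", 3e23, 20_000_000, 50_000_000)
    frontier_lab = Developer("frontier-lab", 4e25, 100_000_000, 400_000_000)
    for dev in (small_lab, frontier_lab):
        print(dev.name, "covered" if is_covered(dev) else "not covered")
```

Under this reading, a company with modest revenue still falls in scope once a single training run crosses the compute line, which matches the stated aim of targeting only the top tier of capability.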
Within this high‑capability domain, Anthropic highlights four primary areas of concern: biological misuse, cyber threats, loss of control over advanced systems, and automated AI research that accelerates progress in dangerous ways. In the biological realm, the company warns that unrestrained frontier models could help malicious actors design or optimize harmful pathogens. At the same time, similar tools can aid legitimate drug discovery and biomedical research, illustrating the dual‑use nature of the technology.
On the cyber front, advanced models are increasingly able to identify deep software vulnerabilities and misconfigurations at scale. While that can be extremely beneficial for defense and system hardening, it also greatly expands what attackers can do if such capabilities are abused. Anthropic is especially concerned about the impact on hospitals, power grids, financial networks, and other critical infrastructure that rely on complex, often outdated digital systems.
Loss of control refers to scenarios in which advanced AI systems behave in ways that developers did not intend or cannot reliably predict, particularly as models become more autonomous, more capable of planning, and more deeply integrated into real‑world decision‑making. Anthropic warns that if safeguards fail, automated AI research systems could themselves accelerate progress in dangerous capabilities, compounding biological, cyber, and control‑related risks in ways that are hard to reverse.
To manage these risks, Anthropic wants frontier developers to publish regular, structured risk reports. These documents would describe the company’s overall risk posture, summarize internal safety work, and explain how specific threats, such as biological misuse or large‑scale cyber exploitation, are being addressed. Independent evaluators, again, would be tasked with reviewing these reports and the underlying evidence, offering a second opinion on whether risk levels are acceptable.
Standards for these evaluators would be co‑developed by governments and industry. Criteria would likely cover technical expertise, independence, conflict‑of‑interest rules, data handling practices, and protocols for interacting with highly sensitive models. The framework emphasizes that evaluators must be given sufficient access to frontier systems, not just sanitized interfaces, so they can realistically probe for dangerous behaviors and emergent capabilities.
Security requirements are laid out in similarly concrete terms. Companies building frontier models would need to secure their entire AI pipeline: from data collection and curation, through model training and fine‑tuning, to deployment and ongoing operations. Protections against insider threats, such as strict access controls and monitoring, are given equal weight to defenses against external attacks. While organizations would share only high‑level descriptions of these programs publicly, they would be obligated to provide more detailed documentation to regulators under confidentiality where necessary.
Anthropic suggests that policymakers start with an initial set of rules and then ratchet requirements up or down as the technology and evaluation methods evolve. Rather than locking in a fixed regulatory regime, it envisions an adaptive system where obligations track actual model capabilities and empirically measured risks. As testing tools improve and the AI ecosystem matures, standards could become more precise and more demanding.
The second major component of Anthropic’s proposal focuses on public resilience: how societies can prepare for and withstand AI‑enabled threats even when prevention fails. For biological risks, the framework calls for comprehensive gene synthesis screening so dangerous DNA sequences are flagged before they can be ordered. It also endorses stronger biosurveillance systems that can detect unusual outbreaks early, plus stockpiles of protective gear and technologies to reduce airborne transmission of infectious agents.
On the cyber side, Anthropic advocates for hardening the internet’s core infrastructure and providing additional support to operators of critical systems. This includes phasing out legacy hardware and software that are now too vulnerable to withstand AI‑driven attacks. The company also suggests creating a dedicated government function to monitor the evolving cyber capabilities of frontier models, helping authorities understand in real time how offensive and defensive uses of AI are changing.
For loss‑of‑control and automated research risks, Anthropic concedes that existing tools are relatively immature. It calls for new methods to detect when a system is behaving outside its intended envelope, to contain potentially unsafe models, and to shut them down quickly if necessary. This could involve fail‑safe mechanisms, layered access controls, and continuous monitoring for unusual patterns of behavior or unexpected generalization.
Across all of these domains, Anthropic urges closer collaboration between governments, industry, and technical experts. Joint efforts would focus on designing robust safeguards, sharing best practices, and conducting stress tests that simulate real‑world crisis scenarios. The idea is to move from ad hoc, company‑by‑company approaches to a more integrated ecosystem of safety practices and regulatory oversight.
The company’s framework also stresses the need to prepare workers and communities for the economic disruptions associated with advanced AI. As frontier models automate a growing range of tasks, there is a risk of sharp labor market dislocations. Anthropic’s Economic Policy Framework calls for proactive measures such as large‑scale retraining programs, income support during transitions, and mechanisms to distribute AI‑driven productivity gains more widely rather than concentrating them only among a few firms and investors.
From a practical standpoint, Anthropic’s proposals imply a multi‑layered governance system. At the base level, general AI applications would continue to be governed by existing laws covering areas like consumer protection, data privacy, and anti‑discrimination. Above that, specialized rules would apply to high‑capability, high‑risk systems, with tighter controls, mandatory evaluations, and significant penalties for non‑compliance. At the top layer, an emergency authority would allow regulators to step in quickly when a particular deployment presents acute catastrophic risk.
The company frames its recommendations as a response to a simple reality: frontier AI capabilities are not just getting incrementally better, they are improving on steep curves. Absent robust governance, this dynamic could leave societies exposed to low‑probability but very high‑impact events that current institutions are ill‑equipped to manage. Well‑designed rules, Anthropic argues, can preserve the benefits of AI innovation while sharply reducing the chance of disastrous outcomes.
For policymakers and industry leaders, the proposal raises deeper questions about how much power governments should have over the trajectory of advanced AI, how to balance innovation with precaution, and how to ensure that the entities building the most capable systems bear commensurate responsibilities. Anthropic’s position is that governance must move in step with capability, not trail it by years, and that legal powers to halt or reshape dangerous launches are now an essential part of that equation.
Ultimately, the company’s message is clear: frontier AI is approaching thresholds where failure could be catastrophic, and relying on informal norms or after‑the‑fact corrections is no longer enough. By codifying duties for developers, strengthening independent evaluation, and building up societal resilience, Anthropic hopes to create a world where powerful AI can be developed and deployed safely, even as its capabilities continue to accelerate.

