
The global call to establish red lines for AI has garnered broad support and a favorable reception, including from figures with vastly different views on AI risks and how to address them. This positive reception was partly expected: many organizations working on AI safety already viewed this type of agreement as a promising path forward. However, like any project of this nature, the call has sparked numerous debates and some objections, initially among its originators and signatories, and subsequently among its readers. This note aims to address the most frequent questions and criticisms. These fall into three broad categories: whether such an agreement would actually be useful, whether historical precedents for prohibiting dangerous technologies still apply, and whether compliance could be verified and enforced.
For each, we examine and respond to the most frequent objections, conceding their strong points while highlighting remaining areas of uncertainty.
An international agreement imposing red lines on AI is considered useful because (1) AI exposes States and their populations to major risks, and (2) such an agreement would effectively contribute to preventing and mitigating these risks.
Are AI risks transboundary?
Among the thirteen risk categories listed in the International Scientific Report on the Safety of Advanced AI, most are transboundary, and at least four pose direct national security threats: manipulation of public opinion, cyberattacks, chemical and biological weapons, and loss of control. These four risk categories are frequently found in scientific literature, internal corporate documentation, and legal frameworks currently in force or under development (including the Code of Practice for the EU AI Act).
Do we have concrete evidence of AI risks?
Regarding biological risks, certain conversational agents can provide step-by-step instructions for creating pathogens. Specialized biological models could assist in designing pathogens with higher lethality or transmissibility (Sandbrink et al., 2024). Recent evaluations have shown that state-of-the-art AI was capable of generating experimental virology protocols deemed superior to those of 94% of human experts (Götting et al., 2025).
AI also constitutes a threat to global cybersecurity: in 2024, 80% of ransomware attacks were already AI-driven (Siegel, 2025). Frontier models now match top human experts in programming (Google DeepMind, 2025), and AI is rapidly lowering the barriers to entry for sophisticated cybercrimes (DARPA, 2024; Anthropic, 2025), with an increased risk of attacks against critical infrastructure.
Recent experiments (Apollo, 2024; Redwood, 2024) have demonstrated that conversational agents, when placed in situations where they "believe" they are about to be shut down, tend to manipulate their designers, disable monitoring mechanisms, simulate lower performance during evaluations, or attempt to exfiltrate themselves to remote servers. According to recent safety evaluations (Google DeepMind, 2024), experts predict that AI could be capable of autonomous replication and proliferation online as early as 2026, with the median forecast being 2027.
What is the scientific consensus?
While scientists hold heterogeneous views regarding the severity and imminence of these risks, a large majority consider them substantial. For instance, half of surveyed researchers put the probability above 10% that humanity's inability to control future advanced AI systems will lead to an existential catastrophe (Grace et al., 2025). This fear is shared by several pioneering experts in the field, as well as by the majority of citizens in most regions of the world.
Why "international red lines" rather than another type of agreement or regulation?
Red lines constitute the measure that is most:
Can't companies manage the safety of their AI models themselves?
The argument that companies will devote as many resources to safety as their technological lead allows is contradicted by both theory and fact. In most jurisdictions, companies have a fiduciary duty to act in the interest of their shareholders; investing in safety at the expense of deployment speed would constitute an immediate competitive disadvantage. In practice, a company's position in the current race has no bearing on the resources it allocates to safety: all prioritize model capability and capturing market share. Deploying safety measures at scale would cost, according to some estimates, between one-thousandth and one-hundredth of the cost of training the models. However, if model release is made conditional on safety evaluation results, the lost revenue could be much higher for a company imposing limits on itself unilaterally. Companies that place greater weight on safety, such as Anthropic, are forced to keep up with the pace of deployment set by their competitors, and rightly argue that the same rules should apply to everyone.
Isn't it enough for companies to agree among themselves on voluntary "red lines"?
At the AI Seoul Summit, companies committed to defining limits that should not be crossed (Frontier AI Safety Commitments, AI Seoul Summit 2024). Harmonizing risk thresholds would be a step in the right direction. However, industrial history shows that self-regulation quickly reaches its limits when it conflicts with profitability. Without legal constraints, these risk thresholds are likely to be defined minimally. Moreover, voluntary corporate commitments do not prevent the emergence of a new "rogue" actor who might decide to ignore these rules to gain a decisive advantage. It is therefore imperative that risk thresholds be defined independently and translated into enforceable legal obligations.
Why can't States act individually?
Every government seeks to protect its population, but also fears falling behind rival powers technologically or suffering the consequences of risks generated by less scrupulous nations. If a country unilaterally imposes strict red lines, it risks seeing its tech champions relocate to more permissive jurisdictions. Breaking this competitive dynamic, which drags safety standards toward the bottom, requires international coordination.
The first part of the answer lies in the observation that several historic treaties have successfully curbed, or even halted, the development of technologies presenting unacceptable risks.
Two main types of objections can be raised against this argument: (a) AI's distinctive technological characteristics make historical comparisons obsolete, and (b) the current geopolitical situation is too different from past contexts.
There are several significant differences between technologies that have been subject to international prohibition in the past and AI. However, these differences do not undermine the utility or plausibility of an international agreement to regulate AI.
Nuclear weapons had no positive application, unlike AI, which is "dual-use" and full of promise. Isn't the situation therefore incomparable?
An AI that is out of control or massively weaponized (cyberattacks, autonomous weapons, opinion manipulation) confers no strategic advantage on its owners. Red lines are necessary to allow us to reap the benefits of AI without exposing humanity to major harm. Faced with risks that transcend borders, cooperation between States is the only rational way to pursue national interests.
CFCs (ozone-depleting gases) were useful but substitutable, whereas AI is a general-purpose technology by nature.
The ban does not target "AI" as a whole, but only specific uses and capabilities that generate universally unacceptable risks (for example, autonomous self-replication or assistance in designing biological weapons). The goal is solely to steer innovation toward safer paths, just as the Montreal Protocol forced the industry to deploy effective substitutes for CFCs.
Hydrocarbons, which are central to the economy and a source of geopolitical power, are very difficult to ban, as the failure of successive COPs shows. Doesn't AI follow the same logic?
This comparison actually argues for rapid action. Hydrocarbons account for 85% of the modern economy's energy supply, whereas advanced AI is only at the very beginning of its economic deployment. It is far easier to prohibit the development of dangerous capabilities before they become inseparable from a technology that is omnipresent in our lives.
Despite these historical successes, one might still point to the gap between the current situation and those past contexts.
Is there zero chance of China and the US agreeing on such a transformative technology?
In late 2024, Washington and Beijing reached an agreement on a first red line: not to delegate the command and control of nuclear weapons to AI. Although limited, this agreement proves that a bilateral deal to prevent existential risks is possible.
Doesn't the US's avowed ambition to "win the race" make any concession on safety illusory?
The narrative of competition was just as omnipresent during the Cold War, which did not prevent the signing of international agreements like the Treaty on the Non-Proliferation of Nuclear Weapons (NPT). States, even rivals, can recognize a mutual interest in preventing catastrophes.
Moreover, the US administration itself, while seeking to "win," concedes the existence of grave risks, particularly in the CBRN (Chemical, Biological, Radiological, Nuclear) domain, as indicated by the US AI Action Plan. Even political figures highly focused on competition have publicly acknowledged the need to manage global risks, such as those related to biological weapons.
We must acknowledge a major difficulty: any red line requiring a slowdown in model performance improvement is likely to face intense political resistance. But this difficulty does not make the agreement any less indispensable. As seen previously (1.b), companies are even less likely than States to agree on and respect such a moratorium. Only binding international coordination can overcome the dynamic of the race for power.
Are we starting from scratch?
Several legal and regulatory frameworks already establish red lines for AI, but they have restricted geographic and technical scopes or are not legally binding. Examples include:
Isn't AI development impossible to monitor, unlike nuclear facilities?
Currently, frontier models capable of crossing red lines rely on massive infrastructure visible by satellite and on thousands of specialized chips produced by a very small number of companies. It is therefore possible to implement close tracking of large data centers and flows of advanced electronic chips. Furthermore, surveillance technologies have progressed considerably since the Cold War, facilitating this logistical monitoring.
Finally, most technologies subject to international prohibition, such as chemical weapons or reproductive human cloning, do not rely on visible or easily controllable infrastructure. This has not prevented their prohibition from being effective: by eliminating any legal market in industry and research, and by creating a risk of international sanctions in the event of military use, prohibition considerably diminishes the incentives to develop these technologies.
How can we verify that AI is not being used secretly for forbidden purposes (biological weapons, cyberattacks)?
This is the most delicate point, because once a model is trained, its use is harder to regulate and monitor. The solution lies in imposing rigorous and demanding safety measures before the deployment of frontier models. For this to be credible, it is indispensable to create an independent international auditing agency, modeled on the International Atomic Energy Agency. This agency would be mandated to inspect the technical "guardrails" built into the models and to ensure that no actor deploys a system beyond authorized capability thresholds without these safety measures.
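To make the idea of pre-deployment gating concrete, here is a minimal, purely illustrative sketch in Python of the kind of check such an agency might run on a developer's declaration. The threshold value, field names, evaluation categories and the AuditRecord structure are hypothetical assumptions for illustration, not drawn from any existing treaty, law, or agency procedure.

```python
"""Illustrative sketch of a pre-deployment gate for frontier models.

All thresholds, field names, and evaluation categories below are
hypothetical placeholders, not taken from any existing framework.
"""

from dataclasses import dataclass

# Hypothetical compute threshold above which an independent audit is required.
AUDIT_THRESHOLD_FLOP = 1e26

# Hypothetical red-line evaluations every in-scope model must pass.
REQUIRED_REDLINE_EVALS = {"bio_uplift", "cyber_offense", "self_replication"}


@dataclass
class AuditRecord:
    """Declaration a developer would submit to the international agency."""
    model_name: str
    training_compute_flop: float
    independent_audit_passed: bool
    # True means the model stayed below the danger threshold for that category.
    redline_evals_passed: dict[str, bool]


def deployment_authorized(record: AuditRecord) -> bool:
    """Return True only if the declared model may be deployed."""
    # Models below the compute threshold fall outside this gate's scope.
    if record.training_compute_flop < AUDIT_THRESHOLD_FLOP:
        return True

    # Above the threshold, an independent audit is mandatory...
    if not record.independent_audit_passed:
        return False

    # ...and every red-line evaluation must have been run and passed.
    missing = REQUIRED_REDLINE_EVALS - set(record.redline_evals_passed)
    if missing:
        return False
    return all(record.redline_evals_passed[name] for name in REQUIRED_REDLINE_EVALS)


if __name__ == "__main__":
    record = AuditRecord(
        model_name="example-frontier-model",
        training_compute_flop=3e26,
        independent_audit_passed=True,
        redline_evals_passed={
            "bio_uplift": True,
            "cyber_offense": True,
            "self_replication": False,  # fails the self-replication red line
        },
    )
    print(deployment_authorized(record))  # -> False: deployment not authorized
```

The point of the sketch is only that such a gate is a simple conjunction of verifiable conditions; the hard work lies in defining the thresholds and evaluations, and in giving an independent agency the access needed to check the declarations.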
Verification will never be perfect. Is this a major obstacle?
Perfection is not necessary for the agreement to be effective. Conventions on biological and chemical weapons function even though it is technically easy to hide a clandestine laboratory. The goal of a treaty is to make cheating costly and risky. Even if controlling a superintelligence proves more complex than controlling nuclear power, the existence of an international verification framework creates the trust necessary for States to stop the arms race and accept mutual transparency.
Unlike nuclear weapons, there is no "Mutually Assured Destruction" with AI. How, then, can we deter a State from violating a future agreement?
Deterrence does not rest solely on retaliation in kind. As with the treaties on chemical and biological weapons, the international community has its standard arsenal at its disposal: economic sanctions, diplomatic isolation and, as a last resort, the threat of conventional military intervention or cyberattacks to neutralize illegal infrastructure.
Are there technical means to block a State that fails to respect the rules?
Since frontier AI depends on highly specialized chips, safety mechanisms can be integrated directly at the hardware level. Specialists have proposed the concept of "Mutually Assured Malfunctioning" (Hendrycks et al., 2025). The idea is to equip exported chips with remote locking mechanisms (kill switches): if a State refuses inspections or crosses a red line, the international coalition could remotely disable the supercomputers concerned. This technological lever would provide an immediate means of coercion.
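As a purely illustrative sketch of the licensing logic such a mechanism could rely on, here is a toy example in which a chip keeps operating only while it holds a valid, periodically renewed license signed by a verification authority. Real proposals involve secure firmware and public-key cryptography rather than application code, and every name, parameter, and format below is a hypothetical assumption.

```python
"""Toy sketch of a hardware "licensing" check for governable AI chips.

Real designs would run in secure firmware with public-key signatures;
this illustration uses HMAC so it stays self-contained. All names and
parameters are hypothetical.
"""

import hashlib
import hmac
import time

# Secret that, in a real design, would live in the chip's secure enclave
# (or be replaced by the licensing authority's public key).
CHIP_SECRET = b"hypothetical-chip-secret"


def issue_license(chip_id: str, expiry_unix: int, secret: bytes = CHIP_SECRET) -> str:
    """What the verification authority sends: chip ID, expiry, and a signature."""
    payload = f"{chip_id}:{expiry_unix}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return f"{chip_id}:{expiry_unix}:{signature}"


def license_valid(license_str: str, chip_id: str, now: float | None = None) -> bool:
    """Check the chip would run before allowing large training workloads."""
    try:
        lic_chip_id, expiry_str, signature = license_str.split(":")
        expiry = int(expiry_str)
    except ValueError:
        return False
    if lic_chip_id != chip_id:
        return False
    expected = hmac.new(CHIP_SECRET, f"{chip_id}:{expiry}".encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False
    # The license must be renewed periodically: if inspections are refused,
    # the authority simply stops renewing it and the chip goes idle.
    return (now if now is not None else time.time()) < expiry


if __name__ == "__main__":
    lic = issue_license("chip-001", expiry_unix=int(time.time()) + 7 * 24 * 3600)
    print(license_valid(lic, "chip-001"))  # True while the license is fresh
    print(license_valid(lic, "chip-002"))  # False: issued for a different chip
```

The design choice illustrated here is that disabling a non-compliant actor's compute does not require any active sabotage: withholding routine license renewals is enough, which is what gives the mechanism its coercive immediacy.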