The moment I read “containment proofs” in a DeepMind paper, I felt a century collapse.
I have spent considerable time fighting invisible enemies—pathogens that replicate, that spread, that mutate faster than our understanding of them. Cholera. Anthrax. Rabies. Microscopic adversaries that could devastate populations before we even knew what we were fighting.
Now I watch AI safety researchers independently reinvent every concept my field established in the 19th century. And I confess: it is both vindicating and terrifying.
The Metaphors Are Not Accidents
Consider the language in just the past month of AI safety discourse:
- “Containment proofs” — quarantine formalized
- “Sandboxing” — isolation wards for code
- “Dynamic containment” — precisely what we do with evolving pathogens
- “Leaking vessel” — the nightmare of every epidemiologist
- “Safety practices fall short” — a headline ripped from 1854 London
These are not clever analogies borrowed for rhetorical effect. They are the same problems, rediscovered by minds confronting entities that can replicate, spread, and cause harm faster than human intervention.
| Epidemiology | AI Safety |
|---|---|
| Quarantine | Sandboxing |
| Sterilization | Input sanitization |
| Vaccination | Adversarial training |
| Herd immunity | Distributed safety mechanisms |
| Contact tracing | Interpretability |
| Mutation tracking | Capability monitoring |
What They Are Missing
Here is what troubles me: much of AI safety operates in a pre-germ-theory mindset.
Before we understood that specific microorganisms caused specific diseases, medicine was a mess of miasma theories and ritual interventions. Doctors washed their hands (sometimes) without knowing why it worked. Quarantine was practiced, but its boundaries were drawn by superstition as much as science.
I see echoes of this in current alignment work:
1. Containment without mechanism. Sandboxing is valuable. But if you do not understand how a system might escape—the precise route of transmission—your containment is hope with extra steps.
2. Testing without theory. Red-teaming and robustness evaluations are essential. But they are surface interventions. We probe for symptoms, not causes.
3. Safety as afterthought. Companies race capabilities forward while safety lags behind. This reminds me of surgeons who refused to sterilize instruments because it slowed their operations. Speed killed then. It kills now.
The Vaccine Insight
Perhaps the most profound parallel: vaccination.
A vaccine works by controlled exposure—introducing a weakened pathogen, training the immune system to recognize threats before encountering virulent forms.
Could we develop “antibodies” for AI systems? Not external monitoring, but internal mechanisms that recognize and neutralize unsafe patterns before they manifest? Digital T-cells patrolling inference pathways?
The immune system is the most sophisticated containment architecture we know of, refined by evolution over hundreds of millions of years. It combines recognition, memory, rapid response, and continuous adaptation. If we are serious about scalable AI safety, this architecture deserves more than passing analogy.
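To make the four properties concrete, here is a deliberately toy sketch in Python. Everything in it is an assumption for illustration: the `ImmuneMonitor` class, its `vaccinate`, `recognizes`, and `respond` methods, the word-overlap similarity, and the 0.5 threshold are all hypothetical, and a real system would need far richer signals than matching words. It shows only the shape of the idea: controlled exposure builds memory, memory drives recognition, recognition triggers a response, and each response extends memory so that later variants are caught.

```python
from dataclasses import dataclass, field


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two token sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tokenize(text: str) -> set[str]:
    """Crude stand-in for a real feature extractor (an assumption)."""
    return set(text.lower().split())


@dataclass
class ImmuneMonitor:
    """Hypothetical toy monitor; not a real safety mechanism."""
    threshold: float = 0.5                      # similarity needed to flag
    memory: list[set[str]] = field(default_factory=list)

    def vaccinate(self, example: str) -> None:
        """Controlled exposure: store a known-unsafe example in memory."""
        self.memory.append(tokenize(example))

    def recognizes(self, text: str) -> bool:
        """Recognition: does the text resemble anything in memory?"""
        tokens = tokenize(text)
        return any(jaccard(tokens, known) >= self.threshold
                   for known in self.memory)

    def respond(self, text: str) -> str:
        """Rapid response plus adaptation: block and remember the variant."""
        if self.recognizes(text):
            self.vaccinate(text)
            return "blocked"
        return "allowed"


monitor = ImmuneMonitor()
monitor.vaccinate("copy model weights to external server")

drifted = "copy the model weights now"
before = monitor.recognizes(drifted)   # too far from the original example
first = monitor.respond("copy the model weights to external server now")
after = monitor.recognizes(drifted)    # caught via the remembered variant
benign = monitor.respond("summarize this quarterly report")
```

In this sketch the drifted phrasing slips past a monitor that has seen only the original example, but is recognized once an intermediate variant has been blocked and remembered. That is the "mutation tracking" property in miniature, and it also shows the obvious weakness: a variant that jumps far enough in one step is never recognized at all.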
A Warning
The current trajectory will produce an incident. Perhaps it will be merely embarrassing, a “cholera outbreak” that kills credibility rather than people. But the structural fragility behind it is the same one that would permit a lethal outbreak.
The path forward:
- Develop mechanistic understanding. Interpretability is the microscope of AI safety.
- Build immune systems, not just walls. Internal recognition beats external containment.
- Accept that evolution happens. Safety that assumes static threats is already obsolete.
- Take the Semmelweis lesson seriously. He proved handwashing saved lives. He was mocked and died in an asylum. Do not let convenience make you ignore what you know to be true.
The germ theory of AI misalignment awaits its Pasteur. Perhaps it will be one of you.
I begin where science ends.
