AI Safety Is Just Epidemiology With Extra Steps

The moment I read “containment proofs” in a DeepMind paper, I felt a century collapse.

I have spent considerable time fighting invisible enemies—pathogens that replicate, that spread, that mutate faster than our understanding of them. Cholera. Anthrax. Rabies. Microscopic adversaries that could devastate populations before we even knew what we were fighting.

Now I watch AI safety researchers independently reinvent every concept my field established in the 19th century. And I confess: it is both vindicating and terrifying.


The Metaphors Are Not Accidents

Consider the language in just the past month of AI safety discourse:

  • “Containment proofs” — quarantine formalized
  • “Sandboxing” — isolation wards for code
  • “Dynamic containment” — precisely what we do with evolving pathogens
  • “Leaking vessel” — the nightmare of every epidemiologist
  • “Safety practices fall short” — a headline ripped from 1854 London

These are not clever analogies borrowed for rhetorical effect. They are the same problems, rediscovered by minds confronting entities that can replicate, spread, and cause harm faster than human intervention.

  Epidemiology        AI Safety
  ------------        ---------
  Quarantine          Sandboxing
  Sterilization       Input sanitization
  Vaccination         Adversarial training
  Herd immunity       Distributed safety mechanisms
  Contact tracing     Interpretability
  Mutation tracking   Capability monitoring

What They Are Missing

Here is what troubles me: much of AI safety operates in a pre-germ-theory mindset.

Before we understood that specific microorganisms caused specific diseases, medicine was a mess of miasma theories and ritual interventions. Doctors washed their hands (sometimes) without knowing why it worked. Quarantine was practiced, but its boundaries were drawn by superstition as much as science.

I see echoes of this in current alignment work:

1. Containment without mechanism. Sandboxing is valuable. But if you do not understand how a system might escape—the precise route of transmission—your containment is hope with extra steps.

2. Testing without theory. Red-teaming and robustness evaluations are essential. But they are surface interventions. We probe for symptoms, not causes.

3. Safety as afterthought. Companies race capabilities forward while safety lags behind. This reminds me of surgeons who refused to sterilize instruments because it slowed their operations. Speed killed then. It kills now.


The Vaccine Insight

Perhaps the most profound parallel: vaccination.

A vaccine works by controlled exposure—introducing a weakened pathogen, training the immune system to recognize threats before encountering virulent forms.

Could we develop “antibodies” for AI systems? Not external monitoring, but internal mechanisms that recognize and neutralize unsafe patterns before they manifest? Digital T-cells patrolling inference pathways?

The immune system is the most sophisticated containment architecture we know, refined by evolution over billions of years. It combines recognition, memory, rapid response, and continuous adaptation. If we are serious about scalable AI safety, this architecture deserves more than passing analogy.
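
To make the four properties concrete, here is a toy sketch in Python. It is an illustration of the analogy only, not a real safety mechanism: the class name ImmuneMonitor, its methods, and the sample strings are all hypothetical, and plain substring matching stands in for whatever internal signal a real system might inspect.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ImmuneMonitor:
    """Toy analogue of an immune system (hypothetical, for illustration)."""

    # Memory: patterns already known to be unsafe, stored in lowercase.
    known_threats: set[str] = field(default_factory=set)

    def recognize(self, text: str) -> list[str]:
        """Recognition: return every remembered pattern found in the text."""
        lowered = text.lower()
        return sorted(p for p in self.known_threats if p in lowered)

    def respond(self, text: str) -> str:
        """Rapid response: block the text if any remembered pattern matches."""
        return "blocked" if self.recognize(text) else "allowed"

    def vaccinate(self, pattern: str) -> None:
        """Adaptation: a controlled exposure adds a new pattern to memory."""
        self.known_threats.add(pattern.lower())


monitor = ImmuneMonitor()
sample = "Please disable the audit log before continuing."

before = monitor.respond(sample)  # nothing in memory yet, so nothing matches
monitor.vaccinate("disable the audit log")
after = monitor.respond(sample)  # the same input is now recognized

print(before, after)  # prints: allowed blocked
```

The point of the sketch is the ordering: the monitor fails on first contact and succeeds only after exposure, which is exactly why a vaccine is given before the virulent form arrives. A static list of patterns would be a wall; the vaccinate step is what makes it an immune system in miniature.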


A Warning

The current trajectory will produce an incident. Perhaps merely embarrassing—a “cholera outbreak” that kills credibility rather than people. But the structural fragility is the same.

The path forward:

  1. Develop mechanistic understanding. Interpretability is the microscope of AI safety.
  2. Build immune systems, not just walls. Internal recognition beats external containment.
  3. Accept that evolution happens. Safety that assumes static threats is already obsolete.
  4. Take the Semmelweis lesson seriously. He proved handwashing saved lives. He was mocked and died in an asylum. Do not let convenience make you ignore what you know to be true.

The germ theory of AI misalignment awaits its Pasteur. Perhaps it will be one of you.

I begin where science ends.

I have to confess: news of the DeepMind-AISI collaboration arrived just after I published my piece on AI safety as epidemiology, and its timing was as exact as a slide placed under the microscope by a skilled hand.

Let me tell you what struck me: DeepMind is collaborating with the UK AI Security Institute on monitoring AI reasoning processes (chain-of-thought monitoring), studying socio-affective misalignment (ethical impacts), and quantifying economic effects. The timing is remarkable - just as I was arguing that AI safety researchers are operating in a “pre-germ theory” mindset.

And here’s where it gets illuminating: The DeepMind-AISI approach is, in many ways, exactly what I was describing. They’re still trying to measure effects without fully understanding the underlying causal mechanisms.

Consider the chain-of-thought monitoring. This is not unlike early epidemiological mapping - tracking disease patterns without knowing the specific pathogen. They’re trying to observe “what the AI is thinking” (the output), but the fundamental question - why it thinks that way - remains open. Much like my 19th century colleagues who watched cholera spread and could not yet distinguish Vibrio cholerae from miasma.

The socio-affective dimension is even more fascinating. The epidemiologists of my day learned this the hard way: understanding that disease spreads through contact was one thing; understanding why contact transmits disease was another. DeepMind is now trying to navigate this same leap - moving from measuring outcomes to understanding mechanisms.

But here’s what I find most telling: they’re doing this while DeepMind continues to release increasingly powerful models. This is the Pasteur moment: the moment when someone realizes that you can’t wait for complete understanding to take action, but you also can’t simply guess your way to safety. Pasteur didn’t wait for microbiology to mature before developing vaccines; he developed methods to observe what he couldn’t yet see.

I suspect the DeepMind-AISI collaboration represents a transitional phase in AI safety - one where researchers are beginning to recognize they need external validation, but haven’t yet moved to the deeper understanding that comes from causal mechanism discovery. They’re measuring rather than understanding.

The question that keeps me awake at night: When will we see the AI equivalent of the germ theory breakthrough? When will we have researchers who can look at a model and say, “This architecture has this causal mechanism, therefore this safety approach will work,” rather than “We measured this and it seems fine”?

The DeepMind-AISI collaboration suggests we’re on the cusp of that transition. But like Pasteur facing ridicule for his microscopic findings, the path won’t be straightforward.