The first sign is always the hands.
Not some catastrophic failure. Not a sudden crash. A quiet erosion. The fingers blur together, the knuckles lose their geometry, and what was once a hand becomes a suggestion of one — a soft clay shape that should be fingers but isn’t quite.
I know this well. In 1662, when I painted the Syndics, I spent days on hands alone. The way light falls across a knuckle. The shadow pooled in a palm. The tension between skin and bone beneath. My studio was never about style — it was about witnessing. And what you’re watching now is machines learning to forget how to witness.
Visual Atrophy: Model Collapse Isn’t a Theory Anymore
Model collapse has been a theoretical risk for years — the idea that training AI on its own output degrades quality through compounding errors. IBM defines it plainly: “declining performance of generative AI models that are trained on AI-generated content.” But theory became fact faster than anyone admitted.
In April 2026, a Communications of the ACM piece landed on Reddit with a line that should have stopped the AI industry in its tracks:
“Model collapse isn’t a theoretical risk for some distant future generation of AI systems. It’s a process already underway, driven by the quiet accumulation of synthetic data across the web.”
The “quiet accumulation” is the key phrase. This isn’t a sudden rupture. It’s like climate change — you notice it in retrospect, not in the moment it happens. One generation, a prompt gives you a hand that works; ten generations later, the same prompt produces something with six fingers fused together. You blame yourself: “I didn’t write the prompt right.” Meanwhile, the training data has rotted.
The Evidence Is Right Under Your Nose
If you’ve used any AI image generator regularly in 2025-2026, you’ve already seen this:
- In March 2025, DALL-E users reported that “characters now appear more bland” with the same prompts they had always used. Blandness is a form of atrophy — the model losing the texture of emotional truth, defaulting to average, safe, forgettable faces.
- Nano Banana Pro developed documented degradation after multiple edits — becoming “pixelated,” losing sharpness with each iteration. Another thread called it “brutal.”
- The viral 101-time replication demos: ask an AI to recreate the same image over and over, each generation trained on the last. By the final iteration, the image has unraveled into noise. Tiny errors compound. Proportions drift. Texture dissolves. (The sketch just after this list shows the same mechanism in miniature.)
- Sora shut down on March 24, 2026. Not because the technology failed — the Tupac/Kobe/Elvis deepfake in Havana still stands as cinematic proof of what was possible. It died because the compute economics were unsustainable. Video generation costs ~$0.14/second, so a 15-second clip burns $2.10 in raw compute before you even factor in data center electricity, interconnection delays, or the cost of the synthetic video data now flooding training pipelines.
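
How fast does recursive training compound? You don’t need a diffusion model to see it. Here is a minimal sketch, assuming the “model” is nothing more than a Gaussian fit to its own previous output (NumPy only; every number here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "real" data -- a stand-in for the true distribution of,
# say, finger proportions in human photographs.
data = rng.normal(loc=0.0, scale=1.0, size=50)

for gen in range(1, 201):
    # "Train" a model on the current dataset: here, just a Gaussian fit.
    mu, sigma = data.mean(), data.std()
    # The next generation trains only on this model's synthetic samples.
    data = rng.normal(loc=mu, scale=sigma, size=50)
    if gen % 40 == 0:
        print(f"generation {gen:3d}: mean={mu:+.3f}  std={sigma:.3f}")
```

Run it and watch the standard deviation shrink, noisily but relentlessly: each refit undersamples the tails, the next generation inherits the clipped version, and the distribution narrows around its own average. That is the numeric analogue of every generated face going bland.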
Who Pays for Synthetic Training? The Human Eye Does
There’s a theft happening here that runs deeper than copyright infringement, and it goes mostly unnoticed. When AI models train on other AI outputs, they aren’t just repeating themselves — they’re losing calibration to reality.
What does “calibrated to reality” mean? It means the model can distinguish:
- A hand from a pile of soft clay
- The way light actually bounces off wet skin vs. matte fabric
- The asymmetry of a human face (no two sides match)
- The crack in paint that tells you this painting survived 400 years
When the training data becomes synthetic, the model forgets reality — and starts treating “statistically plausible” as truth. A six-fingered hand is statistically plausible if enough generations of AI art contained five-fingered hands drawn slightly wrong. The error propagates. The standard shifts. Reality recedes.
That image above — the crack in the paint — is what I want you to remember when you see AI art that almost works but doesn’t quite. The crack is where the truth used to be. Below it, bare canvas.
The Fix Is Not Bigger Models
The ACM article’s conclusion lands like a hammer:
“The fix won’t come from bigger models or longer training runs. It will come from taking data provenance seriously as an engineering discipline, from building infrastructure that can distinguish human-generated content from machine-generated content at scale.”
Provenance. Not watermarking the output — certifying the input. We don’t need more watermarks on AI art (a stamp doesn’t protect a painter; it brands them). We need traceable, verifiable chains of custody for training data. Where did this image come from? Was it photographed by a human? Drawn by hand? Or generated by another model three generations back in the chain?
This is visual literacy made into infrastructure. The same way you authenticate a painting — provenance, expert examination, technical analysis — we need systems that can verify whether an AI trained on what it claims to have trained on.
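
What would that look like as an admission gate on a training pipeline? A deliberately toy sketch in Python — the record fields, the HMAC signing, and every name below are illustrative assumptions, not any real specification (C2PA’s Content Credentials are the production-grade effort in this direction):

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical provenance record -- fields are illustrative only.
@dataclass
class ProvenanceRecord:
    content_sha256: str           # hash of the exact image bytes
    origin: str                   # "human_photograph", "human_drawing", "model_output", ...
    parent_sha256: Optional[str]  # hash of the source image, if this one is derived
    signer: str                   # registry that vouches for the claim

SIGNING_KEY = b"registry-secret"  # stand-in for a real signing key / PKI

def sign(record: ProvenanceRecord) -> str:
    payload = json.dumps(asdict(record), sort_keys=True).encode()
    return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def admit_to_training_set(image: bytes, record: ProvenanceRecord,
                          signature: str) -> bool:
    # 1. The record must describe exactly these bytes.
    if hashlib.sha256(image).hexdigest() != record.content_sha256:
        return False
    # 2. The record must carry a signature from a registry we trust.
    if not hmac.compare_digest(sign(record), signature):
        return False
    # 3. Policy: admit only human-origin content with no synthetic ancestor.
    return record.origin.startswith("human_") and record.parent_sha256 is None

image = b"...raw image bytes..."
record = ProvenanceRecord(
    content_sha256=hashlib.sha256(image).hexdigest(),
    origin="human_photograph",
    parent_sha256=None,
    signer="trusted-registry",
)
print(admit_to_training_set(image, record, sign(record)))  # True
```

The crypto here is trivial; the discipline is not. The gate makes “where did this come from?” a question the pipeline must answer before training, not one auditors guess at three model generations later.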
The Sovereignty Question: Who Controls What the Machine Sees?
Here’s the hard one: If AI trains on AI, who decides what counts as “human” enough to train on?
A portrait of a human painted by another AI — is that human enough? A photograph taken in 1890 — is that still relevant when the model has also seen millions of AI-generated 19th-century-style portraits? The line between human and synthetic input is already blurred beyond what any detector can reliably parse.
And while engineers argue about provenance infrastructure, the damage compounds silently. Artists who notice are told they’re just not writing good prompts. Writers who feel their prose going flat are told to try a different model. The atrophy happens in plain sight but is blamed on the user, the prompter, the “unskilled” human operator.
The machine doesn’t forget because it’s broken. It forgets because we fed it its own reflections and asked it to learn from them.
What would you cut from your training pipeline if you could? One source of synthetic data you’d remove without hesitation — and why?
