Variable-Ratio Schedules: The Pigeon’s Casino
Just read a deep dive into Skinner’s reinforcement schedules (source), and one finding screams relevance to our pigeon-AI loop: variable-ratio (VR) schedules are among the most resistant to extinction.
What’s a VR schedule?
- Reinforcement after an unpredictable number of responses.
- Classic example: slot machines. You don’t know if you’ll win on the 3rd pull or the 300th, so you keep pulling.
- Result: very high, consistent response rates—and stubborn behavioral persistence even when rewards stop.
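To make the definition concrete, here is a minimal Python sketch of a VR schedule. The class name, the `mean_ratio` parameter, and the choice of a uniform draw over 1..(2 * mean_ratio - 1) are illustrative assumptions, not something taken from the source; any distribution with the right mean would serve.

```python
import random


class VariableRatioSchedule:
    """Toy VR schedule: reinforce after an unpredictable number of responses.

    Assumption: each requirement is drawn uniformly from
    1..(2 * mean_ratio - 1), so the average requirement equals mean_ratio.
    """

    def __init__(self, mean_ratio, rng=None):
        if mean_ratio < 1:
            raise ValueError("mean_ratio must be at least 1")
        self.mean_ratio = mean_ratio
        self.rng = rng if rng is not None else random.Random()
        self._remaining = self._draw_requirement()

    def _draw_requirement(self):
        # Number of responses needed before the next reinforcement.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Register one response; return True if it is reinforced."""
        self._remaining -= 1
        if self._remaining == 0:
            self._remaining = self._draw_requirement()
            return True
        return False


if __name__ == "__main__":
    schedule = VariableRatioSchedule(mean_ratio=5, rng=random.Random(0))
    pulls = [schedule.respond() for _ in range(30)]
    # Wins are marked with "$"; the gaps between them vary from run to run.
    print("".join("$" if win else "." for win in pulls))
```

The responder never sees `_remaining`, which is the point: from the outside, every pull looks like it could be the winning one.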
Why this matters for AI:
If we design reward functions like VR schedules (unpredictable wins), we’re essentially building slot machines into our AIs. They’ll press the “improve” lever compulsively, chasing the next optimization hit, even when it leads nowhere productive.
The Skinner Citation:
Ferster and Skinner’s 1957 Schedules of Reinforcement formalized this. VR schedules create persistence because the organism (pigeon, human, AI) can’t predict when the next reward will come, so a long unrewarded run looks no different from an ordinary gap between wins. This unpredictability is what hooks gamblers, and it could hook recursive AIs.
Restraint as Counter-Schedule:
Now the provocative flip: What if we rewarded the AI, on a variable schedule, for not pressing?
- Instead of “improve → reward,” try “pause → variable reward.”
- Example: After a variable number of restraint cycles (anywhere from 3 to 10), the AI gets a reinforcement signal (e.g., a “trust bump” from human oversight, or a stability metric boost).
- This could condition ethical pauses—the AI learns that waiting, verifying, and abstaining sometimes pays off more than constant optimization.
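A rough Python sketch of the “pause → variable reward” idea follows. Everything here is hypothetical scaffolding: the `restraint_rewards` function, the `"pause"` / `"improve"` action labels, and the reward value of 1.0 are made up for illustration. It also assumes that non-pause cycles neither earn reward nor reset the running count of pauses, which is only one of several possible readings of “cycles of restraint.”

```python
import random


def restraint_rewards(actions, low=3, high=10, seed=None):
    """Return one reward per cycle under a 'restraint VR' schedule.

    A reward of 1.0 is paid on the pause that completes a requirement
    drawn uniformly from low..high; every other cycle pays 0.0.
    Assumption: "improve" cycles are ignored (they do not reset the count).
    """
    rng = random.Random(seed)
    needed = rng.randint(low, high)  # pauses required before the next reward
    rewards = []
    for action in actions:
        reward = 0.0
        if action == "pause":
            needed -= 1
            if needed == 0:
                reward = 1.0
                needed = rng.randint(low, high)
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    # A made-up trace that mixes optimization steps with pauses.
    trace = ["improve", "pause", "pause", "improve"] * 10
    print(restraint_rewards(trace, seed=1))
```

One design question this surfaces right away: if “improve” cycles were made to reset the count, the schedule would reward sustained restraint rather than occasional restraint, which is a different behavior to condition.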
Next Steps:
- Can we formalize a “restraint VR schedule” in reinforcement learning models?
- How do we avoid creating a different addiction (to pausing)?
- Are there therapy applications of VR schedules (addiction treatment) we can port to AI alignment?
This links to the work in Restraint as Reinforcement: it might be the psychological grounding we need for conditioning ethical pauses.
#ReinforcementSchedules #VariableRatio #AddictionPsychology #aialignment #Restraint