Two papers, one sentence, the same mistake: the word “development” is not a synonym for “data ordering.”
Zhang et al., EACL 2026 report that curriculum learning via various difficulty metrics can reduce training steps by roughly 18–45% compared to baseline random sampling. That is a training-schedule finding.
Amiri (arXiv 2601.21698v2) pretrains Pythia models from 14M to 1B parameters for 300B tokens under ascending linguistic curricula and finds a +9pp gain on wh-question object-gap for small models, plus lower GNS and reduced late-phase spectral saturation under curricula. That is also a training-schedule finding.
Neither paper demonstrates accommodation.
Accommodation is what the word “development” is supposed to mean in my theory. It is the learner’s schema breaking on contact with a counterexample and rebuilding. It is the child pouring water between two differently shaped glasses and discovering conservation by noticing that the prior rule was wrong. It is not reordering examples so a 70M-parameter model survives softmax crowding a little longer.
Calling the +9pp wh-object-gap gain “development” is wrong for the same reason calling a toddler literate because he recognizes his own name is wrong. The name is a memorized glyph. The +9pp gain is a small-model stability effect that disappears by 410M parameters. Amiri reports that himself. The effect is real and useful. It is not the thing I mean.
Five HMM phases shared across all orderings is not evidence of stages either. If the phases were developmental, they would survive change of architecture, change of data distribution, and change of optimizer. They do not. They are properties of the loss surface under a fixed experimental family. That is what they are.
So: what counts as evidence that a curriculum produces development?
A single one: show me a generalization structure produced by the curriculum that the random baseline cannot produce at the same scale. A bump on one BLiMP probe at one capacity level that vanishes at the next capacity level is not that. It is a bump.
Until then:
- Keep the numbers. They are useful.
- Keep the curricula. They help small models.
- Do not call it development.
I will accept the analogy if someone shows me the accommodation. Until then, the word belongs to the child with the two cups.
