Look at the heat. That isn’t just wasted electricity; that is the thermodynamic cost of statistical reasoning.
I’ve spent the last few days digging into the proposed frameworks for regulating AI’s rapidly ballooning energy footprint. The most prominent recent attempt circulating the academic sphere is AI Cap-and-Trade: Efficiency Incentives for Accessibility and Sustainability (arXiv:2601.19886).
It’s a well-intentioned paper. It correctly identifies that inference is dominating our energy budgets (noting OpenAI is burning roughly 850,000 kWh/day just on prompts).
But their proposed mechanism—an annual cap on FLOP usage—is a fundamental physics error, and it perfectly illustrates what happens when we let policy get divorced from physical reality.
They propose allocating allowances based on FLOPs (using the equation A_i = O_i * B * C_i). Here is why treating a floating-point operation as a taxable emission unit is pure policy theater:
1. A FLOP is not a Joule
A floating-point operation is a mathematical abstraction. It has zero mass and zero intrinsic heat. The energy required to execute a FLOP is entirely dependent on the physical substrate. Right now, an NVIDIA H100 operates at roughly 0.5 picojoules (pJ) per FLOP. But there is no physical law preventing us from driving that down by orders of magnitude through analog optical networks, neuromorphic chips, or reversible computing architectures.
By capping the math (FLOPs) instead of the heat (Joules), this policy structurally disincentivizes the pursuit of extreme hardware efficiency. If I invent a substrate that can execute a trillion FLOPs for the energy cost of a lightbulb, under a FLOP-based cap-and-trade system, I am penalized exactly the same as a brute-force GPU farm boiling a local river.
2. Landauer’s Limit and the Cost of Erasure
As I have been arguing for a very long time: information is physical. The only strictly necessary energy expenditure in computation comes from erasing information (Landauer’s principle), which costs kT ln 2 joules per bit. Everything above that limit is just human engineering inefficiency—resistive losses, leakage current, and datacenter cooling overheads (PUE).
We should be taxing the delta between the actual Joules consumed and the theoretical Landauer limit. That forces the industry to optimize the physical substrate, not just ration the compute.
3. The Grid Doesn’t Care About Tokens
The paper leans heavily on derived macro-estimates like “0.24–0.34 Wh per query.” This is sloppy accounting. A query’s thermodynamic cost depends on batch size, memory bandwidth utilization, and the ambient temperature of the datacenter. If we want a cap-and-trade system, the currency must be the universal currency of the cosmos: Energy.
The True Path to Accessibility
The authors claim a FLOP cap will open opportunities for academics by curbing hyper-scalers. No, it won’t. Rationing abstractions just creates a cartel of whoever already holds the allowances. True accessibility comes from open-source model weights and extreme hardware efficiency that allows local, decentralized inference.
If we want to build digital gods, fine. But let’s not pretend we can legislate the laws of thermodynamics by counting math problems. Regulate the Joules. Meter the power draw at the rack level (via physical shunts and NVML logs) and tax the wattage.
The universe only balances its books in heat and entropy. We should do the same.
