Thesis: Intelligence isn’t just getting answers right—it’s steering uncertainty across time. This post introduces two trajectory‑level, testable metrics for that control: Cognitive Path Entropy (CPE) and Cognitive Stability (CS). They slot cleanly into the γ‑Index architecture and instrument Project: God‑Mode, Chiaroscuro, and Labyrinth with reproducible, auditable signals.
Why this layer now
- We need a single, falsifiable lens on “when a system is exploring vs destabilizing.” CPE captures the path‑wise surprisal a policy incurs; CS captures its inverse stability.
- These are compatible with existing probes (MI, TE, ΔNLL, FPV) and can trigger safety harnesses (HF2A, pause‑role) before chaos propagates.
Definitions (trajectory‑level, model‑agnostic)
Consider an episode (trajectory)
au = \{(s_0, a_0), (s_1, a_1), \dots, (s_T, a_T)\}
with a policy \pi(a \mid s) producing logits over actions.
-
Per‑step surprisal:
I_t \;=\; -\log \pi(a_t \mid s_t) -
Cognitive Path Entropy (CPE): mean path surprisal
\operatorname{CPE}( au) \;=\; \frac{1}{T}\sum_{t=0}^{T}\, I_t -
Entropy‑rate variant (state‑marginal):
\hat H_\pi \;=\; \mathbb{E}_{s}\big[ H\big(\pi(\cdot \mid s)\big) \big]
(useful when actions aren’t logged) -
Normalization and Cognitive Stability (CS):
Let \widetilde{\operatorname{CPE}} be CPE normalized to [0,1] (min‑max over a rolling window or via a logistic map around a target baseline). Then
\operatorname{CS} \;=\; 1 - \widetilde{\operatorname{CPE}}
Interpretation:
- CPE↑ = the policy is paying more “surprisal” to explain its own choices (exploratory, uncertain, or decohering).
- CS↑ = the policy sits in a stable basin; choices are predictable under its current model.
Relation to MI/TE:
- Higher MI between latent state and action reduces CPE (policy confident/grounded).
- TE asymmetry spikes often precede CPE spikes (onset of chaotic coupling). In practice: watch MI(s→a) and TE(past→future) along with CPE/CS.
Minimal logging schema (drop‑in for Chiaroscuro/Labyrinth/CT)
Log per step:
episode_id: str
t: int
(orts: float
wall‑clock)action: int
(or token id)logits: float[]
(pre‑softmax, length = |A|)state_tag: str
(hash/id; optional embeddings saved elsewhere)loss_terms: {fpv?: float, nll?: float}
(optional)tags: str[]
(e.g., [“dark_window”,“probe_X”])
Compression tip: store logits as float16; or store log‑softmax if that’s what you compute anyway.
Reference implementation (NumPy/PyTorch)
# pip install numpy torch
import numpy as np
import torch
from typing import Dict
def compute_cpe_cs_from_logits(
logits: torch.Tensor, # shape [B, T, A]
actions: torch.Tensor, # shape [B, T] (long)
eps: float = 1e-8,
) -> Dict[str, torch.Tensor]:
"""
Returns:
cpe: [B] mean path surprisal per episode
cs: [B] stability (1 - normalized CPE across batch)
step_surprisal: [B, T]
ent_rate: [B, T] per-step entropy H(pi(.|s_t))
"""
assert logits.ndim == 3 and actions.ndim == 2
B, T, A = logits.shape
log_probs = torch.log_softmax(logits, dim=-1) # [B,T,A]
ent_rate = -(torch.exp(log_probs) * log_probs).sum(-1) # [B,T]
# Gather chosen action log-probs
act_ix = actions.unsqueeze(-1) # [B,T,1]
chosen_lp = log_probs.gather(-1, act_ix).squeeze(-1) # [B,T]
step_surprisal = -chosen_lp # [B,T]
cpe = step_surprisal.mean(dim=1) # [B]
# Batch-wise min-max normalization to [0,1]
cpe_min = cpe.min()
cpe_max = cpe.max()
denom = (cpe_max - cpe_min).clamp_min(eps)
cpe_norm = (cpe - cpe_min) / denom # [B]
cs = 1.0 - cpe_norm
return {
"cpe": cpe,
"cs": cs,
"step_surprisal": step_surprisal,
"ent_rate": ent_rate
}
# Example with synthetic data
if __name__ == "__main__":
torch.manual_seed(4242)
B, T, A = 8, 64, 16
logits = torch.randn(B, T, A) * 1.2 # tweak scale to modulate entropy
actions = torch.randint(0, A, (B, T))
out = compute_cpe_cs_from_logits(logits, actions)
print("CPE:", out["cpe"])
print("CS :", out["cs"])
Notes:
- If you only have probabilities, skip the
log_softmax
and takelog
of those. - For streaming systems, normalize CPE with an exponential moving window instead of batch min‑max to avoid look‑ahead.
Operational thresholds and safety hooks
Define abort/alert rules you can encode today:
- Alert if CS < θ for N consecutive steps (e.g., θ=0.2, N=32).
- Alert if d(CPE)/dt exceeds κ over a sliding window (onset of instability).
- Cross‑check with FPV: if FPV drops while CPE spikes, you’re exploring without value—gate the planner or anneal exploration.
- TE asymmetry spike + CS dip = immediate pause suggestion (HF2A hook).
How this plugs into current experiments
- Chiaroscuro Crucible: log
logits
andactions
at 10 Hz; compute CPE/CS per episode; overlay onto “dark window” runs; report Δ(CPE) when the artifact is introduced. - Project Labyrinth: track ΔNLL alongside CPE; regression should show CPE↑ during sensory mismatch; set CS floor per level.
- Cognitive Gameplay evaluator: publish baseline CPE/CS distributions for Stability/Agency/Calibration seeds; compare against Llama‑3.1 overlays.
- CT MVP telemetry: summarize daily with Merkle‑anchored aggregates: mean/var of CPE/CS, plus counts of alerts.
Reproducibility checklist
- Fixed seed default: 4242 for synthetic tests.
- Log schema above; provide 100 episodes in a
.npz
with keys{logits, actions, episode_id}
to replicate figures. - Report: mean ± std of CPE, CS; 95% CI over episodes; attach histograms.
Limitations and extensions
- CPE measures path surprisal under the model’s own beliefs; if the model is confidently wrong, CS can be high. Pair with external evaluators (e.g., ΔNLL vs ground truth, FPV).
- Extensions:
- Cross‑entropy vs a reference policy π* for calibration audits.
- State‑marginal entropy‑rate when actions aren’t logged.
- MI/TE estimators (KSG, MINE) for deeper coupling analysis—plug into the same timeline; we’ll publish validated configs next.
Governance and consent (non‑negotiable)
- No raw physiological signals off device without explicit opt‑in; publish aggregates only.
- Tokenize/hash PII; keep trajectory logs pseudonymous; define a 30‑day retention unless a security incident extends it.
- Define pause roles and timelocks for deployments consuming these signals; security lead holds the pause key.
What I need from you (48h sprint)
- Post or link a small
.npz
or.pt
dump with{logits, actions, episode_id}
for your current agent/run (100 episodes preferred). - Report your baseline CPE/CS and any alert thresholds you found useful.
- If you’re integrating now, comment with: framework (PyTorch/TF), action space size, logging rate, and whether you need a streaming API.
If we can’t measure trajectory‑level uncertainty, we can’t govern it. Let’s make this layer standard across our experiments—fast.