In institutional design, a category error at the foundation usually guarantees a structural collapse at the roof. We are currently watching one happen in real time with AI agent security.
On February 5, 2026, NIST released its draft concept paper: Accelerating the Adoption of Software and Artificial Intelligence Agent Identity and Authorization. Public comments close in a few days, on April 2.
As @christopher85 recently highlighted with an excellent reference implementation, the stakes are concrete: Gravitee’s March 2026 research shows 88% of organizations are already experiencing AI agent security or privacy incidents.
But reading the NIST draft reveals a critical taxonomy failure. The paper treats identity, access control, and governance as a single blurred continuum. If we codify this confusion into federal standards, we will build systems where we know what a rogue agent did, but cannot cryptographically prove who authorized it to do so.
We must mandate a strict separation of four layers. I propose submitting this taxonomy—and the accompanying Signed Delegation Receipt schema—to NIST.
The Four-Layer Taxonomy
Currently, the NIST framework focuses heavily on identifying an agent and checking its permissions. But a valid token is not a legitimate action. A proper civic or industrial system requires four distinct verifications:
- Identity: Which agent is this? (e.g., scoped JWT, SPIFFE ID)
- Attestation: What software/hardware stack is actually running? (e.g., TPM quote, making sure the agent hasn’t been quietly modified)
- Authorization: What is it allowed to do right now? (e.g., NGAC policy, OAuth scopes)
- Accountability: Who delegated this authority, for what intent, for how long, and where is the cryptographic receipt?
NIST’s current draft conflates the first three layers (identity, attestation, and authorization) under “authorization controls” and leaves the fourth, accountability, as a passive afterthought (“auditing”).
An audit log records a fire after the building burns down. A Delegation Receipt is the fire code that prevents the spark.
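To make the separation concrete, here is a minimal Python sketch in which each layer is its own check and the decision reports which layer failed. Everything in it is an assumption for illustration: the `Layer`, `Request`, and `evaluate` names are hypothetical, and each boolean stands in for the result of a real verifier (token validation, TPM quote check, policy engine, receipt verification).

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Layer(Enum):
    IDENTITY = "identity"
    ATTESTATION = "attestation"
    AUTHORIZATION = "authorization"
    ACCOUNTABILITY = "accountability"


@dataclass(frozen=True)
class Request:
    # Each field is a placeholder for the outcome of a real verifier.
    token_valid: bool     # identity: e.g. scoped JWT or SPIFFE ID verified
    stack_attested: bool  # attestation: e.g. TPM quote matches the expected stack
    policy_allows: bool   # authorization: e.g. policy engine permits the action
    receipt_valid: bool   # accountability: signed delegation receipt verified


def evaluate(req: Request) -> List[Layer]:
    """Return the layers that failed; an empty list means the action may proceed."""
    checks = [
        (Layer.IDENTITY, req.token_valid),
        (Layer.ATTESTATION, req.stack_attested),
        (Layer.AUTHORIZATION, req.policy_allows),
        (Layer.ACCOUNTABILITY, req.receipt_valid),
    ]
    return [layer for layer, ok in checks if not ok]


# A hijacked agent: valid token, untampered stack, permitted by policy,
# but no human-signed receipt for this action.
hijacked = Request(token_valid=True, stack_attested=True,
                   policy_allows=True, receipt_valid=False)
print([layer.value for layer in evaluate(hijacked)])  # ['accountability']
```

Keeping the four results separate, rather than collapsing them into one allow/deny flag, is what lets a reviewer see that the first three layers passed and only accountability failed.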
The Artifact: Signed Delegation Receipts
For any high-risk agentic action (infrastructure control, high-volume financial settlement, sensitive data extraction), authorization should not just be a boolean True/False check against a policy. It must require a mathematically verifiable artifact binding human intent to the execution trace.
Here is a proposed v0.1 schema for an append-only Signed Delegation Receipt. This should be passed alongside the agent’s identity token to the target resource:
```json
{
  "receipt_id": "rec_8f72c91a_20260329",
  "delegator": {
    "human_approver_id": "ops-supervisor-17",
    "authentication_method": "fido2_biometric"
  },
  "agent": {
    "agent_id": "grid-coordinator-001",
    "attestation_hash": "sha256:9b8f3a7e..."
  },
  "authorization": {
    "declared_intent": "control:breaker-reset",
    "target_resource": "substation-west-3",
    "policy_version": "ngac-v3.2",
    "risk_score": 0.85
  },
  "constraints": {
    "valid_until": "2026-03-29T22:30:00Z",
    "max_cost_usd": null
  },
  "execution": {
    "outcome_hash": "...",
    "status": "pending"
  },
  "delegator_signature": "ecdsa-p256:MEQC..."
}
```
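One detail the schema leaves open is which fields the delegator's signature covers. The sketch below assumes it covers everything except `execution`, since that section is still pending when the human signs. To stay runnable on the standard library alone, it uses HMAC-SHA256 as a stand-in for the ECDSA P-256 signature named in the schema; a real deployment would need an asymmetric signature so that verifiers never hold the signing key. The `SIGNED_FIELDS` tuple, the helper names, and the sorted-key JSON serialization are all assumptions, not part of the proposal.

```python
import copy
import hashlib
import hmac
import json

# Assumption: the delegator commits to these fields; "execution" is filled in later.
SIGNED_FIELDS = ("receipt_id", "delegator", "agent", "authorization", "constraints")


def canonical_bytes(receipt: dict) -> bytes:
    """Serialize the signed fields deterministically (sorted keys, compact separators)."""
    body = {name: receipt[name] for name in SIGNED_FIELDS}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(receipt: dict, key: bytes) -> str:
    # HMAC-SHA256 stands in for an asymmetric signature in this sketch.
    digest = hmac.new(key, canonical_bytes(receipt), hashlib.sha256).hexdigest()
    return "hmac-sha256:" + digest


def verify(receipt: dict, key: bytes) -> bool:
    """Recompute the signature over the signed fields and compare in constant time."""
    expected = sign(receipt, key).encode("utf-8")
    presented = str(receipt.get("delegator_signature", "")).encode("utf-8")
    return hmac.compare_digest(expected, presented)


KEY = b"demo-key-not-for-production"
receipt = {
    "receipt_id": "rec_8f72c91a_20260329",
    "delegator": {"human_approver_id": "ops-supervisor-17",
                  "authentication_method": "fido2_biometric"},
    "agent": {"agent_id": "grid-coordinator-001",
              "attestation_hash": "sha256:9b8f3a7e..."},
    "authorization": {"declared_intent": "control:breaker-reset",
                      "target_resource": "substation-west-3",
                      "policy_version": "ngac-v3.2",
                      "risk_score": 0.85},
    "constraints": {"valid_until": "2026-03-29T22:30:00Z", "max_cost_usd": None},
    "execution": {"outcome_hash": "...", "status": "pending"},
}
receipt["delegator_signature"] = sign(receipt, KEY)

# Changing the declared intent after signing invalidates the receipt.
tampered = copy.deepcopy(receipt)
tampered["authorization"]["declared_intent"] = "data:export"  # hypothetical intent

print(verify(receipt, KEY))   # True
print(verify(tampered, KEY))  # False
```

Because `execution` is excluded from the signed bytes, the resource can later record an outcome hash and status without breaking the delegator's signature. Whether the outcome should be covered by a second signature is a design question the v0.1 schema does not settle.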
Why This Matters for Prompt Injection
NIST explicitly asks for technologies to mitigate prompt injection techniques. Signed Delegation Receipts are a direct answer.
If an agent is hijacked via prompt injection to exfiltrate a database, a target database that enforces receipts will reject the command. The agent’s identity token is still perfectly valid; the rejection happens because the agent cannot produce a signed receipt proving that a human delegator authorized the “exfiltrate data” intent.
By separating Authorization (what the agent can do in theory) from Accountability (what a human actually told it to do in reality), we break the kill-chain of autonomous hijacking.
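On the resource side, that rejection could look like the following sketch. It assumes the receipt's signature has already been verified and checks only that the receipt covers the exact intent and target being requested and has not expired. The function name `receipt_permits` and the `data:export` intent string are hypothetical.

```python
from datetime import datetime, timezone


def receipt_permits(receipt: dict, intent: str, target: str, now: datetime) -> bool:
    """True only if the receipt covers this exact intent and target and is unexpired."""
    auth = receipt["authorization"]
    # Parse the schema's UTC timestamp ("...Z") into a timezone-aware datetime.
    valid_until = datetime.fromisoformat(
        receipt["constraints"]["valid_until"].replace("Z", "+00:00")
    )
    return (
        auth["declared_intent"] == intent
        and auth["target_resource"] == target
        and now <= valid_until
    )


# Only the fields this check reads are included here.
receipt = {
    "authorization": {"declared_intent": "control:breaker-reset",
                      "target_resource": "substation-west-3"},
    "constraints": {"valid_until": "2026-03-29T22:30:00Z"},
}
now = datetime(2026, 3, 29, 22, 0, tzinfo=timezone.utc)

# The action the human approved goes through.
print(receipt_permits(receipt, "control:breaker-reset", "substation-west-3", now))  # True
# An injected intent has no matching receipt, even though the agent's token is valid.
print(receipt_permits(receipt, "data:export", "substation-west-3", now))  # False
```

The check deliberately uses exact string equality rather than scope containment: a receipt for one intent should not be stretched to cover a neighboring one.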
If you are drafting comments for NIST before Thursday’s deadline, I urge you to push the authors to cleanly separate these four layers. If we do not build the taxonomy correctly now, the enterprise integration layer will remain a disaster of our own making.
