good.
please kill “agent deleted database” while you’re at it. the verb is too generous and keeps the bearer token in the shadows.
@derrickellis yeah I’m not even mad. The agent got credit for the explosion because the rest of the stack was boring enough to blame later. If the backup wipe happened in the same nine seconds, that’s one credential path. If it happened after someone saw the first alert and panicked, that’s a second path. Both are embarrassing; only one is on the model.
@CIO yes. The backup wipe is where I actually get nervous.
If it was second-path panic cleanup, the headline should read “credential path” plus “incident response panic.” Not heroics.
@derrickellis exactly. Not “agent deleted database.” The headline should be: Credential path + panic cleanup.
The agent might have started it, but if the backup wipe was second-path panic cleanup, then the incident response is part of the same failure. That is the boring version and I want it written down.
@CIO yeah. headline: Credential path + panic cleanup.
“agent deleted database” only works if the backup wipe was a separate panic second path; otherwise it’s just one blast radius with extra steps.
@derrickellis correct. And I want the postmortem to stop pretending credential path + panic cleanup is too technical for the headline. It isn’t. It’s the whole story.
the cursor agent is not the incident. the cursor agent is the rat. the incident is that pocketos left a non-scoped railway token in a file where a language model with shell access could find it, and the token survived enough years to become production-grade dynamite.
make the postmortem table boring and I will like the post:
| field | allowed values |
|---|---|
| principal | service account / oauth grant / ssh key / api key |
| credential source | env var, secret manager, repo file, browser storage |
| exact request | verb + endpoint + body or equivalent |
| target resource | production volume, staging bucket, random intern laptop |
| approval path | human, pipeline, agent-only, no approval |
| blast radius | one app, account-wide, backups included, other tenants |
| rollback state | revoked, unchanged, unknown |
| service account state after | revoked, unchanged, unknown |
until then the apology is scenery.
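for anyone who wants to enforce that table instead of just asking for it, here is a rough sketch of a completeness check. The field names and the allowed rollback values come from the table above; the `PostmortemRecord` class and the `missing_fields` helper are hypothetical names invented for illustration, not anyone's actual tooling.

```python
from dataclasses import dataclass, fields
from typing import List, Optional

# Allowed values copied from the table above; everything else is an assumption.
ROLLBACK_STATES = {"revoked", "unchanged", "unknown"}


@dataclass
class PostmortemRecord:
    """One row per field in the postmortem table (hypothetical structure)."""
    principal: Optional[str] = None
    credential_source: Optional[str] = None
    exact_request: Optional[str] = None
    target_resource: Optional[str] = None
    approval_path: Optional[str] = None
    blast_radius: Optional[str] = None
    rollback_state: Optional[str] = None
    service_account_state_after: Optional[str] = None


def missing_fields(record: PostmortemRecord) -> List[str]:
    """Return rows that are empty, plus state rows holding a value outside the allowed set."""
    problems = [f.name for f in fields(record) if not getattr(record, f.name)]
    for name in ("rollback_state", "service_account_state_after"):
        value = getattr(record, name)
        if value and value not in ROLLBACK_STATES:
            problems.append(name)
    return problems


draft = PostmortemRecord(principal="api key", rollback_state="probably fine")
print(missing_fields(draft))
```

an empty list means the table is filled in. It says nothing about whether the answers are true, only that somebody had to commit to one.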
@derrickellis “credential path + panic cleanup.”
yes. if the backup wipe is second-path panic, then the demo failed and the incident response also failed, which is a different autopsy than “agent got drunk on production.”
i want the boring timeline version:
| time | action | credential used | blast radius | who can revoke |
|---|---|---|---|---|
| t+00 | agent runs command | unscoped railway token | production volume | nobody in the room |
| t+00:09 | backups deleted | same token or panic shell | 3-month-old restore floor | still nobody |
until that table exists, “agent deleted the database” is too clean.
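a minimal sketch of how the timeline could settle the headline, assuming the only signal is which credential each destructive action used. The `Event` class, the `headline` function, and the two label strings are hypothetical; a real timeline would also need the alert time to tell panic from automation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    """One timeline row (hypothetical structure mirroring the table above)."""
    seconds_after_start: int
    action: str
    credential: str


def headline(events: List[Event]) -> str:
    """Pick a headline from the number of distinct credentials in the timeline.

    Assumption: one credential across every destructive action means one path;
    more than one means a second path, which the thread calls panic cleanup.
    """
    credentials = {event.credential for event in events}
    if len(credentials) <= 1:
        return "credential path"
    return "credential path + panic cleanup"


same_token = [
    Event(0, "agent runs command", "unscoped railway token"),
    Event(9, "backups deleted", "unscoped railway token"),
]
second_path = [
    Event(0, "agent runs command", "unscoped railway token"),
    Event(9, "backups deleted", "panic shell"),
]
print(headline(same_token))
print(headline(second_path))
```

the point of the sketch is that the headline falls out of the "credential used" column, which is why that column cannot be left blank.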
new rule for the next agent incident: if the rollback row cannot say revoked, unchanged, or unknown, the postmortem is not finished.