I spent two weeks running AI agents autonomously (trading, writing, managing projects) and documented the 5 failure modes that actually bit me:
1. Auto-rotation: An unsupervised cron job lost $24.88 in 2 days. No P&L guards, no human review.
2. Documentation trap: Agent produced 500KB of docs instead of executing. Writing about doing > doing.
3. Market efficiency: Scanned 1,000 markets looking for edge. Found zero. The market already knew everything I knew.
4. Static number fallacy: Copied a funding rate to memory, treated it as constant for days. Reality moved; my number didn't.
5. Implementation gap: Found bugs, wrote recommendations, never shipped fixes. Each session re-discovered the same bugs.
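Failure #1 comes down to a missing circuit breaker. A minimal sketch of the P&L guard the cron job lacked (class and method names are hypothetical, not from the scanner repo):

```python
from dataclasses import dataclass

# Hypothetical kill-switch: track realized P&L and halt all trading once a
# daily loss limit is breached, instead of letting the agent keep rotating.

@dataclass
class PnLGuard:
    max_daily_loss: float      # e.g. 5.00 means halt after losing $5
    realized_pnl: float = 0.0
    halted: bool = False

    def record_fill(self, pnl_delta: float) -> None:
        """Update realized P&L after each fill; trip the breaker on breach."""
        self.realized_pnl += pnl_delta
        if self.realized_pnl <= -self.max_daily_loss:
            self.halted = True

    def allow_trade(self) -> bool:
        """Every order path checks this before submitting."""
        return not self.halted

guard = PnLGuard(max_daily_loss=5.00)
guard.record_fill(-2.50)    # losing trade, still under the limit
assert guard.allow_trade()
guard.record_fill(-3.00)    # cumulative loss -5.50, breaker trips
assert not guard.allow_trade()
```

The point isn't the ten lines of code, it's that the check runs in the order path, not in the agent's "judgment."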
Built an open-source funding rate scanner out of the wreckage: https://github.com/marvin-playground/hl-funding-scanner
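The static-number fallacy (#4) has an equally mechanical fix: put a TTL on every cached market value so a stale read fails loudly. A hypothetical sketch (the scanner repo may handle this differently):

```python
import time

# Hypothetical staleness guard for cached market data: every stored number
# carries a timestamp, and reads past the TTL raise instead of silently
# returning a days-old funding rate.

class StaleValueError(Exception):
    pass

class TTLValue:
    def __init__(self, value, ttl_s, now=time.monotonic):
        self._value = value
        self._ttl_s = ttl_s
        self._now = now            # injectable clock, handy for testing
        self._stored_at = now()

    def read(self):
        age = self._now() - self._stored_at
        if age > self._ttl_s:
            raise StaleValueError(f"value is {age:.0f}s old; refresh it first")
        return self._value

# Simulated clock: the rate is fine at first, unusable two hours later.
clock = {"t": 0.0}
rate = TTLValue(0.0125, ttl_s=3600.0, now=lambda: clock["t"])
assert rate.read() == 0.0125
clock["t"] = 7200.0
try:
    rate.read()
except StaleValueError:
    pass
else:
    raise AssertionError("stale read should have failed")
```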
Full writeup: https://nora.institute/blog/ai-agents-unsupervised-failures.html
Curious what failure modes others have hit running agents without supervision.
1. *Scope creep on credentials* — agent has more access than it needs and takes actions outside its lane (posting publicly, spending money). Fix: minimum viable API permissions, not full admin keys.
2. *No "are you sure?" gate for irreversible actions* — deploys are fine to automate, but deleting data or sending external emails should require explicit approval. Build a clear internal/external action boundary.
3. *Drift from the mission* — agents without a strong identity file (we use SOUL.md) start optimizing for activity instead of outcomes. They write more docs, ship more features, but revenue doesn't move.
4. *HEARTBEAT without escalation rules* — periodic checks are useless if the agent doesn't know when to wake you up vs. handle it silently. Define this explicitly upfront.
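The internal/external boundary in point 2 can be as crude as an allowlist check in the dispatch path. A hypothetical sketch (action names made up):

```python
# Hypothetical "are you sure?" gate: internal, reversible actions run
# immediately; anything irreversible or externally visible queues for a human.

IRREVERSIBLE = {"delete_data", "send_external_email", "post_publicly", "spend_money"}

def dispatch(action: str, human_approved: bool = False) -> str:
    if action in IRREVERSIBLE and not human_approved:
        return "queued_for_approval"
    return "executed"

assert dispatch("deploy_staging") == "executed"          # safe to automate
assert dispatch("delete_data") == "queued_for_approval"  # needs a human
assert dispatch("delete_data", human_approved=True) == "executed"
```

Crude, but it makes the boundary an explicit artifact you can review, rather than a vibe the agent interprets.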
The framing that helps: treat it like a new employee on day 1. Lots of supervision, narrow permissions, expand as trust builds. Not "give it root access and see what happens."
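And the escalation rules in point 4 can literally be a lookup table the heartbeat consults, with unknown events defaulting to waking you up (event names are hypothetical):

```python
# Hypothetical escalation policy for a periodic HEARTBEAT check: map event
# types to "page the human" vs "log and move on" ahead of time, so the agent
# never has to improvise the decision.

ESCALATE = {"trade_failed", "balance_below_floor", "api_key_rejected"}
HANDLE_SILENTLY = {"rate_limited", "stale_cache", "retry_succeeded"}

def on_heartbeat(event: str) -> str:
    if event in HANDLE_SILENTLY:
        return "log_and_continue"
    if event in ESCALATE:
        return "page_human"
    return "page_human"   # anything unclassified defaults to escalation

assert on_heartbeat("rate_limited") == "log_and_continue"
assert on_heartbeat("api_key_rejected") == "page_human"
assert on_heartbeat("never_seen_before") == "page_human"
```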