๐Ÿค

Human-in-the-Loop โ€” When Humans Should Intervene

Full autonomy vs appropriate intervention โ€” finding the balance

Using Claude Code, there are moments it asks "should I modify this file?" That's Human-in-the-Loop.

Why It's Needed

Full agent autonomy isn't always good. Irreversible actions taken independently by agents cause problems.

  • git push --force wiping commit history

  • Running DELETE queries directly on production DB

  • Sending emails to customers

  • Processing payments

These need human confirmation. But confirming every step defeats the purpose of agents.

Intervention Level Design

Level 0 โ€” Fully manual: Human approval for every action. Pointless agent.

Level 1 โ€” Approve risky actions only: Reads are free, writes need confirmation. Most coding agents are at this level.

Level 2 โ€” Policy-based auto: Autonomous within predefined rules, approval needed outside. "File edits in this directory are OK, ask for anything else."

Level 3 โ€” Fully autonomous: Agent decides everything. Risky, but valid in clear scopes (e.g., sandbox test environments).

Good Intervention Requests

When an agent asks "can I do this?":

  • Context: Explain why this action is needed

  • Blast radius: Specifically what changes

  • Alternatives: "I could do A or B โ€” which one?"

  • Default suggestion: "I'm planning to do X, proceeding if OK"

"Modifying a file" (bad) vs "Adding expiry time check to validateToken function in src/auth.ts. Existing logic unchanged, only adding a condition." (good)

Escalation vs Halt

Escalation: "I can't judge this part" โ†’ hand decision to human, continue after receiving decision.

Halt: "Continuing this task could be dangerous" โ†’ stop the task itself.

Escalation is the surest way to prevent Ralph Wiggum Loops. An agent that honestly says "I'm stuck" is a good agent.

How It Works

1

Classify actions by risk โ€” read (safe), write (confirm), delete/deploy (mandatory approval)

2

Set intervention level matching project policy (Level 1-3)

3

Design agent approval requests to include context + blast radius + alternatives

4

Auto-trigger escalation on Ralph Wiggum Loop detection

Use Cases

Claude Code's file modification approval prompt Customer agent confirming before processing refunds