😅

Ralph Wiggum Loop — When Agents Repeat the Same Mistake

"I'm in danger" — causes and fixes for agents stuck in loops

You ask a coding agent to fix a bug, and this happens:

Edits code
Runs test → fails
Slightly different edit to same code
Runs test → same error
Another similar edit
... (10 iterations, $3 in tokens, zero progress)

That's the Ralph Wiggum Loop.

Why It Happens

LLMs don't count attempts: Even with failure records piling up in message history, LLMs often can't realize "I've tried the same approach 5 times, I should try something different."

Context window limits: As history grows, reasons for early failures get truncated. The LLM "forgets" past attempts and repeats mistakes.

Insufficient error messages: If test errors only return "assertion failed," the LLM can't identify the cause. Lack of info → same guesses repeated.

Poor tool design: Tools returning only "success/failure" without context. LLM has no material to judge why it failed.

Detection

Repetition detection: If the last N tool calls use the same tool with similar parameters → loop detected → intervene.

Token budget: Set token ceiling per task. If exceeded, have the agent honestly report "I'm having difficulty solving this."

Progress metrics: After each loop iteration, evaluate "what changed?" No change → break the loop.

Prevention Strategies

Explicit in system prompt: "If the same approach fails 2+ times, try a different strategy" "If stuck, ask the user for help"

Failure history summary: Every N iterations, inject a summary of "what I've tried and results" to the LLM. Prevents losing context.

Escalation mechanism: After N failures → report to user: "I'm stuck here. Tried A, B, C but..." and hand over judgment.

Better tool results: Include possible causes, related files, and approaches to try in error messages.

How It Works

1

Add repetition detection to agent loop (compare last N tool calls)

2

Set token/iteration ceiling per task

3

Specify "2 same failures → change strategy" rule in system prompt

4

On loop detection, escalate — report situation to user

Use Cases

When coding agent repeatedly retries the same test error When research agent keeps searching with the same query

References

🔗 Devin's loop detection mechanism