I mostly stopped using Ralph

January 15, 2026

I open-sourced a Ralph boilerplate, iterated on it for some time, and now I barely use the autonomous part. While different modes are useful for different cases, I don’t find the Ralph loop useful very often.

If you’re not familiar: Geoffrey Huntley coined the Ralph Wiggum Technique, which boils down to `while :; do claude ; done`. Claude runs, finishes, gets relaunched with a fresh context. No state between iterations. Bash does the orchestration, Claude does the work.

The official CC plugin misses the point

Claude Code has an official Ralph plugin. You invoke it with /ralph-loop "your prompt" --max-iterations 10. It looks simpler. But check the stop hook source:

jq -n \
    --arg prompt "$PROMPT_TEXT" \
    --arg msg "$SYSTEM_MSG" \
    '{
        "decision": "block",
        "reason": $prompt,
        "systemMessage": $msg
    }'

"decision": "block" prevents Claude from exiting. "reason" re-injects the prompt. This all happens inside the same session. Context accumulates. Claude remembers what it did in iteration 3 when it’s working on iteration 7, including all the bad decisions it made.

This is context pollution. A bloated context doesn’t just cost more tokens, it degrades output quality. The more noise in the window, the worse Claude gets at following instructions. You’re paying more for worse results.

If Claude made a bad assumption in iteration 2, the plugin carries it forward. The bash loop doesn’t: each iteration only sees what’s on disk. The tradeoff is that the bash loop is stateless. Claude can’t remember decisions unless it wrote them to a file. That’s why progress.txt exists: it’s the explicit, auditable state.

My setup

At first, my naive boilerplate was only a single script, but I ended up with two.

Autonomous: `ralph.sh`

Give it a number of iterations, walk away. Each one spawns claude -p with the PRD and progress file, plus non-interactive rules baked into the prompt.

One task per iteration. Claude picks the highest-priority incomplete item, implements it, commits, exits. Fresh context next time. This prevents the drift that happens when Claude tries to juggle five tasks at once.

BLOCKED instead of guessing. If a task is ambiguous, Claude marks it BLOCKED and moves on instead of building the wrong thing for three iterations before you notice.

<promise>COMPLETE</promise> as exit signal. When nothing’s left, the loop stops.

The only thing here that differs from other Ralph scripts you’ll see is the gate in the prompt: it tells Claude not to ask any questions and explicitly states that this is a non-interactive session.
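The pieces above (one task per iteration, BLOCKED, the COMPLETE signal, the non-interactive gate) fit into a loop roughly like the one below. This is a minimal sketch, not the actual ralph.sh: the function names `build_prompt` and `ralph_loop`, the prompt wording, and the assumption that blocked tasks show up as lines containing BLOCKED in progress.txt are all mine.

```bash
# Sketch of an autonomous Ralph loop. File names and prompt wording are
# assumptions for illustration, not the boilerplate's actual contents.
PRD_FILE="${PRD_FILE:-PRD.md}"
PROGRESS_FILE="${PROGRESS_FILE:-progress.txt}"

# Assemble the prompt from disk only, so every iteration starts from the
# same explicit state and nothing carries over from a previous session.
build_prompt() {
  cat <<EOF
This is a NON-INTERACTIVE session. Nobody can answer questions, so do not ask any.
Pick the single highest-priority incomplete task from the PRD, implement it,
commit, and append a note to ${PROGRESS_FILE}.
If a task is ambiguous, mark it BLOCKED in ${PROGRESS_FILE} and move on.
If no tasks remain, print <promise>COMPLETE</promise>.

--- PRD ---
$(cat "$PRD_FILE")

--- PROGRESS ---
$(cat "$PROGRESS_FILE" 2>/dev/null)
EOF
}

# Run up to $1 iterations; each one is a fresh `claude -p` process.
ralph_loop() {
  local max="${1:-10}" i=1 output
  while [ "$i" -le "$max" ]; do
    echo "== iteration $i/$max =="
    output="$(claude -p "$(build_prompt)" --permission-mode acceptEdits)"
    printf '%s\n' "$output"
    # Stop as soon as Claude reports that nothing is left to do.
    case "$output" in
      *"<promise>COMPLETE</promise>"*)
        echo "== complete after $i iteration(s) =="
        break
        ;;
    esac
    i=$((i + 1))
  done
  # Surface anything that was parked instead of guessed at.
  grep -n 'BLOCKED' "$PROGRESS_FILE" 2>/dev/null || true
}
```

Calling `ralph_loop 10` from the project root would run at most ten iterations. Rebuilding the prompt on every pass is what keeps the loop stateless: whatever Claude did not write to disk is gone by the next iteration.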

Interactive: `run.sh`

Two Claude sessions (Sonnet and Opus), back to back, both using PRD.md and progress.txt as well. The first (Sonnet) runs /run, a custom slash command that reads the PRD, picks a task, and implements it. It can ask questions, use skills, and dispatch subagents. The second runs /review with Opus: it looks at unpushed commits, reviews them for bugs and code smells, fixes what it finds, and pushes.

The implement-then-review split matters. When the same session writes and reviews its own code, it’s either blind to its own mistakes or tries really hard to find something, anything. A separate session with a fresh context (and a stronger model) catches things the first one missed.

/run wires in everything I’ve built up over months: skills for brainstorming, planning, test-driven development, debugging, and visual verification via Playwright MCP. In reality, it’s not my invention. From what I can see, that’s the industry standard these days, as superpowers became wildly popular.

Both sessions run with --permission-mode acceptEdits so they don’t block on every file write.
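As a rough sketch of the two-session flow, assuming the `/run` and `/review` slash commands already exist in the project and that a slash command can be passed as the initial prompt, something like this would do. The function name `run_and_review` is hypothetical, and skipping the review when the first session exits non-zero is my choice, not necessarily what run.sh does.

```bash
# Sketch of the implement-then-review pair. Passing a slash command as the
# initial prompt is an assumption; adjust to your own setup.
run_and_review() {
  # Session 1: implement. Interactive, so Claude can ask questions.
  claude --model sonnet --permission-mode acceptEdits "/run" || return "$?"

  # Session 2: a separate process, so the reviewer starts with a fresh
  # context and sees the work only through the repository.
  claude --model opus --permission-mode acceptEdits "/review"
}
```

The point of two processes rather than one long session is that the reviewer inherits nothing from the implementer except what landed in commits, PRD.md, and progress.txt.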

How Ralph actually dies

The loop itself never crashes. Everything around it does.

Playwright and the wrong DOM. Claude writes a test, the selector doesn’t match, the test fails, Claude “fixes” the selector, and it still doesn’t match, because the DOM it’s seeing is hallucinated, not the DOM that exists. This burns through iterations with zero progress, and there’s no way to oversee it: everything runs in the background with no live inspection. Save Playwright for interactive mode where you can steer.

Permission hangs. Claude spawns a background agent that needs permission for a bash command, a file outside allowed paths, or a fetch to a new domain. No human to approve it. The iteration hangs forever. --permission-mode acceptEdits helps but doesn’t cover everything. There’s no timeout, no fallback, it just waits. And allowing it to fetch anything it wants is slightly dangerous.

Silent credit exhaustion. You come back expecting 10 commits and find the loop frozen mid-iteration because your API credits ran out. Claude doesn’t error. It hangs. No signal to catch. Just check /usage and come back later.
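Both hangs share one shape: an iteration that never returns. One way to turn that into a visible failure is to bound each iteration from the outside. The sketch below is not part of my boilerplate. It assumes GNU coreutils `timeout` is installed (on macOS it typically ships as `gtimeout`), and `run_bounded` is a hypothetical helper name.

```bash
# Sketch: bound one iteration so a hang becomes a visible failure.
# Assumes GNU coreutils `timeout`, which exits with 124 when the limit is hit.

# Usage: run_bounded <limit> <command> [args...]
run_bounded() {
  local limit="$1" status=0
  shift
  timeout "$limit" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "iteration exceeded ${limit}; treating it as hung" >&2
  fi
  return "$status"
}
```

Inside a loop, `run_bounded 30m claude -p "$PROMPT"` (with `$PROMPT` standing in for whatever prompt you build) would give up after thirty minutes instead of waiting forever. It doesn’t tell you why the iteration hung, only that it did, so you still have to check /usage or the permission settings yourself.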

Why I mostly stopped

Once I ran ./ralph.sh 50 and went to bed. I was not happy with the result, and it wasn’t obvious what actually had to be fixed. If there’s a small drift with each commit, it compounds.

Me running Ralph overnight (The Simpsons, "Last Exit to Springfield")

Here’s the thing nobody in the Ralph hype cycle talks about: plan mode is the single most valuable feature of Claude Code for complex tasks and planning with superpowers gives the best results, in my experience. Yet, it’s incompatible with autonomous mode. Plan mode requires human approval. Ralph is non-interactive. So the very thing that makes Claude Code good at hard work can’t run without a human in the loop.

Without a strict verification pipeline, Ralph produces code that compiles and passes checks but misses the point. As someone on HN put it: “validation is the bottleneck. You need to make sure the functionality matches intent rather than just being ‘functional.’”

The deeper problem is that catching these failures requires experience. The tests pass because they test what Claude built, not what you needed. The same tool that needs oversight is being trusted to verify its own work.

Autonomous mode works for dependency updates, simple bootstrapping, and tight specs where there’s nothing to interpret. That’s a small slice of real work. The moment a task requires judgment about architecture or domain logic, you want a human in the loop. Anyone telling you Ralph can autonomously build production software for $10/hour is selling something or riding the hype train.

I mostly use run.sh now. Interactive mode with /run and /review lets Claude plan, ask questions, and use the full skill pipeline.

Why bash

The whole thing, both modes, both slash commands, is about 50 lines of bash and 30 lines of markdown. You can read it in five minutes. You can change anything. You don’t need a gigantic repo like https://github.com/frankbria/ralph-claude-code for that.

Ralph is a while loop. The technique is trivial, and there are dozens of similar approaches that produce similar results. The weird name and an initial group of evangelists turned it into a meme, and the meme took on a life far beyond the substance. That’s fine, memes are how ideas spread, but don’t mistake the hype for depth.

Build your own ralph and see if it works for you. Mine is at tomwojcik/claude-code-ralph-boilerplate if you want a starting point, but treat it as a dotfile. Everyone has their own.