The Coming Loop

AI
Programming
Developer Tools
Management

Ronacher’s post is a field report from someone who builds agent tooling and still does not trust the fully automated future it points toward. He separates normal “agent loop” usage, where a developer steers a model interactively, from a more hands-off “harness loop,” where software keeps prompting models, running tools, checking results, and deciding what to do next. His core claim is that this style is coming whether people like it or not, especially in places like bulk porting, security work, and organizations that care more about throughput than code comprehension. But he also says present-day models still produce code with a recognizable failure mode: too much defensive handling, weak invariants, duplicated logic, local fixes, and abstractions that make the system harder to reason about.
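The "harness loop" described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not Ronacher's tooling or any real product: `harness_loop`, `propose`, and `check` are hypothetical names, and the model call and the test runner are passed in as plain callables so the sketch stays self-contained.

```python
from typing import Callable


def harness_loop(
    propose: Callable[[str], str],
    check: Callable[[str], tuple[bool, str]],
    task: str,
    max_iterations: int = 5,
) -> tuple[bool, str, int]:
    """Prompt, check, and re-prompt until the check passes or the budget runs out.

    `propose` stands in for a model call and `check` for a test run or evaluator.
    Both are assumptions made for this sketch.
    """
    prompt = task
    candidate = ""
    for iteration in range(1, max_iterations + 1):
        candidate = propose(prompt)
        passed, feedback = check(candidate)
        if passed:
            return True, candidate, iteration
        # Feed the failure back into the next prompt; no human is consulted here.
        prompt = f"{task}\n\nPrevious attempt failed: {feedback}"
    return False, candidate, max_iterations
```

Note that the loop stops as soon as `check` is satisfied, so the output can only be as good as the check itself. That is the property the discussion keeps returning to.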

That diagnosis held up for a lot of people. The strongest pattern was that teams using these tools are no longer bottlenecked on raw implementation. They are bottlenecked on knowing what they want. Several experienced users said the outer loop only works once they have a sharp spec, clear success criteria, and tests that tell the model when it is done. In practice that shifts software work upward. More time goes into architecture, task decomposition, PR review, and refreshing specs after implementation details shake out. A few commenters welcomed that as a return to actual engineering discipline. Others pointed out that this is just Brooks’s “no silver bullet” in new clothes. The hard part never was typing CRUD.

The comments were much more skeptical about code quality under repeated looping than about the existence of the pattern itself. People described the same pathology from multiple angles: loops optimize for passing the next check, so they accrete guards, fallbacks, special cases, and line count until the code survives tests but loses shape. Python examples like pointless `hasattr` and null checks came up repeatedly. Some said this is training-data baggage from mediocre codebases. Others argued it reflects a deeper limitation. Models do not hold the whole web of assumptions in their heads, so they patch symptoms instead of designing illegal states out of existence.

Where people did report success, it was usually in tightly constrained environments. Long-running ports, repetitive SaaS patterns, scaffolded architectures, strict lint rules, behavior-driven tests, red-green workflows, and checklists gave agents enough rails to keep moving. The notable detail is that these users are not trusting model taste. They are replacing taste with process. They create narrow patterns, ratchets, evaluators, and review hooks that force acceptable output. That makes looping less like autonomous programming and more like building a factory around a mediocre but fast implementer.

A second thread running through the comments was organizational, not technical. Many people are less worried about the model than about what executives and product teams will do with it. Reviewers described being flooded with giant PRs from developers or non-developers who barely understand the code they are submitting. Several said the exhausting part is not using AI, but acting as quality control for AI-first organizations that reward visible throughput and treat scrutiny as resistance. That produced the dominant mood of the conversation: not anti-AI exactly, but tired, wary, and hostile to the hype cycle around inevitability. Even many regular users of Claude Code or Codex said they find the tools valuable while resenting the management story attached to them.

The practical consensus was blunt. Loops are real. They can be useful. They are not magic. If your specs, invariants, tests, and review culture are weak, loops will amplify the weakness. If those things are strong, loops can chew through implementation work and some refactoring at impressive speed. The human role does not disappear. It moves toward problem framing, constraint setting, and saying no when the machine’s version of “done” is not good enough.

If you are adopting agentic coding, invest first in specs, tests, architecture rules, and review process rather than in ever more orchestration. Expect management pressure to confuse output volume with progress, and build explicit quality gates before that becomes your team’s default mode.

June 23, 2026
lucumr.pocoo.org

Key insights

Specs became the real bottleneck

Once implementation is cheap, the scarce resource is a precise spec and a reviewer who can tell whether the result is acceptable. People getting strong results from Claude Code said the model can execute well when given an actionable plan, but it stalls the moment tasks depend on unresolved design choices or human review. That reframes AI coding as a shift back toward systems thinking and quality control, not an escape from them.

Put senior time into specification templates, design review, and explicit acceptance criteria. If you cannot express the task crisply enough for a junior engineer, your agent loop will not save you.

Attribution:

stillpointlab #1 #2 #3
mlsu #1
jschveibinz #1

LLMs optimize for survivable slop

The recurring smell was not random bugs but a very specific style of bad code. Models add null checks, `hasattr`, fallback branches, and half-sensible defaults because that pattern helps avoid immediate failure in messy training data. The cost is that impossible states stay possible, hidden errors get normalized, and the codebase becomes harder to reason about after every successful patch.

Strengthen types, invariants, and fail-fast behavior before scaling agent use. Review generated code for hidden fallback logic, not just for whether the tests passed.
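A minimal Python sketch of the contrast may help. The `Order` type and both functions are hypothetical, invented here for illustration, and do not come from the post or the comments. The first function shows the guard-heavy style described above. The second relies on invariants enforced once, at construction.

```python
from dataclasses import dataclass


# Hypothetical example type; the names are assumptions made for this sketch.
@dataclass(frozen=True)
class Order:
    quantity: int
    unit_price_cents: int

    def __post_init__(self) -> None:
        # Enforce the invariants once so later code can rely on them.
        if self.quantity <= 0:
            raise ValueError(f"quantity must be positive, got {self.quantity}")
        if self.unit_price_cents < 0:
            raise ValueError(
                f"unit_price_cents must be non-negative, got {self.unit_price_cents}"
            )


def total_defensive(order) -> int:
    """Guard-heavy style: every fallback turns a possible bug into a silent zero."""
    if order is None:
        return 0
    if not hasattr(order, "quantity") or not hasattr(order, "unit_price_cents"):
        return 0
    quantity = order.quantity or 0
    price = order.unit_price_cents or 0
    return quantity * price


def total_fail_fast(order: Order) -> int:
    """Relies on the invariants enforced by Order; invalid orders cannot be built."""
    return order.quantity * order.unit_price_cents
```

With the defensive version, a caller that passes `None` or the wrong object gets `0` back and the mistake goes unnoticed. With the fail-fast version, an invalid order raises `ValueError` when it is constructed, which is the kind of illegal state commenters wanted designed out.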

Attribution:

boscillator #1
jerf #1
ambicapter #1
skywhopper #1
mmillin #1 #2

Good results come from replacing taste with rails

The most successful reports were not about trusting a smarter model. They were about surrounding the model with architecture constraints, BDD tests, checklists, ratchets, hooks, and staged workflows like red-green commits. That setup can suppress a lot of slop, but it works by constraining the search space so heavily that the model is no longer making many important design decisions on its own.

Treat agent adoption like process engineering. Build narrow, reusable patterns and automated checks for the classes of work you actually want to delegate.
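One of the rails mentioned above, the ratchet, is simple enough to sketch. The version below is an assumption about what commenters meant: a tracked count (for example lint violations) may shrink but never grow. `apply_ratchet` and the sample counts are hypothetical.

```python
def apply_ratchet(baseline: int, current: int) -> tuple[bool, int]:
    """Return (passed, new_baseline) for a count that may shrink but never grow."""
    if baseline < 0 or current < 0:
        raise ValueError("violation counts must be non-negative")
    if current > baseline:
        return False, baseline  # regression: reject and keep the old ceiling
    return True, current  # equal or better: accept and tighten the ceiling


# Hypothetical lint-violation counts reported after successive agent runs.
baseline = 40
for count in [38, 38, 41, 30]:
    passed, baseline = apply_ratchet(baseline, count)
    print(f"count={count} passed={passed} baseline={baseline}")
```

The design choice is that the check encodes no taste at all. It only refuses to let a measurable property get worse, which is why it can be enforced automatically on every loop iteration.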

Attribution:

dirtbag__dad #1
furyofantares #1 #2
lifeisstillgood #1

The review burden is turning into organizational debt

Several comments made the same point from inside companies already pushing hard on agents. The pain is not only bad code. It is giant PRs, unclear ownership, and management rewarding output while discounting the effort needed to review, reject, and unwind poor decisions. In that environment, AI becomes a force multiplier for the loudest internal incentives, not for engineering judgment.

Track review time, PR size, rollback rate, and post-merge defects as first-class metrics. If leadership only measures feature throughput or token usage, your process will drift toward slop by default.
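As a rough sketch of what tracking those metrics could look like, the Python below summarizes a batch of pull request records. The `PullRequestRecord` fields and the `summarize` function are hypothetical. In practice the data would come from whatever code host and incident tracker a team already uses.

```python
from dataclasses import dataclass
from statistics import median


# Hypothetical record shape; field names are assumptions made for this sketch.
@dataclass(frozen=True)
class PullRequestRecord:
    lines_changed: int
    review_minutes: float
    rolled_back: bool
    post_merge_defects: int


def summarize(records: list[PullRequestRecord]) -> dict[str, float]:
    """Summarize review load and quality signals for a batch of merged PRs."""
    if not records:
        raise ValueError("need at least one record")
    count = len(records)
    return {
        "median_lines_changed": float(median(r.lines_changed for r in records)),
        "median_review_minutes": float(median(r.review_minutes for r in records)),
        "rollback_rate": sum(r.rolled_back for r in records) / count,
        "defects_per_pr": sum(r.post_merge_defects for r in records) / count,
    }
```

Medians are used for size and review time so that one enormous PR does not hide the typical case, though a team worried specifically about giant PRs may want to track the maximum as well.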

Attribution:

CodingJeebus #1
wavemode #1
piva00 #1 #2
skepticATX #1 #2

Judgment-heavy work does not delegate cleanly

A useful dividing line emerged between tasks that are predictable and tasks that are taste-driven. If the work is basically “do X the way we already did Y,” agents can help a lot. If success depends on aesthetic judgment, future changeability, or careful tradeoffs that are hard to spell out, the model can assist with edits but should not own the decisions. That matches the experience of people who use AI for targeted refactors but distrust autonomous code review or architecture work.

Classify tasks before handing them to agents. Use loops for repetitive, well-bounded execution and keep humans tightly involved when the job is to preserve taste, clarity, or long-term maintainability.

Attribution:

CraigJPerry #1
zahlman #1 #2
miki123211 #1

Bad code may still win in some markets

A harder-edged point was that human maintainability may stop mattering for parts of the software market. If a company can ship features faster with machine-maintained code and keep enough automated verification around it, bloated internals may be economically acceptable. Product managers already pushing code with Claude or Codex were cited as early signs of that shift. The quality bar may split by domain rather than rise or fall everywhere at once.

Decide explicitly which systems must remain human-comprehensible and which can be treated as disposable or machine-tended. Do not let that boundary emerge accidentally repository by repository.

Attribution:

illuminator83 #1
Jcampuzano2 #1
noodletheworld #1

Against the grain

Outer loops can reduce human thinking time

Not everyone accepted that understanding must come first. A couple of practitioners said they use models to analyze their own prompt history, derive higher-level guidance, and keep recursive agents improving with limited supervision. The results were mixed and not trusted for delicate codebases, but they reported meaningful speedups on games and long tasks where the agent could learn the working pattern from prior iterations.

If you have repetitive personal workflows, it may be worth mining your own sessions and encoding the pattern into prompts or harness logic. Just keep that experimentation away from high-risk systems until you can measure the failure modes.

Attribution:

dataviz1000 #1
athrowaway3z #1

Software may become more like husbandry

One commenter leaned into Ronacher’s biological metaphor rather than resisting it. The argument is that future software might be managed more like living systems, where operators learn behaviors and feedback loops without understanding internals in a source-level way. That is far from the dominant view here, but it captures a genuine alternative end state to traditional software craftsmanship.

Watch for tools and platforms that prioritize observability, steering, and behavioral control over direct code comprehension. That model will appeal in domains where source-level understanding is already secondary to runtime management.

Attribution:

livingsoft #1

Most of this may still be hype theater

A minority rejected the whole frame as internet panic and subculture performance. Their point was that most developers still use simple chat tools, token economics are distorted by subsidies, and the grand language around loops outruns what is happening in ordinary teams. From that view, the smartest move is to avoid letting AI discourse dictate your emotional state or your roadmap.

Do not reorganize your company around the most maximalist AI workflow you saw online. Validate against your own costs, your own team, and whether the tool improves actual delivery rather than narrative status.

Attribution:

sunir #1
wartywhoa23 #1
wiseowise #1

In plain English

`hasattr`

A Python built-in function that checks whether an object has a named attribute.

agent loop

An interactive workflow where a human repeatedly steers an AI coding agent step by step as it works.

BDD

Behavior-driven development, a style of testing and specification that describes software behavior in human-readable scenarios.

CRUD

Create, read, update, delete, the basic operations used in many standard business applications.

harness loop

A more automated workflow where software repeatedly prompts an AI, runs tools or tests, evaluates the results, and decides the next step with less human involvement.

PR

Pull request, a proposed set of code changes submitted for review before being merged into a shared codebase.

red-green

A test-driven workflow where you first write a failing test, then write code until the test passes.

SaaS

Software as a service, software delivered over the internet as an ongoing hosted product.

spec

Short for specification, a clear written description of what software should do and the constraints it must satisfy.

Reference links

Agent loop tools and examples

agent-tuning
A repository shared as an example of successful work with recursive agents and prompt tuning.
pizx
An example experimental looping setup built on top of Pi and zx.
loop-dev README
A repository offered as one commenter’s own writeup and examples of loop-based development.
rcarmo gist on supervising agents
Shared as a concrete description of how one team runs supervisory and auditing loops over agents.

Methodology and process references

IBM Rational Rose migration docs
Used as a reference point for older heavyweight software methodologies that some think may return in adapted form for AI agents.
Claude Code /goal documentation
Quoted to explain how Claude Code evaluates whether a long-running goal has been achieved.
Tool Response Engineering
Linked by an AI tooling researcher arguing that prompt engineering is giving way to more complex tool and loop design.

Critiques and reflections on AI coding

Nolan Lawson on using AI to write better code more slowly
Shared as a more optimistic framing for using AI in a way that improves understanding rather than bypassing it.
MIT Media Lab publication on ChatGPT and cognition
Cited in an argument that heavy LLM dependence may contribute to cognitive atrophy.
Neuromatch posts criticizing loop-produced code
Referenced as an example of poor code quality from highly loop-driven workflows.
Neuromatch follow-up post
A second example used to support criticism of unmaintainable AI-generated code.

Related Hacker News and blog discussions

Earlier Hacker News discussion on the Ralph Wiggum loop
Mentioned as prior loop discourse that one commenter was glad to have skipped.
Audience of One Numbers
Linked in support of building software for yourself and using AI to make personally meaningful projects possible.
