Ronacher’s post is a field report from someone who builds agent tooling and still does not trust the fully automated future it points toward. He separates normal “agent loop” usage, where a developer steers a model interactively, from a more hands-off “harness loop,” where software keeps prompting models, running tools, checking results, and deciding what to do next. His core claim is that this style is coming whether people like it or not, especially in places like bulk porting, security work, and organizations that care more about throughput than code comprehension. But he also says present-day models still produce code with a recognizable failure mode: too much defensive handling, weak invariants, duplicated logic, local fixes, and abstractions that make the system harder to reason about.
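To make the distinction concrete, here is a minimal sketch of what a harness loop might look like, assuming a model call can be reduced to a function that takes a task and the last round of feedback. The names `harness_loop`, `stub_model`, and `run_tests` are hypothetical and do not come from the post; the stub stands in for a real model so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CheckResult:
    passed: bool
    feedback: str


def harness_loop(
    task: str,
    propose: Callable[[str, str], str],
    check: Callable[[str], CheckResult],
    max_rounds: int = 5,
) -> Tuple[Optional[str], int]:
    """Prompt, check, and re-prompt until the check passes or rounds run out."""
    feedback = ""
    for round_number in range(1, max_rounds + 1):
        candidate = propose(task, feedback)   # "prompt the model"
        result = check(candidate)             # "run tools, check results"
        if result.passed:
            return candidate, round_number
        feedback = result.feedback            # "decide what to do next"
    return None, max_rounds


def stub_model(task: str, feedback: str) -> str:
    # Hypothetical stand-in for a model call: it only gets the code right
    # after the harness tells it what failed.
    if "negative" in feedback:
        return "def absolute(x):\n    return -x if x < 0 else x\n"
    return "def absolute(x):\n    return x\n"


def run_tests(source: str) -> CheckResult:
    # Assumed success criterion: the candidate must define absolute()
    # and handle a negative input correctly.
    namespace: dict = {}
    exec(source, namespace)
    if namespace["absolute"](-3) != 3:
        return CheckResult(False, "fails on negative input")
    return CheckResult(True, "")


candidate, rounds = harness_loop("write absolute(x)", stub_model, run_tests)
```

Even in this toy form, the loop stops as soon as `check` is satisfied, so the quality of the result is bounded by whatever the check encodes.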
That diagnosis held up for a lot of people. The strongest pattern was that teams using these tools are no longer bottlenecked on raw implementation. They are bottlenecked on knowing what they want. Several experienced users said the outer loop only works once they have a sharp spec, clear success criteria, and tests that tell the model when it is done. In practice that shifts software work upward. More time goes into architecture, task decomposition, PR review, and refreshing specs after implementation details shake out. A few commenters welcomed that as a return to actual engineering discipline. Others pointed out that this is just Brooks’s “no silver bullet” in new clothes. The hard part never was typing CRUD.
The comments were much more skeptical about code quality under repeated looping than about the existence of the pattern itself. People described the same pathology from multiple angles: loops optimize for passing the next check, so they accrete guards, fallbacks, special cases, and line count until the code survives tests but loses shape. Python examples like pointless `hasattr` and null checks came up repeatedly. Some said this is training-data baggage from mediocre codebases. Others argued it reflects a deeper limitation. Models do not hold the whole web of assumptions in their heads, so they patch symptoms instead of designing illegal states out of existence.
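The contrast commenters described can be illustrated with a small, made-up example. The first function shows the guarded style; the second moves the assumptions into the types so the guards have nothing left to check. `Order`, `LineItem`, and both functions are hypothetical and are not taken from the thread.

```python
from dataclasses import dataclass
from typing import Tuple


def total_defensive(order) -> int:
    """Guarded style: every access is wrapped, and bad input silently becomes 0."""
    if order is None:
        return 0
    if not hasattr(order, "items") or order.items is None:
        return 0
    total = 0
    for item in order.items:
        if item is None or not hasattr(item, "price_cents"):
            continue
        total += item.price_cents or 0
    return total


@dataclass(frozen=True)
class LineItem:
    price_cents: int

    def __post_init__(self) -> None:
        # The invariant is enforced once, at construction time.
        if self.price_cents < 0:
            raise ValueError("price_cents must be non-negative")


@dataclass(frozen=True)
class Order:
    items: Tuple[LineItem, ...]


def total_cents(order: Order) -> int:
    """Invariant style: an Order always has items, and every item has a valid price."""
    return sum(item.price_cents for item in order.items)
```

Both versions can pass the same happy-path test. The difference is that the guarded version hides malformed input behind a zero, while the second rejects it where the data is created.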
Where people did report success, it was usually in tightly constrained environments. Long-running ports, repetitive SaaS patterns, scaffolded architectures, strict lint rules, behavior-driven tests, red-green workflows, and checklists gave agents enough rails to keep moving. The notable detail is that these users are not trusting model taste. They are replacing taste with process. They create narrow patterns, ratchets, evaluators, and review hooks that force acceptable output. That makes looping less like autonomous programming and more like building a factory around a mediocre but fast implementer.
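As one illustration of a ratchet, the sketch below rejects a change that increases a tracked count and lowers the allowed baseline when the count drops. The metric here, a crude text count of `hasattr` and `None` checks, is an assumption chosen for brevity; commenters did not specify their tooling, and a real setup would more likely count linter violations. `count_guards` and `ratchet` are hypothetical names.

```python
import re
from typing import Tuple

# Crude, text-based stand-in for a real lint rule (assumption for illustration).
GUARD_PATTERN = re.compile(r"\bhasattr\(|\bis None\b|\bis not None\b")


def count_guards(source: str) -> int:
    """Count defensive-looking checks in a piece of source text."""
    return len(GUARD_PATTERN.findall(source))


def ratchet(baseline: int, source: str) -> Tuple[bool, int]:
    """Return (accepted, new_baseline).

    A change is rejected if it raises the count above the baseline.
    If the count goes down, the baseline tightens to the new value,
    so the metric can only move in one direction over time.
    """
    current = count_guards(source)
    if current > baseline:
        return False, baseline
    return True, current


before = "if x is None:\n    return 0\n"
after = "if x is None or not hasattr(x, 'items'):\n    return 0\n"

rejected = ratchet(count_guards(before), after)
accepted = ratchet(count_guards(after), before)
```

The point of the pattern is that the loop never has to be trusted to prefer the cleaner version; the gate simply refuses the other one.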
A second thread running through the comments was organizational, not technical. Many people are less worried about the model than about what executives and product teams will do with it. Reviewers described being flooded with giant PRs from developers or non-developers who barely understand the code they are submitting. Several said the exhausting part is not using AI, but acting as quality control for AI-first organizations that reward visible throughput and treat scrutiny as resistance. That produced the dominant mood of the conversation: not anti-AI exactly, but tired, wary, and hostile to the hype cycle around inevitability. Even many regular users of Claude Code or Codex said they find the tools valuable while resenting the management story attached to them.
The practical consensus was blunt. Loops are real. They can be useful. They are not magic. If your specs, invariants, tests, and review culture are weak, loops will amplify the weakness. If those things are strong, loops can chew through implementation work and some refactoring at impressive speed. The human role does not disappear. It moves toward problem framing, constraint setting, and saying no when the machine’s version of “done” is not good enough.