Cleaning up after AI rockstar developers

AI
Programming
Developer Tools
Management

The post compares AI coding agents to the old “rockstar developer” archetype: someone who ships fast, introduces complexity nobody else understands, and leaves the team to clean up the mess. The author’s point is not that AI is useless. It is that code generation collapses the cost of creating software while doing nothing to reduce the cost of understanding, operating, and changing it later. That resonated hard. People kept returning to the same pattern: AI is great at getting a prototype, a UI draft, or glue code on the screen. The trouble starts when that draft quietly becomes production.

Where the comments landed is sharper than the post. Most of the pain is organizational, not mystical. Businesses often do not reward maintainability, and managers routinely prefer “working now” over “easy to change later.” AI just turns that bias into a firehose. Several people said this is not a brand new failure mode. Outsourced code, resume-driven developers, and hurried internal teams have produced the same kind of debt for years. The difference is speed and uniformity. What used to take months of bad decisions can now be produced in days, often by people who do not know what they are looking at and will accept “tests unrelated to my changes are failing” as an answer.

A useful distinction emerged between prototyping and engineering. Many commenters are happy to let AI generate throwaway interfaces, internal tools, one-off automations, or tightly scoped code that is easy to verify. They are not happy to trust it with business logic, security-sensitive paths, architecture, or code that will need to evolve under pressure. In practice, the winning pattern looked boring: small scoped changes, explicit constraints, repeated review passes, tests, linters, and humans who actually understand the system. The losing pattern looked familiar too: non-technical stakeholders mistaking a convincing demo for a production-ready system, then handing the resulting pile to engineers to “just clean up.”

There was also a deeper framing shift. A lot of people rejected the article’s “craftsmanship” language because it sounds boutique and optional. They preferred durability, engineering standards, and operating discipline. Software is not a handmade shoe. It is closer to designing a factory or a bridge, where the real value is in systems that can be copied cheaply but still have to be reliable, inspectable, and maintainable. That matters because the strongest pro-AI commenters were not saying quality no longer matters. They were saying quality can be recovered if the user has taste, context, and a review loop. Skeptics answered that this only restates the core issue: the scarce resource is still judgment.

The overall signal is straightforward. AI coding is already creating real leverage for prototyping, migrations, and cleanup work. It is also creating a fast-growing inventory of software whose authors cannot explain how it works. That leaves experienced engineers with more leverage too, but in a less glamorous role. They are becoming the people who set guardrails, separate disposable code from durable systems, and decide when the cheapest path today becomes the most expensive system in the building.

Treat AI codegen as a prototyping accelerator, not a substitute for engineering discipline. If your org is letting non-technical staff or rushed teams ship agent-generated code without strong review, tests, and architectural guardrails, expect a cleanup market and rising operational risk.

June 9, 2026
codingwithjesse.com
Discuss on HN

Key insights

Prototype, architect, and gardener are different jobs

Commenters split software work into three modes that companies keep collapsing into one. Fast prototypers explore the idea, architects turn it into a system that can survive real usage, and gardeners keep it healthy for years after the exciting launch is over. That framing explains why AI looks magical at first and destructive later. It is strong at the first mode and weak at the third, and businesses often skip paying for the middle step altogether.

Staff explicitly for those phases instead of assuming one team or one tool can cover all three. If AI is accelerating prototypes in your org, create a handoff point where ownership changes from exploration to production engineering.

Attribution:

saalweachter #1 #2
olvy0 #1

Resume-driven complexity is the older version of this problem

A cautionary embedded Linux story shared in the comments makes the article feel less like an AI story and more like a scaling story about unmanaged velocity. One highly productive developer kept shipping visible wins while quietly creating invisible failure modes: copied codebases, no single source of truth, broken releases, no tests, and no documentation. AI changes the speed, not the shape, of that failure. The common cause is leadership rewarding dazzling output while ignoring operability.

Audit your incentives, not just your tooling. If promotions and praise go to feature count while testability, rollout safety, and maintainability stay invisible, AI will amplify the same dysfunction.

Attribution:

dimaaan #1

Vibe-coded apps can be valuable as requirement discovery

Several people made a pragmatic point that half-broken generated software can still be useful because it captures intent that stakeholders struggle to describe in words. A rough app often answers product and UX questions faster than a requirements meeting can. That means the code may be trash while the artifact is still valuable. The expensive part it saves is not implementation. It is specification discovery.

Ask whether a messy AI-built app is acting as software or as a living mockup. If it is the latter, price the rewrite separately and protect the team from pressure to bless the prototype as the finished system.

Attribution:

samrus #1
jerhewet #1
anonzzzies #1

Unsloppifying still depends on human taste

People who are getting good results from AI are not one-shotting features and trusting the answer. They force multiple review passes, ask for dedicated checks on duplication and architecture, and keep iterating until only known tradeoffs remain. Even then, others pointed out the crucial limit: this only works if the human can recognize bad abstractions and wrong tradeoffs. The cleanup prompts are not a replacement for understanding. They are a force multiplier for it.

Judge AI adoption by the reviewer quality you have, not by the demo quality you see. Teams without strong technical taste will not get the same output from the same tools.
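The review loop described above can be sketched in a few lines. This is a minimal illustration, not anyone's actual harness: `review_until_clean`, the `Check` and `Revise` callables, and the `accepted` set are all hypothetical names. The checks stand in for dedicated passes on duplication or architecture, `revise` stands in for whatever agent call or manual edit produces the next draft, and `accepted` is where the human records tradeoffs they have knowingly decided to live with.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical types: a check inspects code text and returns findings;
# a revise step takes code plus findings and returns a new draft.
Check = Callable[[str], List[str]]
Revise = Callable[[str, List[str]], str]


def open_findings(code: str, checks: List[Check], accepted: Set[str]) -> List[str]:
    """Run every check and drop findings the reviewer has accepted as known tradeoffs."""
    return [f for check in checks for f in check(code) if f not in accepted]


def review_until_clean(
    code: str,
    checks: List[Check],
    revise: Revise,
    accepted: Set[str],
    max_passes: int = 5,
) -> Tuple[str, List[str]]:
    """Repeat check-and-revise passes until only accepted findings remain.

    Returns the final draft and any findings still open after max_passes,
    so an unresolved problem is reported rather than silently dropped.
    """
    for _ in range(max_passes):
        findings = open_findings(code, checks, accepted)
        if not findings:
            break
        code = revise(code, findings)
    return code, open_findings(code, checks, accepted)
```

The loop never decides what counts as a finding or an acceptable tradeoff. Both come from the person supplying `checks` and `accepted`, which is the limit the skeptics pointed at.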

Attribution:

nicman23 #1
kaydub #1
hamdingers #1

Tech debt persists because changes carry organizational risk

One useful correction cut through the usual “just have agents refactor it” optimism. Debt is rarely stuck because nobody can think of a cleanup. It is stuck because changing old code creates review cost, QA cost, rollout risk, blame risk, and stakeholder coordination work. AI lowers typing cost. It does not remove the political and operational cost of touching production systems.

Do not model AI-driven cleanup as cheap just because code generation is cheap. Budget for testing, rollout, ownership, and incident risk or the debt will stay right where it is.

Attribution:

SlinkyOnStairs #1
james_marks #1

AI slop has a distinct smell from outsourced slop

A sharp distinction emerged between two common failure modes. Outsourced ticket-by-ticket code tends to be copy-pasted and narrowly optimized for the current ask. AI-heavy code tends to overbuild. It invents modules, abstractions, and solutions to problems the platform already solved. That difference matters because the fix is different too. One needs consolidation. The other needs aggressive deletion and simplification.

Train reviewers to identify the failure mode before proposing a remedy. If the code is overbuilt by AI, start by removing structure rather than adding more.

Attribution:

skydhash #1

The missing artifact is the prompt history

One commenter pointed out that AI-created systems often lose the one thing that would most help future maintainers: the prompts that captured original intent and preferred modification paths. Engineers are left reverse-engineering not just the code, but the conversation that produced it. That makes AI code harder to maintain than it needs to be.

If your team is using coding agents, store prompts, plans, and review notes alongside the code. You need provenance for generated systems the same way you need commit history and design docs for hand-written ones.
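One way to keep that provenance is a small record saved in the repository next to the change it describes. The sketch below is only an assumption about what such a record could look like: the `ProvenanceRecord` fields, the `provenance/` directory, and the one-JSON-file-per-change layout are illustrative choices, not a convention from the discussion or from any particular tool.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class ProvenanceRecord:
    """Hypothetical record of how an agent-generated change came to exist."""
    change_id: str                      # e.g. a commit hash or ticket key
    prompts: List[str]                  # prompts given to the agent, in order
    plan: str = ""                      # the plan agreed before generation
    review_notes: List[str] = field(default_factory=list)


def save_record(record: ProvenanceRecord, repo_root: Path) -> Path:
    """Write the record as JSON under <repo_root>/provenance/ (an assumed layout)."""
    out_dir = repo_root / "provenance"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{record.change_id}.json"
    out_path.write_text(json.dumps(asdict(record), indent=2), encoding="utf-8")
    return out_path


def load_record(path: Path) -> ProvenanceRecord:
    """Read a record back so a future maintainer can see the original intent."""
    return ProvenanceRecord(**json.loads(path.read_text(encoding="utf-8")))
```

Because the records are plain files in the repository, they get versioned and reviewed with the code they explain.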

Attribution:

sltr #1
pjc50 #1

Against the grain

The cleanup loop will likely be automated too

A more bullish view argued that today’s manual prompt-review-refactor cycle is temporary. The same self-critique steps good users already run by hand can be built into harnesses and eventually hidden behind one higher-quality pass. In that framing, current slop is an immature tooling problem, not proof that AI cannot absorb architectural taste over time.

Do not lock your process around current model weaknesses. Keep the human checkpoints, but expect the baseline quality of generated code and auto-refactoring to improve quickly.

Attribution:

rspeele #1
kaydub #1

Throwaway code is often the right economic choice

Some commenters pushed back on the durability-first framing and said plenty of software really is disposable. Internal tools, small automations, and browser-based helpers may never justify traditional engineering effort. If AI makes those tools cheap enough to exist at all, that is a net gain even when the code would be unacceptable in a core product.

Segment your software portfolio by lifespan and blast radius. Apply strict standards to systems of record and customer-critical flows, then be more permissive where verification is easy and replacement is cheap.
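A segmentation like this can be written down as an explicit rule so it is applied consistently instead of argued case by case. The sketch below is illustrative only: the `Tier` names, the three-level `blast_radius` scale, and the three-month lifespan threshold are assumptions made for the example, and a real policy would need its own categories and cutoffs.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    DISPOSABLE = "disposable"   # permissive: cheap to verify and replace
    STANDARD = "standard"       # normal review and tests
    CRITICAL = "critical"       # strict review, tests, and rollout planning


@dataclass(frozen=True)
class System:
    name: str
    expected_lifespan_months: int  # how long the system is expected to live
    blast_radius: int              # assumed scale: 1 = one user, 2 = one team, 3 = customers or records
    easy_to_verify: bool           # can a person check its output at a glance?


def classify(system: System) -> Tier:
    """Map a system to a review tier using illustrative thresholds."""
    # Systems of record and customer-critical flows always get strict standards.
    if system.blast_radius >= 3:
        return Tier.CRITICAL
    # Short-lived, single-user, easily checked tools can be treated as throwaway.
    if (
        system.expected_lifespan_months <= 3
        and system.blast_radius == 1
        and system.easy_to_verify
    ):
        return Tier.DISPOSABLE
    return Tier.STANDARD
```

Under these assumed thresholds, a one-month personal CSV helper lands in `DISPOSABLE`, a team dashboard expected to live a year lands in `STANDARD`, and anything touching customers or records lands in `CRITICAL` regardless of how short-lived it is meant to be.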

Attribution:

Hasz #1
magicalhippo #1

This is mostly the same old technical debt story

A credible minority argued that the article overstates what is novel here. Organizations have always created messes through rewrites, cowboy coders, bad incentives, and management indifference to maintenance. AI increases the rate of debt creation, but the root problem is still weak organizational discipline. Focusing too much on the tool can become a way to avoid confronting leadership failures that predate it.

Treat AI as the accelerant in your diagnosis, not the sole cause. If your org could not control architecture before, neither buying nor banning coding agents will fix the underlying governance problem.

Attribution:

349187 #1
hilariously #1
AndrewKemendo #1

Reference links

Related essays on maintainability and AI code quality

The Terminal Star
Linked as an earlier writeup about the classic high-output developer who leaves a maintenance mess behind.
About that gig fixing vibe code slop
Referenced to support the point that AI pushes teams from design straight to runtime, where fixes are most expensive.

Tools mentioned for repo hygiene

alint GitHub repository
Shared as a repo-wide linting tool meant to keep project structure clean beyond language-specific lint rules.
alint agent-friendly linter page
Linked to show how the tool is being adapted for agentic coding workflows.

Language and terminology references

Jugaad on Wikipedia
Used to name the improvisational, kludgy style some commenters saw being imported into software development.
Belt and braces expression history
Linked during a side discussion about whether an expression sounding like an LLM tic was actually an old idiom.

Products and examples cited in analogies

IKEA LACK side table
Used as a concrete example in the craftsmanship versus mass optimization analogy.
