Automating myself out of development

AI
Programming
Developer Tools
Open Source

The post lays out a staged workflow for AI-assisted software development that moves from simple autocomplete and chat help toward a more agentic setup. Specs get written up front, tasks are split across worktrees or sandboxes, agents implement in parallel, and the human shifts from writing code to planning, steering, and approving pull requests. The pitch is not that AI writes perfect software on its own. It is that enough process can let one developer supervise far more code generation than they could produce directly.

Treat coding agents as accelerators for narrow, testable work, not as autonomous engineers. If you are pushing this inside a team, expect review capacity and architectural oversight to become the real bottlenecks long before raw code generation does.

June 13, 2026
thoughtfultechnologist.com

Key insights

Mob programming beats autonomous agents

Treating models like supervised participants with explicit roles produced the most credible workflow. The useful pattern was not "give it the repo and walk away". It was active steering, frequent course correction, and using the model as a fast implementer inside a human-owned plan.

Design your process around checkpoints, not autonomy. Split work so a human can inspect each change while the model still has enough context to be useful.

Attribution:

germanptr #1
zem #1

Micro-level speedups can hide macro-level mess

A concrete .docx example showed the trade-off clearly. Claude quickly handled ugly XML plumbing and visual alignment, and even found and fixed an accidentally quadratic loop, but it still missed the cleaner architectural move of reusing an existing template-and-patching approach. That last step needed a human who understood the whole system and could recognize duplication.

Use LLMs to blast through tedious implementation, then budget time for architectural cleanup immediately after. Do not let a stack of generated commits accumulate before you inspect whether the code fits the system you already have.

Attribution:

zem #1
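
For readers unfamiliar with the term, an "accidentally quadratic" loop is one that looks linear but hides a second scan inside each iteration. The discussion does not show the actual loop from the .docx work, so the sketch below is a generic illustration only; the function names are invented for this example.

```python
def dedupe_quadratic(items):
    """Keep first occurrences. O(n^2): `in` scans a list on every iteration."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan of a growing list
            seen.append(item)
    return seen


def dedupe_linear(items):
    """Same result, O(n) on average: membership is checked against a set."""
    seen = set()
    ordered = []
    for item in items:
        if item not in seen:  # average constant-time hash lookup
            seen.add(item)
            ordered.append(item)
    return ordered
```

Both functions return the same list, which is why this kind of fix is easy to verify locally. Whether the surrounding code should exist at all is the architectural question that still needs a human.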

Mechanical migrations are the sweet spot

A Relay migration worked because the task already had a hand-written reference, strong validation, and mostly one-to-one transformations. That is the pattern where agentic coding feels strong. The model is not discovering the solution. It is applying a known one at scale.

Look first for migrations, rote rewrites, and patterned feature work. If you cannot provide examples and a verification harness, you are outside the highest-yield use case.

Attribution:

girvo #1
zeroonetwothree #1
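
The "examples plus verification harness" pattern above can be sketched as follows. This is a minimal illustration under stated assumptions, not the migration from the thread: `old_fetch` and `new_fetch` are hypothetical names, and a real migration would more likely operate on a syntax tree than on a regular expression.

```python
import re

# Hypothetical one-to-one rewrite: old_fetch(...) becomes new_fetch(...).
_OLD_CALL = re.compile(r"\bold_fetch\(")


def migrate(source: str) -> str:
    """Apply the mechanical rewrite to a source string."""
    return _OLD_CALL.sub("new_fetch(", source)


# Hand-written reference pairs (input, expected output) act as the harness.
REFERENCE_CASES = [
    ("data = old_fetch(url)", "data = new_fetch(url)"),
    ("x = old_fetch(a) + old_fetch(b)", "x = new_fetch(a) + new_fetch(b)"),
    ("cold_fetch(url)", "cold_fetch(url)"),  # lookalike names stay untouched
]


def verify(transform, cases):
    """Return every case where the transform disagrees with the reference."""
    failures = []
    for given, expected in cases:
        actual = transform(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures
```

An empty result from `verify(migrate, REFERENCE_CASES)` means the transform matches every hand-written example. The same harness can check whatever an agent produces, so the human reviews failures instead of every rewritten line.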

Code review becomes the limiting factor

The bottleneck shifted from producing code to safely accepting it. Reviewing multi-thousand-line AI diffs drained people, even when they had planned the work themselves. The old lesson still applies: smaller changes are easier to reason about, and pair-style supervision scales better than giant autonomous bursts.

Optimize for reviewability, not maximum token output. If your agent workflow routinely produces large diffs, break the pipeline before scaling it further.

Attribution:

gnunicorn #1
philbo #1
nisabek #1
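
One way to make "break the pipeline" concrete is a size gate on each diff before it reaches a reviewer. The sketch below is an assumption-laden illustration: the 400-line limit is an arbitrary placeholder, and the function names are invented here rather than taken from the post or the thread.

```python
MAX_REVIEWABLE_LINES = 400  # assumed threshold; tune to your team's review capacity


def changed_lines(unified_diff: str) -> int:
    """Count added and removed lines in a unified diff.

    Simplification: lines starting with '+++' or '---' are treated as file
    headers, so a removed line whose own content begins with '--' is missed.
    """
    count = 0
    for line in unified_diff.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header, not a content change
        if line.startswith(("+", "-")):
            count += 1
    return count


def is_reviewable(unified_diff: str, limit: int = MAX_REVIEWABLE_LINES) -> bool:
    """True when the diff is small enough to send to a human reviewer."""
    return changed_lines(unified_diff) <= limit
```

A pipeline could call `is_reviewable` on each agent branch and ask the agent to split the work whenever it returns `False`, rather than forwarding a multi-thousand-line diff.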

High output only works with heavy guardrails

The strongest pro-AI anecdote came with a lot of invisible process. Shipping 150k+ lines and 10k pull requests depended on isolated environments, protected releases, semi-manual review, and hundreds of Playwright tests. The speed claim was real, but so was the operational discipline underneath it.

Do not copy the headline productivity numbers without copying the safety system. If you want agent throughput, invest first in test coverage, environment isolation, and release gates.

Attribution:

motoroco #1

Non-programmers can now build, but maintenance is unresolved

Several examples showed people without formal software backgrounds building ambitious products with Claude. That lowers the barrier to getting something working. It does not answer what happens when requirements shift, bugs emerge in interaction effects, or the original builder cannot judge whether a change quietly broke the system.

If AI is expanding who can create software in your company, pair that with explicit ownership for maintenance. Prototype access has widened faster than long-term operability has improved.

Attribution:

properbrew #1
andai #1
dmortin #1

Against the grain

Some users are comfortable loosening control

A minority said the post was still too cautious. They run agents directly on a full-time dev machine, let models manage ticket state, classify parallel work, and spawn subagents with only light human oversight. For side projects and familiar domains, they claimed this already works well enough that strict approval of every step feels like overkill.

If you experiment with higher autonomy, start on low-stakes projects and measure rollback cost. The key question is not whether it can run unattended for a while, but how expensive recovery becomes when it drifts.

Attribution:

2001zhaozhao #1
yieldcrv #1 #2

IP fears may not block hosted models

The claim that serious teams will avoid OpenAI or Anthropic and move to local models drew pushback as a repeat of early anti-cloud arguments. Plenty of companies will still accept the tradeoff if the productivity gain is large enough, which means hosted providers can keep winning even if some sensitive workloads stay local.

Do not build strategy on the assumption that privacy concerns alone will force the market on-prem. Plan for a split world where regulated or proprietary work stays local while mainstream development keeps using hosted services.

Attribution:

duggan #1

Verification can still be easier than creation

The Sudoku analogy captured a narrower but valid counterpoint to the review pessimism. When requirements are crisp and constraints are testable, checking an answer can be much faster than deriving it from scratch. That is why LLMs sometimes feel genuinely efficient on bounded coding tasks.

Be explicit about whether your task is exploratory or constrained before deciding how to use AI. Fast review is realistic when correctness can be checked with tight tests and obvious acceptance criteria.

Attribution:

minihat #1
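
The Sudoku analogy can be made concrete. Checking a completed grid takes a few lines and a fixed number of set comparisons, while solving one generally requires search. The function below is a minimal sketch written for this summary, not code from the thread.

```python
def is_valid_solution(grid):
    """Check a completed 9x9 Sudoku: each row, column, and 3x3 box holds 1..9 once."""
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    # Nine cells whose set equals {1..9} must contain each digit exactly once.
    return all(group == digits for group in rows + cols + boxes)
```

The checker never needs to know how the grid was produced. The coding equivalent is a tight test suite with clear acceptance criteria: when that exists, reviewing an agent's answer can be far cheaper than writing it.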

In plain English

CAN

Controller Area Network, a communication bus used by embedded systems such as vehicles and industrial devices.

end-to-end tests

Automated tests that exercise a full application flow from the user interface through the backend.

Playwright

A browser automation tool used to script and test web applications.

Relay

An ATProto infrastructure service that collects event streams from many PDS servers and rebroadcasts them so apps do not each need to connect to every host directly.

Reference links

Workflow and process writeups

Mob programming for one
Longer explanation of a supervised multi-agent coding workflow where the human stays in the loop.
Agentic coding and mental models
Argument for applying small-review and pair-programming lessons to coding agents instead of maximizing autonomous output.

Architecture references

Why taming architectural complexity is paramount
Reference offered for measuring architectural complexity, which commenters said is still poorly captured by tooling.
KRAZAM talk on microservices
Linked as a critique of treating microservices as a silver bullet for architecture problems.

Examples and demos

Buildermark
Example product claimed to be 94 percent written by coding agents.
Buildermark prompt and commit log
Public log of prompts and commits for the Buildermark example.
Whistle Enterprise
Example application a commenter said they built from scratch using LLMs.
PlotAlong
Project cited as evidence of large output with agentic workflows plus strong testing and release gates.