HN Debrief

LLMs are eroding my software engineering career and I don't know what to do

  • AI
  • Programming
  • Developer Tools
  • Careers
  • Economics

The post is a first-person account from a software engineer in fintech who says LLMs have eaten through the parts of the job they spent a decade building up: domain expertise, debugging instinct, and much of the craft of implementation. Their argument is not that models are fully autonomous or flawless. It is that a senior engineer with an agent can now get close enough to specialist output that accumulated domain knowledge stops feeling like a moat, leaving “taste” and review as the last meaningful differentiators. The piece lands as a career anxiety post, but it is also a claim about hiring and org design: teams no longer need the same mix of specialists if agents can flatten the gap.

Use LLMs where they clearly compress toil, but do not confuse faster draft generation with reduced need for judgment, accountability, or deep domain context. If you run teams, optimize for workflows that keep human expertise sharp and make deterministic tooling, tests, and review more central rather than less.

Discussion mood

Uneasy and defensive. Most commenters accept that LLMs are already useful and changing workflows, but they strongly reject the idea that domain expertise, architecture judgment, and accountability have become irrelevant. The anxiety comes less from model capability alone and more from management hype, hiring pressure, and the likelihood that organizations will cut headcount before they understand the risks.

Key insights

  1. Compliance is judgment, not checklist parsing

    In regulated work, the hard part is not reading a rule and translating it into code. It is negotiating between law, guidance, accepted industry practice, internal risk tolerance, and what a regulator will actually accept. That makes LLM failures here more structural than incidental, because they tend to act like an overeager junior reading the text literally and missing the institutional context that humans resolve through meetings, precedent, and documented signoff.

    Do not treat compliance review as a pure AI classification problem. Build workflows where engineering, legal, and compliance jointly define source-of-truth interpretations and record signoff, because that documentation is what protects you when the model is wrong.

      Attribution:
    • t34t34r43 #1
    • PeterStuer #1 #2
    • scott_w #1
    • bobkb #1
  2. Error shape matters more than error rate

    Several commenters pushed back on the easy claim that LLMs are fine if they make fewer mistakes than humans. Human review can catch false positives from an AI auditor, but it cannot reveal the true problems the model never flagged unless someone does the full work anyway. More importantly, model errors are distributed differently from human errors. Teams already have instincts and process for spotting human blind spots, but plausible nonsense from an LLM is harder to reason about and easier to miss.

    Evaluate AI assistance by failure mode, not average benchmark wins. In any high-stakes workflow, assume missed defects are the real risk and design audits, tests, and fallbacks around what the model is likely to omit.

      Attribution:
    • csallen #1
    • skillina #1
    • porridgeraisin #1
    • genxy #1
    • Terr_ #1
  3. The durable pattern is agents plus deterministic tools

    A strong operations-oriented view emerged from people who tried the full agentic path and backed off. Letting an LLM directly do everything is slow, expensive, and hard to trust during incidents. The better pattern is to use agents to generate or glue together deterministic scripts, tests, and deployment tools that humans can also run. That gives you repeatability, binary outputs, and a clean fallback when production is on fire.

    Spend your AI effort on building reusable internal tooling, not just generating one-off code. The teams that win will have agents sitting on top of reliable scripts and checks, not agents improvising core operational behavior from scratch.

      Attribution:
    • alexpotato #1
    • theshrike79 #1
    • enraged_camel #1
  4. Specs and tests do not eliminate ambiguity

    The idea that you can solve agent reliability by writing exhaustive specs and walling off tests got a sharp rebuttal. Natural language specs are incomplete by default, especially in business and regulatory domains where important constraints are implicit and contested. LLMs fill the gaps with guesses. Tightening the constraints does not make the model smarter. It often just encourages brittle or underhanded behavior that passes the visible checks while missing the intent.

    Treat spec-first agent workflows as a useful discipline, not a correctness proof. If the intent behind the system cannot be made explicit and testable, assume a human still needs to own the ambiguous parts.

      Attribution:
    • franze #1
    • torben-friis #1
    • hedora #1
    • officialchicken #1
  5. Implicit business context remains the real moat

    One of the better framings split knowledge into what is explicit in code and docs versus what is implicit in business practice, partner relationships, and regulation. LLMs do well on the explicit layer. They are much worse at the implicit layer where the answer to "why does this work this way" lives in old decisions, exceptions, and constraints that are obvious only to people embedded in the business. Attempts to put non-technical domain experts directly in front of the code with an agent reportedly failed badly.

    If you want to stay valuable, get closer to the hidden constraints that shape software rather than the visible code alone. The harder your knowledge is to recover from repo context and docs, the safer your role is for now.

      Attribution:
    • throwaway201606 #1
    • torben-friis #1
    • mikeocool #1
  6. AI often shifts effort from coding to review debt

    Several engineers described the same pattern: AI makes initial implementation cheap, then hands humans a bigger, noisier review and maintenance problem. PRs arrive faster but contain extra layers, duplicate logic, architecture drift, or code that works while making the system harder to change later. That means velocity can look better on dashboards while actual system understanding gets worse.

    Track review time, rework, and codebase complexity alongside output. If your team is producing more code but understanding less of it, you are borrowing from future delivery rather than accelerating it.

      Attribution:
    • abhgh #1
    • cmiles74 #1 #2
    • Reason077 #1
  7. Expertise shows up in steering, not typing

    A repeated point was that experienced engineers underestimate how much hidden skill they are using when they work well with agents. They know what to ask, what to challenge, where the model is likely to go wrong, and how to recognize an answer that is technically plausible but operationally stupid. That is why senior users report huge gains while non-engineers and weaker engineers often produce garbage with the same tools.

    Do not frame your value as manual code production alone. Senior leverage now comes from problem framing, model steering, and ruthless evaluation of outputs, which means your hiring and training loops should test those skills explicitly.

      Attribution:
    • PeterStuer #1
    • jrockway #1
    • iandanforth #1
    • enraged_camel #1
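
The asymmetry described in insight 2 (error shape) can be made concrete with a small sketch. It is illustrative only: the transaction IDs, the `AuditOutcome` type, and the `review_flagged_only` helper are hypothetical and do not come from the post or the thread. It assumes an AI auditor flags a subset of items and a human reviews only what was flagged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditOutcome:
    confirmed: frozenset     # flagged items the reviewer confirms as real defects
    false_alarms: frozenset  # flagged items the reviewer dismisses
    missed: frozenset        # real defects that were never flagged


def review_flagged_only(true_defects, flagged):
    """Split results the way a reviewer who only sees flagged items would.

    `true_defects` is ground truth, which in practice is only known if
    someone performs the full audit on at least a sample.
    """
    true_defects = frozenset(true_defects)
    flagged = frozenset(flagged)
    return AuditOutcome(
        confirmed=flagged & true_defects,
        false_alarms=flagged - true_defects,
        missed=true_defects - flagged,
    )


# Hypothetical data: four real defects, three items flagged by the model.
true_defects = {"txn-03", "txn-07", "txn-11", "txn-19"}
ai_flagged = {"txn-03", "txn-05", "txn-07"}

outcome = review_flagged_only(true_defects, ai_flagged)

# Precision is computable from the review queue alone.
precision = len(outcome.confirmed) / len(ai_flagged)
# Recall needs the ground truth that the flagged-only review never produces.
recall = len(outcome.confirmed) / len(true_defects)

print("visible to reviewer:", sorted(outcome.confirmed), sorted(outcome.false_alarms))
print("invisible without a full audit:", sorted(outcome.missed))
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The reviewer sees `confirmed` and `false_alarms`, so precision can be measured from the review queue. `missed` and recall are only computable here because the toy example already knows `true_defects`, which is the commenters' point: someone still has to do the full work, at least on a sample, to learn what the model omitted.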

Against the grain

  1. Hallucinations depend heavily on harness and context

    A minority of heavy users said the constant-hallucination narrative no longer matches their day-to-day experience. With strong harnesses, constrained inputs, and domains where the source of truth can be provided directly, they claim frontier models can be highly reliable and sometimes outperform existing tools or compilers on narrow tasks. That does not refute the broader caution, but it does suggest many arguments are really about bad operating conditions rather than model limits alone.

    Before writing off a model, separate failures of the model from failures of setup. If your use case has a clean spec, narrow domain, and auditable source of truth, investing in a better harness may matter more than switching tools.

      Attribution:
    • rfgplk #1 #2
    • solenoid0937 #1
  2. Some engineers prefer the higher abstraction layer

    Not everyone sees AI-assisted development as the death of craft. Some said ordinary software jobs had already become assembly-line work long before LLMs, and that agents remove the least satisfying parts like boilerplate, syntax recall, and rote plumbing. For them, the shift feels like moving up the stack toward intent, design, and language rather than losing something sacred.

    If your team is split on AI, do not assume opposition is purely about quality or support. Different engineers value different parts of the work, so role design and retention will increasingly hinge on whether people get to keep the parts they actually enjoy.

      Attribution:
    • hax0ron3 #1
    • onlyrealcuzzo #1
    • awill88 #1
  3. This may be another painful but familiar tooling shift

    Some veteran commenters argued that software has always forced people to watch valuable skills age out: COBOL, client-server, web, mobile, cloud, test dogmas, framework fads. The field has repeatedly rewarded people who adapt to new abstractions rather than attaching identity to a single layer of the stack. From that angle, AI is harsher and faster, but not fundamentally different in kind.

    Do not overfit your response to the latest tool while ignoring the longer pattern. The safest personal strategy is still the old one: keep learning, stay flexible, and avoid defining yourself by one implementation layer.

      Attribution:
    • ddingus #1
    • Verdex #1
    • SoftTalker #1

In plain English

FinTech
Financial technology, companies and software that provide banking, payments, investing, or other financial services.
LLM
Large language model, a machine learning system trained on large amounts of text that can generate and analyze language and code.
PR
Pull request, a proposed code change submitted for review before being merged into a shared codebase.
