HN Debrief

Now AI agents need what RSS does

The post argues that AI agents are rediscovering the problem RSS solved for humans twenty years ago: websites change over time, HTML is noisy, and polling full pages is a bad way to detect updates. A feed gives machines a clean stream of deltas, which makes monitoring and ingestion simpler. Commenters mostly accepted that framing. The interesting turn was that the conversation stopped being about RSS as a format and treated it as a proxy for a broader need: cheap, structured, machine-consumable updates. JSON, XML, CSV, pipelines that scrape sites into structured data, and topic-level subscriptions all fit the same pattern.

If agent-driven products depend on fresh public web data, structured feeds are less a nostalgia play than a cost, reliability, and platform-control issue that publishers may soon monetize or restrict.

Discussion mood

Mostly positive about structured feeds as the right technical shape, with a strong undercurrent of skepticism and defensiveness from publishers and operators who do not want AI crawlers consuming content for free or adding load without sending traffic back.

Key insights

  1. Agent-friendly feeds look less like a convenience and more like a survival tactic for inference budgets.
    The key point was not that RSS is elegant. It was that forcing models to repeatedly tokenize full HTML pages burns money on navigation, ads, and layout junk that feeds remove before the model ever sees it.

    Structured update feeds can act as a cost-control primitive for inference. If you are building agent products on live web data, sending raw HTML to the model everywhere is financially sloppy.
      Attribution:
    • eugeneonai #1
  2. The winning abstraction is structured change data, not RSS specifically.
    RSS is just one old, widely deployed envelope for a more general requirement: public content needs a stable machine interface, and agents do not care whether that arrives as XML, JSON, CSV, or a scraper that emits normalized records.

    Do not overfit on the protocol. The durable need is incremental, structured access to changing content.
      Attribution:
    • nreece #1
    • _pdp_ #1
  3. The more interesting product idea is not generic feed reading but concept subscription.
    People want systems that watch the web for things they care about and return a personalized stream, yet commenters pointed out that this gets hard fast once the target is a fuzzy real-world concept like statements from your political representatives about firearm control. Keyword search is enough for some workflows, but reliable semantic monitoring still needs better indexing and query models.

    Personalized agent feeds are plausible now. High-value alerting still breaks on concept ambiguity and retrieval quality.
      Attribution:
    • analogpixel #1
    • acgourley #1 #2
    • DSemba #1
  4. RSS is strong at forward updates and weak at archive reconstruction unless readers add extra machinery.
    That matters because many people now want agents and readers to do both ongoing monitoring and historical backfill. NewsBlur's archive feature, Refeed, and local readers like Elfeed exist because the base feed model does not reliably expose full backlog history.

    Feeds solve freshness better than history. Products that promise full knowledge capture need a separate archive strategy.
      Attribution:
    • erelong #1
    • conesus #1
    • jayemar #1
    • phyzix5761 #1
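
The "structured change data" idea running through these insights can be sketched in a few lines. This is a minimal illustration, not code from the thread: the inlined sample feed, the `new_items` function, and the `seen` set are all assumptions made for the example. It parses RSS 2.0 with Python's standard library, normalizes each item into a plain record, and returns only the items whose GUID has not been seen before.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <guid>post-2</guid>
      <title>Second post</title>
      <link>https://example.com/2</link>
    </item>
    <item>
      <guid>post-1</guid>
      <title>First post</title>
      <link>https://example.com/1</link>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen):
    """Return normalized records for items whose GUID is not in `seen`.

    `seen` is mutated, so a later call reports only later additions.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when a feed omits <guid>.
        guid = item.findtext("guid") or item.findtext("link")
        if guid is None or guid in seen:
            continue
        seen.add(guid)
        fresh.append({
            "id": guid,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return fresh
```

On a fresh `seen` set the first call returns both items, and a second call on the same feed returns nothing, which is the delta behavior an agent wants. The sketch also shows the archive limit from item 4: only entries the feed currently lists can ever be returned, so historical backfill needs a separate source.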

Against the grain

  1. Open feeds may be a bad deal for content owners in an AI world.
    Several commenters treated agent access as uncompensated extraction, then moved quickly from complaint to defensive tactics like partial feeds, bot filtering, IP blocks, and gated full-text feeds for approved readers or paying supporters.

    Publisher resistance is not theoretical. If agent usage grows, expect access controls to spread from websites into feeds.
      Attribution:
    • b3ing #1
    • 8organicbits #1
    • solid_fuel #1
    • themafia #1
  2. Polling-based syndication still has the old failure modes and AI does not fix them.
    Correctness, caching, latency, and server load remain in tension, which is why some people argued that push-style systems like ActivityPub or PubSubHubbub are the cleaner answer if you actually want timely machine consumption at scale.

    RSS is a pragmatic fit, not a perfect one. Push protocols may be better aligned with heavy machine readership.
      Attribution:
    • PaulHoule #1 #2
    • solid_fuel #1
  3. The obstacle may be advertising rather than technology.
    One commenter argued Google Reader died because RSS bypassed ads, and AI agents have the same property: they consume content without generating monetizable impressions.

    Any agent-friendly content layer that strips ads will run into the same incentive wall that limited consumer RSS.
      Attribution:
    • amai #1
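
The load side of the polling tension in item 2 is often softened with HTTP conditional requests (`ETag` and `If-None-Match`). That mechanism is standard HTTP, not something this summary attributes to the thread, so treat the following as an illustrative sketch. `FakeFeedServer`, `Poller`, and `Response` are hypothetical names, and the server is an in-memory stand-in so the example runs without a network.

```python
from dataclasses import dataclass
import hashlib


@dataclass
class Response:
    status: int
    etag: str
    body: str


class FakeFeedServer:
    """Hypothetical in-memory stand-in for a feed endpoint."""

    def __init__(self, body):
        self.body = body
        self.bytes_sent = 0  # body bytes actually transferred

    def get(self, if_none_match=None):
        # Derive a validator from the current content.
        etag = hashlib.sha256(self.body.encode()).hexdigest()[:16]
        if if_none_match == etag:
            # Unchanged: 304 Not Modified carries no body.
            return Response(304, etag, "")
        self.bytes_sent += len(self.body.encode())
        return Response(200, etag, self.body)


class Poller:
    """Remembers the last ETag and sends it back on every poll."""

    def __init__(self, server):
        self.server = server
        self.etag = None
        self.body = None

    def poll(self):
        """Return True if the feed changed since the last poll."""
        resp = self.server.get(if_none_match=self.etag)
        if resp.status == 304:
            return False
        self.etag = resp.etag
        self.body = resp.body
        return True
```

Repeated polls of an unchanged feed transfer the body only once here. Conditional requests reduce bandwidth but not the number of requests or the delay between a change and the next poll, which is the gap the push-style proposals are aimed at.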

Reference links

RSS readers and archive tooling

  • NewsBlur Premium Archive subscriptions
    Explains one approach to backfilling complete blog archives beyond what a standard RSS feed exposes.
  • Refeed
    Shared as a tool for reading a newly followed blog from the beginning rather than only from the moment of subscription.
  • FreshRSS
    Mentioned as a self-hosted RSS reader with a built-in web scraper for creating feeds from sites without one.

Examples and projects mentioned

  • Particle News
    Shared as an example of a product adjacent to personalized news and agent-style information feeds.
  • Engineered.at
    An example project built around RSS-fed content with LLM summaries and topic subscriptions.
  • source_monitor Rails engine
    Open source code for the RSS-based site mentioned in the comments.
  • PageForth
    Shared as a project using Apple's local LLM to filter and summarize feed-like sources based on user interests.