HN Debrief

pg_durable: Microsoft open sources in-database durable execution

  • Databases
  • Open Source
  • Developer Tools
  • Infrastructure

pg_durable is a new open source PostgreSQL extension from Microsoft for durable execution. It lets you define workflows in SQL, persist their state in Postgres, wait on timers or signals, call out to HTTP, and resume from checkpoints after crashes. In plain terms, it tries to make Postgres the workflow engine as well as the data store for long-running, database-adjacent jobs such as ETL, AI pipelines, cron-style maintenance, and approval flows.
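To make "resume from checkpoints" concrete, here is a minimal sketch of the general idea in Python, using the standard library's sqlite3 as a stand-in for Postgres. It is not pg_durable's API: the checkpoints table, the step functions, and the crash_before parameter are all hypothetical names invented for illustration.

```python
import json
import sqlite3

# Hypothetical workflow steps; each receives the previous step's result.
def extract(_):
    return [1, 2, 3]

def transform(rows):
    return [r * 10 for r in rows]

def load(rows):
    return sum(rows)

STEPS = [("extract", extract), ("transform", transform), ("load", load)]

def init(conn):
    # One row per completed step; this table is an assumption of the sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "workflow_id TEXT, step TEXT, result TEXT, "
        "PRIMARY KEY (workflow_id, step))"
    )

def run_workflow(conn, workflow_id, crash_before=None):
    """Run steps in order, skipping any step that already has a checkpoint."""
    executed = []
    value = None
    for name, fn in STEPS:
        row = conn.execute(
            "SELECT result FROM checkpoints WHERE workflow_id = ? AND step = ?",
            (workflow_id, name),
        ).fetchone()
        if row is not None:
            value = json.loads(row[0])  # reuse the saved result, do not re-run
            continue
        if name == crash_before:
            raise RuntimeError(f"simulated crash before {name}")
        value = fn(value)
        with conn:  # commit the checkpoint before moving to the next step
            conn.execute(
                "INSERT INTO checkpoints VALUES (?, ?, ?)",
                (workflow_id, name, json.dumps(value)),
            )
        executed.append(name)
    return value, executed

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init(conn)
    try:
        run_workflow(conn, "wf-1", crash_before="load")
    except RuntimeError as exc:
        print("first attempt:", exc)
    print("second attempt:", run_workflow(conn, "wf-1"))
```

On the second attempt only the step that never completed is executed. The earlier results are read back from the table rather than recomputed.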

If your workflow is already mostly SQL and you want one stateful system with simpler backup and recovery, this is worth a look. If your team relies on strong app-layer tooling, testing, and readable code for changing business logic, expect real operational friction unless the surrounding tooling matures fast.

Discussion mood

Cautiously skeptical. People liked the ambition and the reliability model, but the dominant reaction was that the SQL-first DSL, weak tooling story, and extra load on Postgres make this feel niche rather than a clear Temporal or Airflow alternative.

Key insights

  1. Single-stateful-system recovery is the real pitch

    Keeping workflow state in the same Postgres instance means point-in-time recovery restores not just tables but in-flight jobs at the same moment. That is a real operational advantage for ETL and app workflows that already live close to the data, because backup, restore, and progress tracking stop being a distributed coordination problem.

    If your failure recovery plan currently spans Postgres plus a separate orchestrator database or queue, compare the restore procedure step by step. The more of your data and workflow state you can keep in one system, the more this design pays off.

      Attribution:
    • rswail #1
    • jpalomaki #1
    • thibaut_barrere #1
    • hmaxdml #1
  2. Tooling gaps matter more than database purity

    The hard blocker is not whether Postgres can run durable workflows. It can. What is missing is confidence in the lifecycle tooling around function versioning, testing, debugging, and observability. Even contributors acknowledged that best practices for versioning and release management are still being worked out, and readers with prior “do more in Postgres” experience said that is exactly where these systems get painful.

    Do not evaluate this on durability semantics alone. Ask how your team will diff, test, roll back, monitor, and safely evolve workflow definitions before you commit to putting production logic here.

      Attribution:
    • dietr1ch #1
    • gdecandia #1
    • CuriouslyC #1
    • affandar #1
  3. The syntax is a product problem, not a side issue

    People were willing to grant that the execution model is useful, but the SQL DSL looks alien enough that it obscures the value. The maintainer response was telling. Microsoft said their internal pipelines sit behind a higher-level language, which suggests the current public interface is closer to a runtime substrate than the form most teams will want to author directly.

    Treat the current SQL as an engine API, not necessarily the final developer interface. If you like the architecture, plan for wrappers, code generation, or a higher-level internal DSL rather than asking every engineer to hand-write workflow SQL.

      Attribution:
    • rswail #1
    • gdecandia #1
    • advertum #1
    • efitz #1
  4. Stored procedures are not inherently unversioned or untestable

    Several strong replies pushed back on the blanket anti-stored-procedure reaction. Database code can live in source control, ship through migrations, keep old versions side by side, and be tested against a real database in CI. The deeper point was that many teams already have SQL business logic scattered through application code with worse encapsulation and weaker guarantees than a disciplined database-first setup.

    If your instinctive objection is “database code cannot be engineered properly,” check whether that is a tooling gap in your organization rather than a law of nature. Teams with mature database delivery practices will evaluate this very differently from teams that treat SQL as an unmanaged side channel.

      Attribution:
    • dpark #1
    • pjmlp #1
    • jrumbut #1
    • giancarlostoro #1
  5. Bringing compute to the data can reduce orchestration overhead

    Supporters argued that many of these jobs hit Postgres heavily anyway, so moving orchestration closer to the data can remove network hops, cut failure surfaces, and potentially make load-aware coordination easier than it is with an external scheduler. That does not magically solve scaling, but it reframes the question from “why burden the database” to “why pay extra coordination cost for work that is already centered on the database.”

    For database-heavy jobs, measure round trips, retry complexity, and coordination overhead before assuming an external orchestrator is cleaner. The architectural win here comes from collapsing boundaries, not from making Postgres do unrelated work.

      Attribution:
    • gdecandia #1
    • sgarland #1
    • hmaxdml #1

Against the grain

  1. Snapshots can create two sources of truth

    Point-in-time restore sounds elegant until workflow definitions also exist in normal application code. If the database snapshot contains executable workflow state and the repo contains the current intended logic, recovery can leave you reconciling two different versions of what the system is supposed to do. That weakens one of the cleaner selling points unless the database truly owns the workflow definition.

    Decide upfront whether the database or your application repo is authoritative for workflow code. If you cannot answer that cleanly, restore and rollback procedures will get messy.

      Attribution:
    • regularfry #1
  2. Keeping logic outside the database is often a rational simplification

    Several replies rejected the idea that avoiding database-resident logic means not understanding Postgres. For many teams, the app layer has much stronger habits and tooling for testing, deployment, and managing complexity. They choose not to push more responsibility into the database because scaling app servers and evolving application code is easier than accepting a larger blast radius for migrations and more load concentrated in a single critical system.

    Do not let “Postgres can do this” turn into “Postgres should do this.” Match the design to the strengths of your team’s delivery process, not just to the raw capability of the database.

      Attribution:
    • Kaliboy #1
    • oofbey #1
    • pokstad #1
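One way to act on the "two sources of truth" concern is to record which definition each in-flight workflow started under, then compare that with the repo after a restore. The following is a rough sketch of such a check, not something pg_durable is known to provide. The record layout (id, name, definition_hash) and both function names are assumptions made for illustration.

```python
import hashlib

def definition_hash(source: str) -> str:
    """Fingerprint a workflow definition by hashing its source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()

def find_stale_workflows(in_flight, repo_definitions):
    """Return ids of in-flight workflows whose recorded definition no longer
    matches the definition currently in the repo (or is missing from it)."""
    stale = []
    for wf in in_flight:
        current = repo_definitions.get(wf["name"])
        if current is None or definition_hash(current) != wf["definition_hash"]:
            stale.append(wf["id"])
    return stale

if __name__ == "__main__":
    # Hypothetical state: what the repo says now vs. what a restored
    # snapshot says was running.
    repo = {"nightly_etl": "definition text, version 2"}
    restored = [
        {"id": 1, "name": "nightly_etl",
         "definition_hash": definition_hash("definition text, version 1")},
        {"id": 2, "name": "nightly_etl",
         "definition_hash": definition_hash("definition text, version 2")},
    ]
    print(find_stale_workflows(restored, repo))
```

A check like this does not decide which side is authoritative. It only makes the mismatch visible, so the decision is made deliberately rather than discovered during an incident.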

In plain English

  • CI
    Continuous Integration, the automated process that runs builds and tests when code changes are submitted.
  • DSL
    Domain-specific language, a programming or templating language designed for a narrow kind of task.
  • durable execution
    A way to run long-lived tasks so their state is saved and they can resume after crashes or restarts instead of starting over.
  • ETL
    Extract, Transform, Load, a common pattern for moving and reshaping data between systems.
  • HTTP
    Hypertext Transfer Protocol, the standard way web browsers and programs communicate with web servers.
  • Postgres
    PostgreSQL, a widely used open-source relational database.
  • PostgreSQL extension
    A plugin that adds new capabilities to PostgreSQL without changing the core database source code.
  • SQL
    Structured Query Language, the standard language used to define, query, and modify data in relational databases.
  • Temporal
    An open source workflow orchestration system for writing durable application workflows in general-purpose programming languages.

Reference links

Postgres workflow and queue projects

  • DBOS
    Mentioned as another example of the emerging Postgres-backed durable workflow and queue pattern
  • pgQue
    Referenced as another Postgres queue implementation in the same design space
  • Absurd
    Shared as a related project that tries to minimize the pure database approach
  • pgmq
    Brought up repeatedly as a PostgreSQL queue alternative or possible backend provider
  • pgflow
    Linked as a DAG package built around pgmq-compatible primitives
  • postgresisenough.dev
    Shared as a broader resource promoting Postgres-centered architectures

Microsoft durable workflow references

  • duroxide
    Referenced by a contributor as the open source durable execution framework underneath pg_durable, with function version support
  • Durable Task Framework overview
    Provided for comparison with Microsoft's earlier external durable workflow framework
  • Azure HorizonDB AI pipelines
    Pointed to as the internal product context showing Microsoft already uses this approach for AI workflows
