Postgres transactions are a distributed systems superpower

The post pitches Postgres as the coordination core for workflow systems. Instead of trying to atomically update a database and a separate message broker, it stores workflow state and the task queue in the same Postgres database as the business data. That gives you a real local transaction between business data and queued work, so creating an order and scheduling its processing succeed or fail together.

If your workflows mostly coordinate data you already keep in Postgres, a DB-backed queue can remove a lot of accidental complexity. Treat it as a deliberate centralization move, not a magic distributed transaction, and design every external side effect for retries and duplicates.
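
The "succeed or fail together" property can be sketched in a few lines. This is a minimal illustration, not the post's implementation: it uses Python's built-in sqlite3 module as a stand-in so it runs without a server, and the table names (orders, task_queue) and the fail_before_commit flag are hypothetical. The same shape applies to a Postgres connection, where both inserts would sit inside one transaction.

```python
import sqlite3


def setup_orders(conn):
    # Hypothetical schema: business data and the task queue live in one database.
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE task_queue ("
        " id INTEGER PRIMARY KEY,"
        " order_id INTEGER NOT NULL REFERENCES orders(id),"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )


def create_order(conn, item, fail_before_commit=False):
    """Insert the order and its processing task in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cur.lastrowid
        conn.execute("INSERT INTO task_queue (order_id) VALUES (?)", (order_id,))
        if fail_before_commit:
            # Simulated crash: neither row should survive.
            raise RuntimeError("simulated crash before commit")
    return order_id


def row_counts(conn):
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    tasks = conn.execute("SELECT COUNT(*) FROM task_queue").fetchone()[0]
    return orders, tasks
```

There is never a state with an order but no task, or a task but no order. That is the whole guarantee, and it holds only because both tables share one transaction.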

July 2, 2026
dbos.dev

Discussion mood

Mostly positive on the engineering tactic, skeptical of the framing. People liked using Postgres as a queue and workflow store because it buys real atomicity for local state changes and often simplifies systems. The pushback was aimed at claims that sounded like distributed transaction magic, plus warnings about duplicate work, external side effects, coupling, and queue throughput limits.

Key insights

A central database is the point

Using Postgres as both state store and queue is not a cheat. It is the deliberate move that makes the guarantee real. The system is still distributed because many workers run concurrently over a network, but the consistency-critical part has been collapsed into one coordinator. That framing makes the tradeoff legible instead of pretending a broker and a database can be updated atomically by abstraction alone.

If you adopt this pattern, say clearly that Postgres is your control plane. That will steer architecture decisions toward protecting and scaling that coordinator instead of hiding it behind vague service boundaries.

Attribution:

CodesInChaos #1
KraftyOne #1
cloudie78 #1
tomjakubowski #1

Exactly-once stops at the database boundary

The hard guarantee applies to inserting work into the Postgres-backed queue, not to what happens after a worker picks it up. Once processing begins, a crash can cause a step to be replayed, so execution is at-least-once: workflows need deterministic behavior and side-effecting steps need idempotency. That is the real contract, even if the enqueue itself is transactional.

Audit every job handler and external call for duplicate execution safety. If a step cannot be retried cleanly, this design has not removed your hardest failure mode.
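
One common way to make a replayed step safe is to record a per-step key in the same transaction as the step's local effect. The sketch below is an assumption-laden illustration, not something from the post: it uses sqlite3 as a stand-in, and the tables (processed_steps, balances) and the function name are hypothetical. In Postgres the equivalent of INSERT OR IGNORE is INSERT ... ON CONFLICT DO NOTHING.

```python
import sqlite3


def setup_idempotency(conn):
    # Hypothetical tables: a ledger of completed step keys plus some business state.
    conn.execute("CREATE TABLE processed_steps (step_key TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL)"
    )


def apply_credit_once(conn, step_key, account, cents):
    """Apply a credit at most once per step_key, even if the job is replayed.

    Returns True if the credit was applied, False if this was a replay.
    """
    with conn:  # key insert and balance update commit or roll back together
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_steps (step_key) VALUES (?)",
            (step_key,),
        )
        if cur.rowcount == 0:
            return False  # key already recorded: skip the side effect
        conn.execute(
            "UPDATE balances SET cents = cents + ? WHERE account = ?",
            (cents, account),
        )
    return True
```

This only covers effects inside the database. For an external call, the usual approach is to send the same key to the remote system and rely on it to deduplicate, which works only if that system supports idempotency keys.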

Attribution:

halfcat #1
andix #1
KraftyOne #1
jayd16 #1

You are buying simplicity with coupling

Tying workflow progression to database commits simplifies the outbox problem because the database becomes the orchestrator. It also creates a distributed monolith shape where schema changes and workflow semantics are tightly linked. Readers saw that as acceptable in many systems, but only if you admit that separation later will be harder than swapping a broker.

Use this pattern where the database is already the long-lived core of the service. Avoid it if your roadmap depends on independently evolving teams or replacing the persistence boundary later.

Attribution:

nyrikki #1
jdw64 #1
KraftyOne #1

Postgres queues are great until they are not

Several readers said a database queue is the right default for most applications. The warning was operational, not conceptual: every queue update or delete leaves a dead row version behind for vacuum to clean up, and that churn can become painful under high-throughput queue workloads, which is where dedicated brokers start earning their complexity. Until then, separate queue infrastructure is often premature optimization.

Start with a Postgres-backed queue if throughput is moderate and your team values simplicity. Set clear load thresholds for when broker features or queue write volume justify a move away from the database.

Attribution:

CodesInChaos #1
hoppp #1
KraftyOne #1

Strong metadata plus cheap bulk storage scales well

One useful extension is to reserve Postgres for the small slice that needs strong consistency, like manifests, workflow state, or object references, while storing large immutable blobs elsewhere such as S3. Examples like Ducklake were cited to show how a tiny centralized metadata plane can provide strong guarantees without forcing all data through the database. That sharpens the post's core idea into a broader architecture pattern.

Keep consistency-sensitive metadata in Postgres and push large append-only payloads to object storage. You get transactional control where it matters without turning the database into your bulk transport layer.
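
A minimal sketch of that split, under stated assumptions: an in-memory dictionary stands in for an S3-style bucket, sqlite3 stands in for Postgres, and the manifest table, FakeObjectStore class, and function names are all hypothetical. The ordering is the point. The immutable blob is uploaded first under a content-derived key, and only then is the reference committed, so a crash in between leaves an orphaned blob but never a manifest row pointing at missing data.

```python
import hashlib
import sqlite3


class FakeObjectStore:
    """In-memory stand-in for an S3-style bucket (hypothetical helper)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


def setup_manifest(conn):
    conn.execute(
        "CREATE TABLE manifest ("
        " name TEXT PRIMARY KEY,"
        " blob_key TEXT NOT NULL,"
        " size INTEGER NOT NULL)"
    )


def publish(conn, store, name, data):
    """Upload the blob, then atomically point the manifest entry at it."""
    blob_key = hashlib.sha256(data).hexdigest()  # content-addressed, never overwritten
    store.put(blob_key, data)  # step 1: bulk bytes go to object storage
    with conn:  # step 2: the small, consistent reference goes to the database
        conn.execute(
            "INSERT OR REPLACE INTO manifest (name, blob_key, size) VALUES (?, ?, ?)",
            (name, blob_key, len(data)),
        )
    return blob_key


def read(conn, store, name):
    row = conn.execute(
        "SELECT blob_key FROM manifest WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else store.get(row[0])
```

Readers always resolve a name through the manifest, so switching a name to a new version is a single-row transaction regardless of how large the payload is.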

Attribution:

munk-a #1 #2
brentjanderson #1

Against the grain

Outbox is not enough for hard failures

For non-transient failures in the second system, retrying from an outbox does not magically restore atomicity. If the remote write can be rejected permanently or can partially succeed in ways you cannot observe, you are in saga or two-phase commit territory. This cuts against the more relaxed "good enough" tone and matters whenever side effects are regulated, billable, or otherwise irreversible.

Map each downstream system into transient-failure or permanent-failure buckets before using this pattern. For the second category, plan compensating actions or stronger coordination up front.
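
The two buckets can be made explicit in code. The following is a rough sketch rather than a full saga implementation: the exception classes, run_step, and the retry limit are all hypothetical names chosen for illustration. Transient failures are retried, while a permanent failure (or running out of retries) triggers the compensating action instead.

```python
class TransientError(Exception):
    """Failure that may succeed on retry, such as a timeout (hypothetical)."""


class PermanentError(Exception):
    """Failure that retrying cannot fix, such as a rejected request (hypothetical)."""


def run_step(action, compensate, max_attempts=3):
    """Run one saga step; return 'done' or 'compensated'.

    action: callable that performs the downstream write.
    compensate: callable that undoes earlier local work if the step cannot finish.
    """
    for _ in range(max_attempts):
        try:
            action()
            return "done"
        except TransientError:
            continue  # worth another attempt
        except PermanentError:
            break  # retrying will not help
    # Either a permanent failure or retries exhausted: undo instead of retrying forever.
    compensate()
    return "compensated"
```

Because action may run several times, it still needs to be idempotent. Compensation does not remove that requirement; it only covers the case where the step can never complete.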

Attribution:

sarchertech #1
game_the0ry #1
mrkeen #1

Do not turn Postgres into your universal state fabric

Using Postgres as the authority for workflow and application state works well at reasonable scale, but readers pushed back on the idea that it generalizes cleanly to all distributed application state. That temptation usually ignores throughput limits and pushes too much traffic through one coordinator. The pattern is powerful precisely because it is selective, not universal.

Use Postgres for the correctness-critical path, not as a reflexive replacement for every cache, stream, and coordination layer. Keep asking which state truly needs transactions and which state can tolerate weaker guarantees.

Attribution:

evilturnip #1
munk-a #1
mrkeen #1

The title overclaims what was solved

Some readers thought the post blurred the line between local transactions and distributed transactions. Their complaint was not that the technique is useless, but that calling it a distributed systems superpower invites people to forget the unsolved part. Political distribution, independent authorities, and cross-system coordination do not disappear because one node now holds both tables.

When you explain this design internally, avoid language that suggests distributed atomicity. Precision here prevents bad assumptions in incident response and future integrations.

Attribution:

zyngaro #1
mrkeen #1 #2

In plain English

at-least-once

A delivery or execution guarantee that an event or job will happen one or more times, so duplicates are possible.

distributed monolith

A system split into multiple services that still behave like one tightly coupled application because they share deep dependencies such as one schema.

Ducklake

A data lakehouse project mentioned as an example of using Postgres for strongly consistent metadata over object storage.

idempotency

A property where running the same operation more than once has the same effect as running it once.

outbox pattern

A design where an application writes intended outgoing messages into a database table in the same transaction as its business data, then sends them to other systems later.

Postgres

PostgreSQL, an open source relational database known for transactions and strong consistency guarantees.

S3

Simple Storage Service, an object storage interface popularized by Amazon and widely implemented by other storage systems.

saga

A way to manage multi-step work across services by chaining local transactions with compensating actions when later steps fail.

two-phase commit

A protocol for coordinating a transaction across multiple systems by asking each one to prepare and then commit, which adds complexity and blocking risks.

Reference links

DBOS posts

Co-locating workflow state with your data
The submitted post arguing for storing workflow state and business data together in Postgres.
Postgres is All You Need for Durable Execution
Earlier DBOS post explaining the broader idea of building a workflow system on top of Postgres.
Making Postgres queues scale
Follow-up post cited to support the claim that a Postgres-backed task queue can run at scale.

Distributed systems references

The Two Generals' Problem
Linked as background on why coordination across independent systems cannot be made perfectly reliable by messaging alone.

Architecture examples

Ducklake
Cited as an example of using Postgres as a strongly consistent metadata catalog while storing bulk data elsewhere.
Warpstream
Mentioned as another example of separating a small consistency-focused control plane from high-volume data handling.