HN Debrief

The perils of UUID primary keys in SQLite

  • Databases
  • Infrastructure
  • Programming

The post benchmarks several SQLite schema choices for primary keys and shows why random UUIDs are a bad fit for SQLite’s storage model. The key distinction is not just "UUID versus int". It is whether inserts land in order or spray across a B-tree. In SQLite, an `INTEGER PRIMARY KEY` aliases the hidden `rowid`, so inserts are append-like and compact. A `WITHOUT ROWID` table keyed by UUIDv4 forces the table itself to be ordered by a random 128-bit value, which means more page splits, larger indexes, and much worse insert throughput. The author updated the post to fix a rowid detail and added a rowid-plus-secondary-UUID case, which still showed a penalty for random UUIDv4 because the secondary index is random even if the base table stays append-ordered.

If you use SQLite as a local or single-writer store, default integer rowids are still the safe baseline. If you need globally generated IDs for sync, offline creation, or distributed writes, prefer UUIDv7 or keep a local rowid for joins and add a separate UUID column for external identity.

Discussion mood

Mostly positive and pragmatic. People accepted the benchmark, but pushed back on any broad anti-UUID reading. The mood was that SQLite rowids are the right default, UUIDv4 is the real problem, and UUIDv7 or a separate UUID column keeps most of the distributed-system benefits without paying the full write penalty.

Key insights

  1. 01

    Why WITHOUT ROWID changes everything

    Making the primary key the table key removes the usual two-step lookup from UUID index to rowid to row. That is the only reason `WITHOUT ROWID` can win. It also means the table is physically organized by that key, so a random UUIDv4 turns every insert into scattered B-tree churn. With a normal rowid table, that randomness is pushed into a lighter secondary index instead of the main table.

    Use `WITHOUT ROWID` only when your natural primary key is the access path you actually use and it has decent insertion order. If your key is random, keep the rowid table and pay the extra lookup instead of wrecking the write path.

      Attribution:
    • keynha #1
  2. 02

    Dual-key SQLite schema is the practical compromise

    A strong pattern emerged around keeping SQLite’s integer rowid for joins and foreign keys while storing a UUIDv7 alongside it for global identity. That gives you fast local query execution and still lets records be created on clients, merged across devices, or de-duplicated across systems. One commenter described doing exactly this in a bitemporal system, using rowid for internal relationships and UUIDv7 for cross-system uniqueness.

    If you expect sync, offline creation, or imports from other systems, add a UUID column now even if rowid remains your internal key. Retrofitting global identity later is harder than carrying both from the start.

      Attribution:
    • adityaathalye #1 #2 #3
  3. 03

    Bigint IDs break at JSON boundaries

    The strongest pushback against plain integer IDs was not about database performance. It was about APIs and clients. Large 64-bit IDs routinely get rounded in browser JavaScript when they cross JSON as numbers. Prefixing IDs or shipping them as strings avoids that, and some people argued that if a value is never used for arithmetic it should not be treated as a numeric type at all.

    If you expose integer primary keys outside your backend, serialize them as strings or typed opaque IDs before a frontend or third-party client touches them. Do not rely on consumers to preserve 64-bit numeric precision correctly.

      Attribution:
    • Fabricio20 #1 #2
    • silvestrov #1
    • zmj #1
    • mort96 #1
  4. 04

    Client-generated IDs simplify offline and optimistic flows

    Several comments made the product argument for UUIDs. If the client can mint an ID up front, it can build object graphs locally, render optimistically, queue async writes, and merge later without first round-tripping to fetch a database-generated integer. That convenience is real, especially in sync-heavy or intermittently connected apps.

    When product requirements include offline work, background sync, or optimistic UI, treat client-side ID generation as a first-class requirement. That often justifies UUIDv7 even if a pure database benchmark favors rowids.

      Attribution:
    • willtemperley #1
    • Fire-Dragon-DoL #1
    • elcomet #1
    • victorbjorklund #1
  5. 05

    The benchmark is real but narrowly shaped

    The reported insert rates are high because the test batches writes, keeps the schema tiny, and uses only one index on an M1 Mac with `synchronous=FULL`. That does not invalidate the result. It tells you what the result actually measures. As rows widen, indexes multiply, and triggers or constraints do real work, the absolute numbers will fall, but locality effects still matter for inserts, reads, and updates through the page cache.

    Use the benchmark for direction, not capacity planning. Re-run the same key-layout comparison on your own schema once you have realistic row width, indexes, triggers, and query patterns.

      Attribution:
    • andersmurphy #1 #2 #3

Against the grain

  1. 01

    UUIDv4 still has a valid niche

    Random UUIDv4 can be a reasonable primary key when hiding ordering and record counts matters more than insert speed. Sequential integers leak table cardinality and adjacent IDs. UUIDv7 leaks creation time. In some systems that metadata exposure is the bigger problem than B-tree locality.

    If identifier predictability is a real threat model, do not let a write-throughput benchmark settle the design by itself. Decide explicitly whether count leakage or timestamp leakage is acceptable before choosing rowid, UUIDv7, or UUIDv4.

      Attribution:
    • sedatk #1
  2. 02

    Binary UUID storage costs usability

    Storing UUIDs in compact binary form saves space and can help performance, but it makes ad hoc database access much worse. Console tools become less useful, URLs need re-encoding, and every manual inspection step goes through an application layer. That convenience loss can outweigh the optimization for many teams.

    Before switching UUIDs from text to binary, check who debugs production data and how often they query the database directly. Human operability is part of performance too.

      Attribution:
    • themafia #1
  3. 03

    Random-index overhead may be a rounding error

    One skeptical view held that the benchmark overstates the practical difference between `rowid + UUID` and integer-only schemas because the test table is unrealistically small and lightly indexed. In a real application with wider rows and several indexes, the extra cost of one random UUID index may get lost in the noise while the global-identity benefits remain worth it.

    If you already know you need external or cross-instance IDs, do not overfit to microbenchmarks. Measure the full schema and workload before spending engineering time to avoid a cost that may not dominate in production.

      Attribution:
    • scotty79 #1

In plain english

B-tree
A tree-shaped data structure databases use for tables and indexes so they can keep data sorted and searchable efficiently.
bigint
A 64-bit integer type often used for database IDs because it can hold very large sequential numbers.
bitemporal
A data model that tracks both when a fact was true in the real world and when it was recorded in the database.
INTEGER PRIMARY KEY
In SQLite, a special column definition that aliases the table’s hidden rowid instead of creating a separate key.
JSON
JavaScript Object Notation, a common text format for structured data exchanged between programs.
PRIMARY KEY
The column or set of columns a database uses as the main unique identifier for each row.
rowid
SQLite’s hidden integer row identifier that many tables use internally as their real key.
SQLite
A lightweight embedded database that stores data in a single file.
UUID
Universally Unique Identifier, a 128-bit identifier designed to be globally unique across systems.
UUIDv4
A version of UUID generated mostly from random bits, which makes its values effectively unordered.
UUIDv7
A newer UUID format that includes a timestamp component so generated values are roughly time-ordered.
WITHOUT ROWID
A SQLite table mode that removes the hidden rowid and stores rows directly keyed by the declared primary key.

Reference links

SQLite documentation

Benchmarks and implementation notes

Related database docs