BINARY(16) or native uuid, and keeping UUIDs at the API boundary rather than internal to every join. A table configured like this on day one looks unremarkable:
```sql
-- representative day-one schema: UUIDv4 primary key stored as CHAR(36)
CREATE TABLE orders (
    id          CHAR(36)    NOT NULL PRIMARY KEY,
    customer_id CHAR(36)    NOT NULL,
    status      VARCHAR(20) NOT NULL,
    created_at  TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
    KEY idx_orders_customer (customer_id)
);
```
Inserts are fast, reads are fast, the ORM is happy. At 100,000 rows, it’s still fine. At 10 million, the nightly ingest job gets noticeably slower. At 200 million, inserts take 50 ms each instead of 2 ms, the buffer pool is constantly churning, and the secondary indexes are three to four times the size they’d be with a BIGINT primary key. Nothing about the schema changed. The table just got large enough for a design decision to start charging rent.
The obvious fix is “use BIGINT auto-increment.” That’s the right answer in a lot of cases and the wrong one in others — it reintroduces coordination requirements, leaks row counts through URL-exposed IDs, and doesn’t work for schemas that need to be generated offline or across shards. UUIDs exist because those constraints are real. The sharper question is: what exactly is UUIDv4 costing you at scale, and which of those costs have cheaper alternatives?
What random keys do to a B-tree
B-tree indexes are sorted structures. When the primary key is an auto-incrementing integer, every new row goes to the end — the rightmost leaf page is the only one that gets written to, and the rest of the index stays in cache undisturbed. Inserts are sequential and cheap.
UUIDv4 is random by design. Every new row lands at a random position in the B-tree. Instead of appending to one page, the engine has to:
- Find the right page somewhere in the middle of the tree.
- Load it into the buffer pool if it isn’t already (on a large table, it usually isn’t).
- Split it if it’s full.
- Write both halves back.
On a table with hundreds of millions of rows, the index doesn’t fit in memory, so most inserts trigger a random disk read before they can do anything else. The write amplification is real and measurable — a factor of 5–10× versus sequential inserts isn’t unusual.
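The effect is easy to reproduce in miniature. The sketch below is a toy model, not a real B-tree: each key's leaf page is taken to be its rank in the final sorted order, and a small LRU stands in for the buffer pool. The names (`PAGE_KEYS`, `cache_misses`) are invented for illustration.

```python
import random
from collections import OrderedDict

PAGE_KEYS = 128    # keys per leaf page in this toy index
CACHE_PAGES = 8    # "buffer pool" capacity, in pages

def cache_misses(insert_order):
    """Replay inserts against a toy index: a key's leaf page is its rank
    in the final sorted order // PAGE_KEYS. Count LRU cache misses."""
    rank = {k: i for i, k in enumerate(sorted(insert_order))}
    cache = OrderedDict()
    misses = 0
    for k in insert_order:
        page = rank[k] // PAGE_KEYS
        if page in cache:
            cache.move_to_end(page)       # hot page stays resident
        else:
            misses += 1                   # random read before the insert
            cache[page] = True
            if len(cache) > CACHE_PAGES:
                cache.popitem(last=False)
    return misses

n = 50_000
sequential = list(range(n))        # BIGINT auto-increment: append-only
shuffled = sequential[:]
random.shuffle(shuffled)           # UUIDv4: every insert lands anywhere

print(cache_misses(sequential))    # one miss per new page: about n / PAGE_KEYS
print(cache_misses(shuffled))      # approaches n: nearly every insert misses
```

With sequential keys the only cold page is the newly created rightmost one; with shuffled keys almost every insert has to fault a page in first, which is the 5–10× amplification in miniature.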
The damage doesn’t stop at the primary-key index. In InnoDB (MySQL), every secondary index includes a copy of the primary key at its leaves. A 36-byte CHAR(36) UUID embedded in every secondary index entry means larger indexes, more pages, more I/O — compared to an 8-byte BIGINT. Secondary indexes on a UUID-keyed table are routinely 3–4× the size of the same indexes on a BIGINT-keyed table. Every lookup through a secondary index reads more pages to cover the same rows.
PostgreSQL handles storage differently — its heap means the primary key is just another index, so the physical table isn’t ordered by it. But the primary-key index still suffers the same random-insertion pathology, and the write amplification from random page loads still applies.
Page splits compound over time. When a new UUID lands in a full page, InnoDB splits the page in two, each roughly half full. Over millions of inserts, the index develops internal fragmentation — pages allocated but only partially used. The index is physically larger than it needs to be, and scans read more pages for the same row count. OPTIMIZE TABLE (MySQL) or REINDEX (PostgreSQL) can repack the index, but on a busy table it’s a maintenance window you have to schedule.
UUIDv7: the insert-locality fix
UUIDv7 is the version most new code should reach for when UUIDs are the right answer. It encodes a Unix millisecond timestamp into the high 48 bits, with random bits filling the rest. Two practical consequences:
- Sortable. Sequential generation means new IDs land at the end of the B-tree, not scattered across it. Insert locality is close to a BIGINT’s. The pathological page-split behaviour of v4 goes away.
- Time-parseable. The creation time is embedded in the ID, recoverable from the primary key alone — useful for log correlation, rough time-range filtering, and debugging without reaching for created_at.
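To make the layout concrete, here is a hand-rolled v7 generator following RFC 9562's field order: 48 bits of Unix milliseconds, the version nibble, the variant bits, and random filler. In production you'd use a library rather than this sketch.

```python
import os
import time
import uuid

def uuidv7() -> uuid.UUID:
    """Hand-rolled UUIDv7: high 48 bits are Unix milliseconds, the
    version nibble is 7, the variant bits are 10, the rest is random."""
    ms = time.time_ns() // 1_000_000
    value = (ms << 80) | int.from_bytes(os.urandom(10), "big")
    value = (value & ~(0xF << 76)) | (0x7 << 76)   # version = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)   # variant = RFC 9562
    return uuid.UUID(int=value)

a = uuidv7()
time.sleep(0.002)                 # force a later millisecond
b = uuidv7()
assert a.version == 7
assert a < b                      # later millisecond sorts later
```

Note that two IDs generated within the same millisecond are ordered by their random bits, so the sortability guarantee is only millisecond-granular.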
```sql
-- PostgreSQL 18: generate a v7 UUID and read back its embedded timestamp
CREATE TABLE events (
    id      uuid NOT NULL DEFAULT uuidv7() PRIMARY KEY,
    payload jsonb
);

SELECT uuid_extract_timestamp(id) FROM events LIMIT 1;
```
uuid_extract_timestamp() has existed in PostgreSQL since 17 but only returned a value for UUIDv1. PG 18 extended it to support v7 alongside the new uuidv7() generator. One caveat: calling it in a WHERE clause (WHERE uuid_extract_timestamp(id) >= '2026-04-01') is non-SARGable and forces a scan — see Non-SARGable Predicates. For indexed time-range filtering, keep a created_at column as the query target, or compare against a boundary UUID generated at the target timestamp.
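The boundary-UUID approach can be done application-side. The helper below (name invented for illustration) builds the smallest possible v7 UUID for a given instant, with the timestamp bits set and every random bit zeroed, so the query stays a plain indexed range comparison on id:

```python
import uuid
from datetime import datetime, timezone

def v7_lower_bound(ts: datetime) -> uuid.UUID:
    """Smallest possible UUIDv7 for the instant ts: the 48-bit millisecond
    timestamp is set, every random bit is zero."""
    ms = int(ts.timestamp() * 1000)
    return uuid.UUID(int=(ms << 80) | (0x7 << 76) | (0x2 << 62))

# SARGable: WHERE id >= %(cutoff)s walks the primary-key index directly
cutoff = v7_lower_bound(datetime(2026, 4, 1, tzinfo=timezone.utc))
```

Any v7 UUID created at or after the cutoff instant compares greater than or equal to this boundary, so the filter is equivalent to the extraction-function version without the per-row function call.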
MySQL 8 doesn’t ship a v7 generator or a timestamp extractor, so application-side generation is the norm there — libraries exist in every major language, and most modern ORMs can be configured to generate v7 when you ask for UUIDs. Extraction is manual: for BINARY(16) storage (the recommended form), the first 6 bytes hold the millisecond timestamp.
```sql
-- MySQL: recover the creation time from a v7 UUID stored as BINARY(16)
SELECT FROM_UNIXTIME(
           CONV(HEX(SUBSTRING(id, 1, 6)), 16, 10) / 1000
       ) AS created_ts
FROM orders
LIMIT 1;
```
For CHAR(36) storage, the extraction strips hyphens first: CONCAT(SUBSTRING(id, 1, 8), SUBSTRING(id, 10, 4)) gives the 12 hex characters of the timestamp prefix. If your v1 UUIDs were stored with UUID_TO_BIN(id, 1) (the swap flag that reorders bytes for v1 index locality), the byte layout differs and the substring offsets change. Most v7-generating libraries skip the swap because v7 is already time-ordered without it — check what yours does before trusting the extraction.
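Application-side, the same extraction is a couple of lines. The helper below (name invented for illustration) assumes the unswapped byte order just described:

```python
import uuid
from datetime import datetime, timezone

def v7_created_at(raw: bytes) -> datetime:
    """Recover the embedded creation time from the 16 raw bytes of an
    unswapped v7 UUID: the first 6 bytes hold milliseconds since the epoch."""
    ms = int.from_bytes(raw[:6], "big")
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# a v7 UUID whose timestamp field encodes 1,700,000,000,000 ms
u = uuid.UUID(int=(1_700_000_000_000 << 80) | (0x7 << 76) | (0x2 << 62))
print(v7_created_at(u.bytes))   # 2023-11-14 22:13:20+00:00
```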
What v7 doesn’t change. It’s still 16 bytes on disk, and still 36 if you stored it as CHAR(36) — the insert-locality win doesn’t come with a storage discount, so the overhead versus a BIGINT is the same as v4. The readable creation timestamp is usually a feature and occasionally a problem: in systems where row-creation time is sensitive (order IDs revealing traffic patterns to competitors, user IDs exposing signup timing), it’s the one property v4 had that v7 gives up.
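The two representations side by side:

```python
import uuid

u = uuid.uuid4()
assert len(str(u)) == 36     # CHAR(36): 32 hex digits plus 4 hyphens
assert len(u.bytes) == 16    # BINARY(16) / native uuid: raw bytes
```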
CHAR(36) is what most ORM-generated schemas default to, because it’s the portable representation. BINARY(16) in MySQL or the native uuid type in PostgreSQL cuts storage by more than half and keeps comparisons on fixed-width binary instead of strings. Pick the narrow form on day one; retrofitting it later is a full-table rewrite that touches every secondary index.

UUID-to-integer mapping: keep UUIDs at the edge
The other workable fix is structural: expose UUIDs externally, use integers internally. A single lookup table maps the external UUID to an internal BIGINT, and every other table in the database uses the BIGINT as its foreign key. The UUID lookup happens once — at the API boundary — and everything downstream is fast, compact, 8-byte integer joins.
```sql
-- boundary table: the only place the UUID appears
CREATE TABLE entities (
    id   BIGINT     NOT NULL AUTO_INCREMENT PRIMARY KEY,
    uuid BINARY(16) NOT NULL,
    UNIQUE KEY uq_entities_uuid (uuid)
);

-- every other table joins on the 8-byte integer
CREATE TABLE orders (
    id        BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    entity_id BIGINT NOT NULL,
    KEY idx_orders_entity (entity_id)
);
```
The UUID column has a unique index, so the lookup is a single index seek — sub-millisecond regardless of table size. The rest of the schema gets 8-byte keys everywhere: smaller indexes, faster joins, no page splits, no secondary-index bloat. The external-facing API still uses UUIDs, so you don’t leak sequence information or row counts.
The trade-off is an extra layer of indirection. Every inbound request resolves the UUID before anything else; in practice this is negligible (one indexed lookup), but it means the schema has two identity systems to maintain. For long-lived OLTP applications where every join on every table pays the UUID cost, this structure is often worth the extra lookup.
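A minimal sketch of the two-identity layout, using SQLite in place of a production database (table and function names invented for illustration):

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE entities (
        id   INTEGER PRIMARY KEY,   -- internal key, used by every FK
        uuid TEXT NOT NULL UNIQUE   -- external, API-facing identity
    )""")
db.execute("""
    CREATE TABLE orders (
        id        INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL REFERENCES entities (id)
    )""")

def resolve(external_id: str) -> int:
    """The one UUID lookup per request: a single seek on the unique index."""
    row = db.execute(
        "SELECT id FROM entities WHERE uuid = ?", (external_id,)
    ).fetchone()
    if row is None:
        raise KeyError(external_id)
    return row[0]

ext = str(uuid.uuid4())
db.execute("INSERT INTO entities (uuid) VALUES (?)", (ext,))
internal = resolve(ext)                       # resolved once, at the boundary
db.execute("INSERT INTO orders (entity_id) VALUES (?)", (internal,))
```

Everything past `resolve()` joins on integers; the UUID never appears in a foreign key.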
When random UUIDs are actually fine
Not every schema needs to bend. Three cases where UUIDv4 as a primary key is a defensible choice:
Small tables that stay small. A configuration table, a lookup table, a feature-flag table. At 50,000 rows the page-split pathology doesn’t show up, secondary indexes are tiny, and the convenience of client-generated IDs outweighs any cost.
Write rates low enough that random I/O doesn’t matter. An admin tool recording 50 events per minute doesn’t care about write amplification. The index fits in cache, every page is warm, page splits happen rarely enough that fragmentation stays manageable. “Doesn’t survive scale” is only a problem at scale.
Information-leak concerns that outweigh performance. If hiding creation-order is a hard requirement (competitive, privacy, or security), v7’s embedded timestamp is a non-starter and v4 is the only UUID version that meets the requirement. Pay the write-amplification cost and use the UUID-to-integer mapping to contain the damage.
The bigger picture
UUIDv4 is a tool that solved a coordination problem — distributed ID generation without central authority — and accidentally became the default for everything, including the cases where coordination wasn’t a problem and the cost of random writes is non-trivial. “Pick a UUID for your PK” is a decision most schemas make without ever being explicit about what they’re trading.
The decision matrix is short. Do you need globally unique, coordination-free IDs? If no, use BIGINT. If yes, use UUIDv7 and store it as BINARY(16) or native uuid — never CHAR(36). If v7’s embedded timestamp is a problem, use v4 but keep it at the API boundary and use integers inside the schema. Each of those decisions costs almost nothing on day one and saves a lot of rework at 100 million rows — which is the point where “UUIDs as primary keys” stops being a default and starts being a choice with real consequences.