REPEATABLE READ gap locks, foreign-key shared locks on hot parent rows, unique-index conflicts, index scans that lock more rows than they return, and parallelism patterns that only surface on replicas or under worker-pool load. This is Part 1 — the patterns themselves and why they happen. Part 2 covers diagnosis, retry architecture, and prevention.

Deadlocks occupy a strange place in production operations. They’re rare enough that most engineers haven’t thought hard about them, frequent enough in high-concurrency workloads to show up as paging incidents, and subtle enough that the first instinct — “just retry” — is right often enough to keep the root cause hidden. The transaction that got killed was syntactically perfect. The one that survived was too. The bug wasn’t in either statement; it was in the order the two transactions touched rows.
That makes deadlocks harder to reason about than most database failures. The query text in the error log isn’t wrong. The lock it was waiting for isn’t held by a misbehaving process. The system is doing exactly what concurrency control says it should. The failure mode is the interaction between transactions, and those interactions are almost never visible from any single query.
This post covers the patterns — the shapes deadlocks take and why each one exists. The companion post covers reading the deadlock log end-to-end, retry architecture, hot-row isolation, SERIALIZABLE’s serialization failures, DDL migration windows, and NOWAIT / SKIP LOCKED as prevention primitives.
What a deadlock actually is
A deadlock is a cycle in the wait-for graph. Transaction A holds lock L1 and needs L2; transaction B holds L2 and needs L1. Neither can proceed. The only way out is to kill one — pick a victim, roll it back, release its locks, and let the other complete. Every modern relational database does this automatically, usually within hundreds of milliseconds.
The three preconditions are always the same:
- Two or more transactions hold locks.
- Each needs a lock the other holds.
- The locks can’t be acquired atomically (there’s no single “grab both or grab nothing” operation).
Remove any one and the deadlock can’t form. In practice, that’s the shape of every prevention strategy — either reduce the number of locks held concurrently, reduce the duration they’re held, or make the acquisition order consistent across all code paths.
innodb_lock_wait_timeout (MySQL) and lock_timeout (PostgreSQL) govern the second lever: they bound how long a transaction blocks waiting for a single lock, which caps how long its own locks stay held while it waits. The deadlock detector is a separate mechanism that fires independently of those timers.

The canonical lock-ordering deadlock
The single most common production deadlock is two transactions updating the same two rows in opposite orders:
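A minimal sketch of the interleaving, using a hypothetical accounts table; the session labels in comments mark who runs each statement:

```sql
-- Hypothetical schema: accounts(id BIGINT PRIMARY KEY, balance BIGINT)

-- Session A:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- A locks row 1

-- Session B:
BEGIN;
UPDATE accounts SET balance = balance + 10 WHERE id = 2;  -- B locks row 2

-- Session A:
UPDATE accounts SET balance = balance + 10 WHERE id = 2;  -- A waits on B

-- Session B:
UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- cycle: one is killed
-- MySQL reports ER_LOCK_DEADLOCK (1213); PostgreSQL reports SQLSTATE 40P01.
```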
If A and B interleave such that A takes the row-level lock on row 1, B takes the row-level lock on row 2, and each then tries to grab the other row, the engine has a cycle. One gets killed.
The key property is that neither transaction is wrong in isolation. Each acquires locks in an order that’s locally correct. The cycle forms in the global ordering across concurrent sessions, which no single query can see. That’s the defining shape of the pattern: correct code, in both cases, interacting at the transaction boundary. The fix is in how all code paths agree on an ordering — covered in Part 2: Consistent lock ordering.
InnoDB gap locks turn inserts into deadlock sources
MySQL’s default isolation level is REPEATABLE READ, and under that isolation level, InnoDB takes next-key locks — a row lock plus a gap lock on the range before it. The gap lock prevents other transactions from inserting into that range, which is how REPEATABLE READ keeps range queries consistent across re-execution.
The consequence: a SELECT ... FOR UPDATE or an UPDATE with a range predicate locks not just the matching rows, but the gaps between them. Two concurrent transactions that both try to insert into the same gap can deadlock without sharing a single row:
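A sketch of the gap-lock shape on InnoDB under REPEATABLE READ, with a hypothetical tasks table holding ids 1 and 10 (so the gap between them is open):

```sql
-- Hypothetical table: tasks(id BIGINT PRIMARY KEY), existing rows id = 1 and id = 10

-- Session A:
BEGIN;
SELECT * FROM tasks WHERE id BETWEEN 2 AND 9 FOR UPDATE;  -- locks the gap (1, 10)

-- Session B:
BEGIN;
SELECT * FROM tasks WHERE id BETWEEN 2 AND 9 FOR UPDATE;  -- gap locks are mutually
                                                          -- compatible; B holds it too

-- Session A:
INSERT INTO tasks (id) VALUES (5);  -- insert intention waits on B's gap lock

-- Session B:
INSERT INTO tasks (id) VALUES (6);  -- waits on A's gap lock: cycle, one is killed
```

The compatibility of gap locks with each other is what makes this possible: both sessions acquire the range, and the conflict only appears when each tries to insert into it.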
Two transactions inserting into what looks like “different rows” can still cycle through gap locks. The failure mode is especially insidious because the EXPLAIN plan doesn’t mention gaps — only rows — and the lock information in SHOW ENGINE INNODB STATUS requires reading the next-key notation carefully.
The category is peculiar to InnoDB under REPEATABLE READ. PostgreSQL prevents phantom reads through MVCC snapshot isolation starting at its own REPEATABLE READ level (stricter than the SQL standard requires) without any range-locking mechanism — so the whole class of gap-lock deadlocks doesn’t exist on PG, at any isolation level. Under READ COMMITTED on MySQL, gap locks are disabled for most searches and index scans but retained for foreign-key and duplicate-key checking — which is the first lever most teams reach for once this pattern dominates their deadlocks, though it doesn’t eliminate the gap-lock category entirely. The isolation-level trade-off and the “avoid range locks on write paths” refactor both live in Part 2: Isolation-level trade-offs.
Unique-index deadlocks are a category of their own
The detailed patterns are covered in Uniqueness and Selectivity — unique indexes concentrate deadlock pressure — but the shape is worth naming here: when InnoDB detects a duplicate-key error on an INSERT, it acquires a shared lock on the conflicting index record before raising the error. Under REPEATABLE READ that shared lock is next-key (record + gap). Under READ COMMITTED, gap locks are mostly disabled — but duplicate-key checking is one of the documented exceptions where gap locking still occurs, so dropping isolation alone doesn’t eliminate the category. Two concurrent transactions inserting toward the same unique key end up holding shared locks and waiting for each other to release — a deadlock caused entirely by the uniqueness check, not by the rows the application thought it was writing.
INSERT ... ON DUPLICATE KEY UPDATE behaves differently: on conflict it takes an exclusive lock instead of a shared one, because the statement is about to modify the row. This matters for reasoning about cycles — two concurrent ODKU statements contend on exclusive locks (mutually exclusive, one always waits), whereas two concurrent plain INSERTs can both hold shared locks at once and then deadlock when either tries to upgrade. Blog posts and older documentation sometimes conflate the two; the locking rules are documented in the MySQL reference — Locks Set by Different SQL Statements in InnoDB.
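A sketch of the plain-INSERT shape, following the three-session scenario in the MySQL reference; the table and value are hypothetical:

```sql
-- Hypothetical table: users(email VARCHAR(255) PRIMARY KEY)

-- Session A:
BEGIN;
INSERT INTO users VALUES ('x@example.com');  -- A holds an exclusive record lock

-- Sessions B and C:
INSERT INTO users VALUES ('x@example.com');  -- each queues for a SHARED lock
                                             -- (the duplicate-key check)

-- Session A:
ROLLBACK;  -- both B and C are granted their shared locks on the freed record

-- B and C each now need an exclusive lock to complete the insert; each is
-- blocked by the other's shared lock: deadlock. One is killed, the other inserts.
```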
The equivalent in PostgreSQL is less severe — the duplicate-key check doesn’t hold long-lived shared locks the same way — but INSERT ... ON CONFLICT with multiple unique indexes can still produce deadlocks when batches touch overlapping keys in different orders. The shape is the same across engines: the uniqueness check itself is what forces the extra locking, and the cycle forms when two sessions approach the same key from different batches.
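A sketch of the opposing-batch shape on PostgreSQL; the table and values are hypothetical:

```sql
-- Hypothetical table: prefs(user_id BIGINT PRIMARY KEY, val TEXT)

-- Session A:
BEGIN;
INSERT INTO prefs VALUES (1, 'a')
  ON CONFLICT (user_id) DO UPDATE SET val = EXCLUDED.val;  -- locks key 1

-- Session B:
BEGIN;
INSERT INTO prefs VALUES (2, 'b')
  ON CONFLICT (user_id) DO UPDATE SET val = EXCLUDED.val;  -- locks key 2

-- Session A:
INSERT INTO prefs VALUES (2, 'a2')
  ON CONFLICT (user_id) DO UPDATE SET val = EXCLUDED.val;  -- waits on B

-- Session B:
INSERT INTO prefs VALUES (1, 'b2')
  ON CONFLICT (user_id) DO UPDATE SET val = EXCLUDED.val;  -- cycle: SQLSTATE 40P01
```

Sorting each batch by key before upserting removes the opposing acquisition order, which is the usual application-side fix.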
Foreign keys take shared locks you didn’t ask for
Both MySQL and PostgreSQL acquire shared locks on the referenced row when you insert or update a row with a foreign key. The purpose is to prevent the referenced row from being deleted mid-transaction — you can’t have an orders.customer_id pointing to a customers.id that’s being concurrently deleted.
The side effect is that a high-write child table concentrates shared locks on hot parent rows:
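A sketch of the implicit FK lock, with a hypothetical parent/child schema; a DELETE is used for the exclusive side because it conflicts with the FK shared lock on both engines (in PostgreSQL, an update to a non-key column like name takes FOR NO KEY UPDATE, which does not conflict with FOR KEY SHARE):

```sql
-- Hypothetical schema:
--   customers(id BIGINT PRIMARY KEY, name TEXT)
--   orders(id BIGINT PRIMARY KEY,
--          customer_id BIGINT REFERENCES customers(id))

-- Session A:
BEGIN;
INSERT INTO orders VALUES (1001, 42);
-- The FK check takes a shared lock on customers id = 42
-- (a FOR KEY SHARE tuple lock in PostgreSQL, a shared record lock in InnoDB)

-- Session B:
BEGIN;
DELETE FROM customers WHERE id = 42;  -- needs exclusive; blocks on A

-- If Session A now needs any lock Session B already holds (a trigger,
-- a second statement), the wait-for graph gains a cycle.
```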
Shared locks don’t block each other, so concurrent inserts coexist fine. What breaks is the interaction with any transaction that wants an exclusive lock on the parent row — an update to the customer’s name, a soft-delete, a trigger that updates a cached counter. Suddenly, dozens of shared-lock holders are blocking one exclusive-lock request, and if any of them start trying to acquire other locks (say, through a trigger cascade), a cycle can form.
The symptoms: deadlocks that mention tables far removed from the one the application thought it was touching. “Why is my UPDATE customers deadlocking against an INSERT INTO order_items?” Because the order_items insert took a shared lock on customers through the FK chain, and the UPDATE wanted exclusive on the same row.
This is one of the hardest patterns to diagnose on sight, because the offending query never references the contended table explicitly. Mitigations — narrowing FK scope, moving hot parent-row updates to side tables, isolation-level trade-offs — are covered in Part 2: Hot-row isolation.
Index scans lock more rows than queries return
Under InnoDB’s default REPEATABLE READ, an UPDATE with a WHERE clause on a non-indexed column acquires a record lock on every row it scans, not just the ones that match. The engine has to examine each row to check the predicate, and it takes a lock to guarantee the check is stable for the duration of the transaction.
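A sketch of the unindexed-predicate shape; the table and column names are hypothetical, and EXPLAIN is the quickest way to spot it:

```sql
-- Hypothetical table: orders(id BIGINT PRIMARY KEY, status VARCHAR(20))
-- with NO index on status.

BEGIN;
UPDATE orders SET status = 'archived' WHERE status = 'shipped';
-- Under InnoDB REPEATABLE READ this walks the clustered index and takes a
-- record lock on every row it examines, not just the 'shipped' ones.

EXPLAIN UPDATE orders SET status = 'archived' WHERE status = 'shipped';
-- type: ALL (full scan) is the warning sign; an index on status shrinks
-- both the scan and the lock footprint.
```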
If the table has a million rows and only a thousand match, all million get locked for the duration of the update. Any concurrent transaction touching any of those rows has to wait — which inflates the wait-for graph and makes deadlocks more likely.
Under READ COMMITTED, InnoDB narrows this substantially: per the docs, it releases locks on non-matching rows after the WHERE evaluation and uses semi-consistent reads — returning the latest committed version of an already-locked row so the engine can check whether it matches the WHERE before deciding to wait. The net effect is dramatically lower lock footprint and deadlock risk on the same query. PostgreSQL behaves similarly by default: only rows actually updated retain their locks. This is one of the few cases where the same underlying issue — an unindexed predicate — shows up as both a latency problem and a concurrency problem, and where the concurrency angle is specifically a REPEATABLE READ-on-InnoDB amplifier.
An UPDATE whose WHERE status = 'pending' predicate uses a status index locks the matching secondary-index entries and the corresponding PK entries. Transactions that approach the same rows from different indexes (one via status, another via customer_id) can deadlock on the PK-side lock even though their index-side locks don’t overlap. This is the most common “why are these two queries deadlocking, they don’t even reference the same columns?” failure mode.

Parallelism-induced deadlocks
The lock-ordering patterns above assume two separate transactions from two separate sessions. Parallelism adds a few variants that don’t fit that frame — the cycle can form inside a single logical unit of work, or show up only on a replica that never issued the original statements.
Worker pools racing on a shared queue. The archetypal production pattern: N application workers pulling jobs from the same table (jobs, outbox, email_queue) and locking rows for processing. If every worker does SELECT ... FOR UPDATE on “the next available batch” without a deterministic ordering, two workers can grab overlapping row sets in opposite orders and cycle. This is the same lock-ordering cycle from earlier — just distributed across workers that all look identical from a code-review perspective.
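A sketch of the racy batch-claim and the ordering that removes the cycle; the queue table is hypothetical:

```sql
-- Hypothetical queue table: jobs(id BIGINT PRIMARY KEY, state VARCHAR(16))

-- The racy shape: no deterministic order on the batch
BEGIN;
SELECT id FROM jobs WHERE state = 'ready' LIMIT 10 FOR UPDATE;
-- Two workers can lock overlapping id sets in different physical orders
-- and cycle while each waits for rows the other already holds.

-- A deterministic ORDER BY makes every worker acquire locks in the same
-- sequence, so contending workers queue instead of cycling:
BEGIN;
SELECT id FROM jobs WHERE state = 'ready' ORDER BY id LIMIT 10 FOR UPDATE;
```

The ordered version trades throughput for safety, since workers now serialize on the head of the queue; the primitives that recover the throughput are Part 2 material.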
Intra-query parallel workers. PostgreSQL has a full parallel query executor — parallel sequential scans, bitmap heap scans, index and index-only scans (B-tree), parallel aggregates, parallel joins — that spawns worker processes to cooperate on a single query. MySQL has a much narrower feature: innodb_parallel_read_threads (added in 8.0.14) enables parallel scanning of the clustered index, used initially by CHECK TABLE and extended to unconditional SELECT COUNT(*) in 8.0.17. It is not general parallel query — MySQL does not parallelize arbitrary SELECTs, joins, or aggregates. In both engines, workers coordinating on a single query don’t deadlock among themselves in normal operation — the engine manages the shared lock state. What can happen is a parallel worker holds a lock an unrelated transaction needs, and the parallel query itself takes longer than a serial one would, widening the wait window. Usually not a direct deadlock source, but it changes the timing of existing ones.
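A sketch of how narrow the MySQL case is, assuming a hypothetical orders table; innodb_parallel_read_threads is session-scoped:

```sql
-- MySQL 8.0.17+: the parallel read path covers CHECK TABLE and
-- unconditional COUNT(*) over the clustered index, not general queries.
SET SESSION innodb_parallel_read_threads = 4;

SELECT COUNT(*) FROM orders;                         -- eligible for parallel scan
SELECT COUNT(*) FROM orders WHERE status = 'paid';   -- not eligible (predicate)
```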
Parallel replication on replicas. MySQL’s multi-threaded replica applies committed transactions in parallel. Transactions that committed serially on the source — no possibility of deadlock there — can deadlock on the replica because the applier threads are racing on rows the source never had concurrent writers on. The replica’s deadlock detector resolves them the same way it would a live deadlock, but they show up in the replica’s error log with no corresponding entry on the source. Since MySQL 8.0.27, replica_parallel_type=LOGICAL_CLOCK and replica_parallel_workers=4 are the defaults, and replica_parallel_type was deprecated in 8.0.29 — LOGICAL_CLOCK is effectively the only supported mode going forward. The slave_* → replica_* rename happened in 8.0.26; older deployments and blog posts still use the legacy names. PostgreSQL 16+ introduced parallel apply for logical replication (streaming = parallel is the default on CREATE SUBSCRIPTION), which exposes the same class of apply-side cycles on a setup that historically didn’t have them — a surprise for teams upgrading from 15 and earlier.
Parallel/online DDL interacting with DML. Tools like pt-online-schema-change and gh-ost run concurrent DML against the table being altered (through triggers or a row-copy process). Under load, the trigger-installed writes and the copy process can both take locks on the same rows the application is updating, and the wait-for graph gains edges that wouldn’t exist during a normal workload. This rarely manifests as a hard deadlock — the tools are written defensively — but it does manifest as elevated deadlock rates during the migration window.
None of these are properties of the queries themselves. They’re properties of how work gets distributed across workers, processes, or replicas — which means they’re invisible to query-level review and only surface when the deadlock counter is watched over time. The primitives for fixing them — SKIP LOCKED, NOWAIT, advisory locks, DDL timeouts — are covered in Part 2: NOWAIT and SKIP LOCKED.
Engine-level differences that shape the patterns
The same pattern can deadlock on one engine and not the other. These differences are pattern-shaping — they change which of the above sections apply to your workload. Operational tuning (detector cost, wait timeouts, monitoring) is covered in Part 2: Monitoring.
- Default isolation. PostgreSQL defaults to READ COMMITTED. MySQL defaults to REPEATABLE READ (with gap locks). The same application code has measurably different deadlock rates between the two because of this alone — before any other tuning.
- Gap locks. Only InnoDB has them, and only under REPEATABLE READ (plus the foreign-key and duplicate-key exceptions that retain gap locking even under READ COMMITTED). PostgreSQL prevents phantom reads through MVCC at its own REPEATABLE READ — stricter than the SQL standard requires — without a range-locking mechanism, so the entire gap-lock deadlock category doesn’t exist on PG at any isolation level.
- Lock granularity. PostgreSQL takes row-level (tuple) locks; InnoDB takes record locks on index entries (with next-key extension under REPEATABLE READ). The practical consequence is that InnoDB locks are more entangled with index choice than PostgreSQL’s — changing which index a query uses can change which rows and gaps get locked.
- FK lock style. MySQL’s FK check holds a shared lock on the referenced row — next-key under REPEATABLE READ, and the docs list FK checking as one of the places gap locks persist even under READ COMMITTED. PostgreSQL takes a FOR KEY SHARE lock (added in 9.3 specifically to reduce FK lock contention vs. the older FOR SHARE). Hot parent rows are more contended under MySQL as a result.
- Row-lock visibility. PostgreSQL row-level locks don’t show up in pg_locks — per the docs, they’re stored on the tuple header on disk, not in shared memory. A process waiting for a row lock usually appears in pg_locks as waiting for the holder’s transaction ID, not the row. InnoDB’s performance_schema.data_locks exposes row-level lock state directly. More on this in Part 2.
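A sketch of what the visibility difference looks like in practice; both queries use documented views, and pg_blocking_pids (PostgreSQL 9.6+) is the shortcut for the wait relationship:

```sql
-- MySQL 8.0: row-level lock state is exposed directly.
SELECT engine_transaction_id, object_name, index_name,
       lock_type, lock_mode, lock_status, lock_data
  FROM performance_schema.data_locks;

-- PostgreSQL: the row locks themselves live on tuple headers, so look at
-- who is waiting instead of at the locks.
SELECT pid, pg_blocking_pids(pid) AS blocked_by, wait_event_type, query
  FROM pg_stat_activity
 WHERE wait_event_type = 'Lock';
```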
Neither engine is “better.” The behaviors are different, and code that assumes one can deadlock mysteriously when moved to the other.
What’s in Part 2
The patterns are the first half. Turning them into working systems takes a different set of skills: reading the deadlock log to identify which pattern is firing, building retry logic that doesn’t mask the real bug, isolating hot rows before they become incident reports, and choosing the right tool for each — NOWAIT, SKIP LOCKED, advisory locks, counter tables, or the isolation-level change that eliminates the category entirely. PostgreSQL’s SERIALIZABLE/SSI produces serialization failures that look like deadlocks but aren’t — the difference matters for retry architecture. AUTO_INCREMENT and sequence-related locking have their own failure modes. DDL migrations on both engines introduce lock queues that manifest as deadlock-like incidents.
All of that is in Database Deadlocks, Part 2: Diagnosis, Retries, and Prevention.
Mental model for the patterns
The useful reframe is: deadlocks are what consistent concurrency control does when two transactions make the engine choose between them. The database isn’t misbehaving — it’s refusing to let both of two contradictory orderings win. The error in the application log is a notification, not a fault.
That makes the diagnostic question concrete. Which pattern is firing? Every deadlock in production fits one of the shapes above — lock-ordering cycle, gap lock on a range, duplicate-key shared lock, FK shared lock on a hot parent, unindexed predicate lock amplification, worker-pool race, or replication-apply cycle. Identifying the pattern from the deadlock log narrows the fix enormously. “Two transactions deadlocked, retry the transaction” is true but useless. “Two workers took locks on the same jobs queue in different orders — switch to SKIP LOCKED” is a fix. The work is in the identification.
Part 2 picks up there: how to read the log, how to design retries that don’t hide the pattern, and how to remove the categories that matter most from the workload entirely.