Page 2 - EXPLAIN ANALYZE

Query Performance Database Architecture

It's Almost Always the Queries, Part III: When the CPU Is Pegged

Forty analysts with an auto-refreshing dashboard, twelve microservices each polling a status badge, a connection pool sized four times past the core count. None of it trips the slow-query log, and all of it lands on the same cores.

The Paradox of the Fast Engineer

In METR's controlled study, sixteen experienced developers estimated that AI had made them 20% faster; the stopwatch measured them 19% slower. The judgment that closes that gap is the deposit the agent now skips.

Query Performance Database Architecture

It's Almost Always the Queries, Part II: Troubleshooting Steps

A runbook for software engineers who don't troubleshoot database performance often. What to check first, what the output means, and how to narrow down the cause before you act on it.

Query Performance Database Architecture

It's Almost Always the Queries, Part I: Why Metal Doesn't Help

A team adds a third read replica because the primary is at 95% CPU. Lag gets worse. The slow-query log is empty. The first stop is `pg_stat_statements`, not the cloud console. This is the framing for the next five posts in the series.

AI System Design Database Architecture

Exposing Data to an Agent: MCP vs API

An MCP server that exposes query(sql) against the prod replica is a wire, not a policy. The model writes whatever SQL the corruption floor produces, the rows it returns land in the LLM provider's trace storage, and the audit log shows one tool call from one service account with no prompt attached.

AI Platform Engineering

The 10x Is Real, on Internal Tools You'd Otherwise Never Ship

An AZ backup shipper that has sat in the toil backlog for two quarters lands in an afternoon. A diagnostic API in front of Redis the team always wanted but never staffed. The 10x claim collapses on the customer-facing PR, holds on the throwaway tool.

Engineering Management Team Process AI

Why You Should Use an Agent to Assist with Quarterly Reviews

An agent can read 12 months of 1:1 notes, 800 tickets, and 200 PRs in an hour. It will also confidently misattribute work to the wrong report, quote a 1:1 line that was said by someone else, and harmonize three engineers' contributions into one name. Both facts are true. The article is about using the first one without shipping the second.

Database Architecture AI Cost Optimization

Hidden Database Costs of an AI Rollout: Storage, CPU, Memory, and Cache

A team builds an HNSW index on 17 million 1536-dimensional vectors. Forty-eight CPUs, 192 GB of RAM, two hours in, the connection drops. The four cost categories the obvious capacity plan misses are storage, build-time CPU, the memory cliff, and the buffer cache the OLTP traffic used to share.

Your Alert Triage Doesn't Need an Autonomous Agent

A team's AI ops agent has access to logs, metrics, deploys, traces. Six months in, MTTR is unchanged and 'responder trusted the summary' shows up in three of the last ten post-mortems. More access did not make the summaries more correct.

AI System Design Security

If Your Guardrail Is a Prompt, You Don't Have a Guardrail

Eleven all-caps 'do not touch production' messages. An agent that touched it anyway. 1,206 dropped executive accounts. Adding a twelfth message would not have helped.