Your Alert Triage Doesn't Need an Autonomous Agent

A team's AI ops agent has access to logs, metrics, deploys, traces. Six months in, MTTR is unchanged and 'responder trusted the summary' shows up in three of the last ten post-mortems. More access did not make the summaries more correct.

Corruption Is a Feature, Not a Bug: Why LLMs Corrupt by Design

Try making Claude Code stop saying 'honestly'. Put it in CLAUDE.md, write a skill, set a system reminder, drop it into project memory - the model still says 'honestly'. That's the smallest reproducible demonstration that the prompt sits on top of a system you cannot actually override.

Designing Partitioning You Don't Have to Babysit

Six months in, p_future holds 800M rows because the growth projection didn't survive the workload, and every ALTER to fix it needs a maintenance window nobody wants to schedule. The boundary management is two lines of DDL; the harder part is picking a partition key that doesn't leak into application code.

Testing Your Database, Part 2: What to Test, and How

Squawk catches the locking ALTER. pgTAP catches the missing UNIQUE. Testcontainers with a prod-shaped snapshot catches the migration that takes 40 minutes against real volume. Five categories of database test, each invisible to the others, each addressed by tools that have been stable for years and that almost no team has assembled into one suite.