EXPLAIN vs EXPLAIN ANALYZE
EXPLAIN prints the plan the planner intends to use, with estimated costs and row counts — it does not run the query. EXPLAIN ANALYZE actually executes it and adds the real timings and row counts, so you can compare expectation against reality.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE customer_id = 42;
Always add BUFFERS — it reveals how much data came from cache (shared hit) versus disk (read), which is often the real story behind a slow query.
Warning: ANALYZE runs the statement. For UPDATE/DELETE, wrap it in a transaction and ROLLBACK.
The Single Most Important Number: Estimated vs Actual Rows
Seq Scan on orders (cost=0.00..18334.00 rows=12 width=64)
(actual time=0.011..52.3 rows=48000 loops=1)
The planner expected 12 rows; 48,000 came back. That 4,000× misestimate is the root cause of most bad plans — PostgreSQL chose a strategy that only makes sense for a handful of rows. Fix it by running ANALYZE orders; to refresh statistics, increasing the statistics target, or rewriting a condition the planner cannot estimate (functions on columns, correlated predicates).
Scan Types, From Best to Worst
- Index Only Scan — answered entirely from the index. Fastest.
- Index Scan — index finds rows, then fetches from the heap.
- Bitmap Heap Scan — good for medium selectivity; gathers matches then reads the heap in physical order.
- Seq Scan — reads the whole table. Correct for small tables or low-selectivity filters; a red flag on a large table with a selective
WHERE.
Three Patterns That Mean Trouble
1. Sort spilling to disk
Sort (actual rows=2000000 ...)
Sort Method: external merge Disk: 168000kB
external merge … Disk means work_mem was too small and the sort spilled. Either raise work_mem for the session or add an index that provides the rows pre-sorted.
2. Nested Loop with a big inner side
A Nested Loop is great when the outer side is tiny. If loops= is large and each iteration scans many rows, the planner mis-estimated — a Hash Join or Merge Join would be far cheaper. Fix the estimate and the join method usually fixes itself.
3. Rows Removed by Filter
Seq Scan on events
Filter: (status = 'error')
Rows Removed by Filter: 9913000
Reading 10M rows to keep 87,000 is a textbook case for a (possibly partial) index on status.
A Practical Reading Order
- Find the node with the largest
actual timegap — read plans inside-out, deepest node first. - Compare estimated vs actual rows at that node. Big gap? Fix statistics first.
- Check for
Disksorts and highRows Removed by Filter. - Only then consider adding an index — and re-run
EXPLAIN ANALYZEto confirm it is used.
Reading Plans You Never Ran
The hard part in production is that the slow query already finished — by the time you copy it into EXPLAIN ANALYZE, conditions have changed. PG Monitoring captures the actual plan for every slow execution as it happens and tracks plan changes over time, so you see the exact plan that was slow, not a reconstruction.