Maintenance

Bloat Detection and Autovacuum Optimization

PG Monitoring Team · March 28, 2026 · 6 min read

The Hidden Performance Killer: Table Bloat

PostgreSQL's MVCC architecture creates table bloat as a natural byproduct. Every UPDATE creates a new row version, and dead tuples accumulate until vacuum removes them. The problem?

By the time you notice bloat, your performance has already degraded 30-50%.
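
A quick way to see the mechanism for yourself (the demo table and row count below are illustrative, not from a real workload):

-- Each UPDATE leaves the previous row version behind as a dead tuple until
-- (auto)vacuum reclaims the space.
CREATE TABLE mvcc_demo (id int PRIMARY KEY, val text);
INSERT INTO mvcc_demo SELECT g, 'x' FROM generate_series(1, 100000) g;
UPDATE mvcc_demo SET val = 'y';   -- rewrites all 100,000 rows

-- The statistics views are updated asynchronously, so the counts can lag briefly.
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'mvcc_demo';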

Traditional Bloat Detection (Manual)

Method 1: pgstattuple Extension

CREATE EXTENSION pgstattuple;
SELECT * FROM pgstattuple('big_table');

 table_len | tuple_count | tuple_len | dead_tuple_count | free_space | free_percent 
-----------+-------------+-----------+------------------+------------+--------------
  2147483648 |     5000000 |  850000000 |          1200000 |  450000000 |         20.9

Problems:

  • Requires superuser or pg_stat_scan_tables membership
  • Scans the entire table (heavy I/O on large production tables)
  • Manual execution only
  • No historical tracking

Method 2: pg_stat_user_tables

SELECT schemaname, relname, n_live_tup, n_dead_tup,
       round(n_dead_tup::numeric/nullif(n_live_tup,0)*100, 2) as dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;

Problems:

  • Only shows tuple counts, not space usage
  • No percentage of table size
  • Stats may be stale
  • Doesn't show which tables need immediate attention

Method 3: Periodic VACUUM FULL

Some DBAs simply schedule a weekly VACUUM FULL on every table (a minimal sketch follows the list below):

  • Requires exclusive table locks (downtime!)
  • Wastes time on tables that don't need it
  • Misses tables that bloat faster than weekly
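
For reference, the blanket approach amounts to statements like these, typically fired from cron or pg_cron on a weekly schedule (table names are illustrative):

VACUUM FULL ANALYZE orders;   -- rewrites the table, holding an ACCESS EXCLUSIVE lock
VACUUM FULL ANALYZE events;   -- blocks both reads and writes for the duration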

PG Monitoring's Automated Bloat Detection

1. Continuous Bloat Estimation

Uses the lightweight pgstattuple_approx function when available, falling back to statistical estimation otherwise:

Table: orders
Live tuples: 4,200,000
Dead tuples: 1,847,000 (30.5% of table)
Bloat ratio: 43% (table is 43% bigger than necessary)
Trend: +2.3% per day
Prediction: Will reach 60% bloat in 7 days

Recommendation: Schedule VACUUM during next maintenance window
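
If you want the same approximation by hand, the pgstattuple extension ships pgstattuple_approx (PostgreSQL 9.5+), which skips pages marked all-visible in the visibility map instead of scanning the whole table (the table name is illustrative):

SELECT approx_tuple_count,
       dead_tuple_count,
       dead_tuple_percent,
       approx_free_percent
FROM pgstattuple_approx('orders');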

2. Autovacuum Effectiveness Analysis

Not just bloat detection - understanding WHY bloat is happening:

Table: events
Bloat growth rate: 5% per day
Autovacuum last run: 2 days ago
Autovacuum threshold: 50 + 0.1 × 10M rows = 1,000,050 dead tuples
Current dead tuples: 847,000 (below threshold)

Problem: Threshold too high for high-churn table
Recommendation: Lower autovacuum_vacuum_scale_factor
                from 0.1 to 0.05 for this table
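
The trigger-point arithmetic above can be reproduced for every table straight from the catalogs. A minimal sketch using the instance-wide settings (per-table reloption overrides are ignored here for brevity):

SELECT s.relname,
       s.n_dead_tup,
       round(current_setting('autovacuum_vacuum_threshold')::numeric
             + current_setting('autovacuum_vacuum_scale_factor')::numeric
               * c.reltuples::numeric) AS autovacuum_trigger_point
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE c.reltuples >= 0   -- reltuples is -1 for never-analyzed tables on PostgreSQL 14+
ORDER BY s.n_dead_tup DESC
LIMIT 20;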

3. Vacuum Queue Optimization

Prioritizes which tables to vacuum first:

Priority Queue:
1. orders (URGENT): 43% bloat, most frequently scanned
2. events (HIGH): 38% bloat, growing fast
3. logs (MEDIUM): 25% bloat, append-only pattern

Estimated recovery: 2.3GB disk space, 15% query speedup
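
A rough stand-in for this queue can be built from the statistics views by weighting the dead-tuple ratio against how often the table is read. The scoring below is an illustrative assumption, not PG Monitoring's actual model:

SELECT relname,
       pg_size_pretty(pg_relation_size(relid)) AS table_size,
       n_dead_tup,
       round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
       seq_scan + coalesce(idx_scan, 0) AS total_scans
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY (n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0))
         * (seq_scan + coalesce(idx_scan, 0) + 1) DESC NULLS LAST;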

Feature Comparison

Feature                | pgstattuple | pg_stat_tables | Manual Scripts | PG Monitoring
-----------------------+-------------+----------------+----------------+--------------
Continuous monitoring  | No          | Stats only     | No             | Yes
Space usage estimation | Yes         | No             | Partial        | Yes
Bloat percentage       | Manual      | Partial        | Manual         | Yes
Trend tracking         | No          | No             | No             | Yes
Autovacuum analysis    | No          | Partial        | No             | Yes
Priority queue         | No          | No             | No             | Yes
Predictive alerts      | No          | No             | No             | Yes

Case Study: The 2 AM Slowdown Mystery

An e-commerce platform had mysterious 2 AM performance degradation every few days.

Traditional monitoring: Showed high I/O at 2 AM, but couldn't explain why.

PG Monitoring bloat analysis revealed:

Pattern:
- Daily batch job updates 500K order records at 01:30
- Autovacuum couldn't keep up (threshold too high)
- By day 3, orders table reached 60% bloat
- Sequential scans became 3x slower
- Index-only scans fell back to heap fetches (visibility map out of date)

Cascade effect:
- Slow queries → connection pool exhaustion
- Application timeouts → retry storms
- Database overload at 02:00

Solution: Adjusted autovacuum_vacuum_scale_factor from 0.1 to 0.05 for the high-churn tables and raised autovacuum_max_workers so the orders table was no longer waiting behind other relations.
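
Expressed as configuration, the fix looks roughly like this (the worker count is an illustrative assumption; the case study specifies only the scale-factor change):

ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05);  -- per-table reloption override
ALTER SYSTEM SET autovacuum_max_workers = 6;                     -- instance-wide; needs a restart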

Result: Bloat stays under 15%, 2 AM slowdown eliminated.

The Bottom Line

Traditional approach: "Run this query to check bloat... hmm, orders looks bad... should I vacuum it?"

PG Monitoring: "Orders table bloat at 43% and trending +2.3% daily. Will reach critical 60% in 7 days. Autovacuum threshold too high for this 10M row table. Recommend scale_factor 0.1 → 0.05. Estimated benefit: 2.3GB disk + 15% query speedup."
