The Database Server Is Still a Linux Box
Half of "the database is slow" tickets are really "the server it runs on is starved." Before you tune a single Postgres parameter, rule out the operating system. Here is the minimal toolkit.
CPU and Load
uptime # load average: 1, 5, 15 min. Compare to core count (nproc)
top -o %CPU # live, sorted by CPU. Press '1' to see per-core
mpstat 1 5 # per-core utilization, spot a single saturated core
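A load average that stays above your core count means tasks are queuing. On Linux the load average also counts tasks in uninterruptible sleep, which usually means waiting on disk, so check the wa (iowait) figure in top: if it is high, the bottleneck is disk, not CPU.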
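The comparison of load average against core count can be scripted. The following is a minimal sketch, assuming a Linux host with /proc/loadavg and nproc available; the function names load_per_core and classify_load and the 1.0 cutoff are illustrative choices, not a standard.

```shell
#!/bin/sh
# load_per_core LOAD CORES: print LOAD divided by CORES with two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# classify_load LOAD CORES: print "queuing" when load exceeds the core count,
# otherwise "ok". The 1.0 ratio is an assumed rule of thumb.
classify_load() {
    awk -v load="$1" -v cores="$2" \
        'BEGIN { if (load / cores > 1.0) print "queuing"; else print "ok" }'
}

# Live check: the first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 rest < /proc/loadavg
    cores=$(nproc)
    echo "load1=$load1 cores=$cores ratio=$(load_per_core "$load1" "$cores") status=$(classify_load "$load1" "$cores")"
fi
```

A "queuing" result is a prompt to look at top and mpstat, not a diagnosis: the load figure alone does not say whether tasks are waiting on CPU or on disk.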
Memory and the OOM Killer
free -h # used / free / available — watch 'available', not 'free'
dmesg -T | grep -iE 'oom|killed process' # may need sudo if kernel.dmesg_restrict=1
That dmesg line is the one that solves mysteries. If the kernel's OOM killer terminated a postgres backend, you will see it here: the database "crash" was actually the kernel killing a process to reclaim memory. The fix is usually to lower memory demand, roughly work_mem × connections, rather than to repair the database itself.
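To make the work_mem × connections figure concrete, here is a rough sketch that multiplies the two and compares the result with the memory the kernel reports as available. The values 64 (MB) and 200 (connections) are hypothetical, and the function names are illustrative. The product is only an approximation, since a single query can use work_mem more than once (for example, one allocation per sort or hash step).

```shell
#!/bin/sh
# work_mem_total_mb WORK_MEM_MB CONNECTIONS: print the product in MB.
work_mem_total_mb() {
    echo $(( $1 * $2 ))
}

# fits_in_memory NEEDED_MB AVAILABLE_MB: succeed when NEEDED_MB <= AVAILABLE_MB.
fits_in_memory() {
    [ "$1" -le "$2" ]
}

# Hypothetical settings: work_mem = 64 MB, 200 connections.
needed=$(work_mem_total_mb 64 200)
echo "work_mem x connections = ${needed} MB"

# MemAvailable is reported in kB in /proc/meminfo on Linux; convert to MB.
if [ -r /proc/meminfo ]; then
    avail_mb=$(awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo)
    if [ -n "$avail_mb" ]; then
        if fits_in_memory "$needed" "$avail_mb"; then
            echo "fits: ${avail_mb} MB available"
        else
            echo "at risk: only ${avail_mb} MB available"
        fi
    fi
fi
```

Substitute the real values from SHOW work_mem and SHOW max_connections; the estimate ignores shared_buffers and other memory consumers on the host.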
Disk I/O — Usually the Real Culprit
iostat -xz 1 # %util near 100 = busy disk (SSD/NVMe can still have headroom); await (r_await/w_await in newer sysstat) = latency per I/O
df -h # is the data/WAL partition full?
du -sh /var/lib/postgresql/*/main/pg_wal # is WAL ballooning?
A nearly full or 100%-utilized disk explains more "database" incidents than any query. A growing pg_wal directory usually means something is holding WAL back, most often an inactive replication slot or a failing archive_command.
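The "is the partition full?" check is easy to turn into a threshold test. This is a minimal sketch, assuming POSIX df output (df -P); CHECK_PATH, LIMIT, and the 90 percent default are hypothetical names and values chosen for illustration.

```shell
#!/bin/sh
# usage_pct PATH: print the used-space percentage (without the % sign)
# of the filesystem that holds PATH, taken from column 5 of `df -P`.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold PCT LIMIT: succeed when PCT is at or above LIMIT.
over_threshold() {
    [ "$1" -ge "$2" ]
}

CHECK_PATH=${CHECK_PATH:-/}   # hypothetical: point this at the data or WAL mount
LIMIT=${LIMIT:-90}            # assumed warning threshold, in percent

pct=$(usage_pct "$CHECK_PATH")
case "$pct" in
    ''|*[!0-9]*)
        # df printed something non-numeric (for example "-"); do not guess.
        echo "could not read usage for $CHECK_PATH"
        ;;
    *)
        if over_threshold "$pct" "$LIMIT"; then
            echo "WARNING: $CHECK_PATH is ${pct}% full"
        else
            echo "ok: $CHECK_PATH is ${pct}% full"
        fi
        ;;
esac
```

Run it once per mount point when data and WAL live on separate partitions, for example with CHECK_PATH set to the directory that holds pg_wal.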
Connections and Ports
ss -Htn state established '( dport = :5432 or sport = :5432 )' | wc -l # -H drops the header line so it is not counted
ss -s # socket summary, total established connections
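Counting established connections on port 5432 from the OS side cross-checks what pg_stat_activity reports, which is useful when an app leaks connections. Connections made over a Unix socket do not appear here, and loopback TCP connections are counted twice because both ends live on the same host.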
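When the OS-side count looks too high, grouping the connections by client address shows which host holds them. The sketch below assumes the ss column layout used when a single state filter is given (Recv-Q, Send-Q, local address:port, peer address:port) and an ss version that supports -H; the function name peers_by_count is illustrative.

```shell
#!/bin/sh
# peers_by_count: read `ss -Htn state established` lines on stdin and print
# "COUNT PEER_ADDRESS" per client, busiest first.
# Assumes field 4 is the peer address:port, as in the layout described above.
peers_by_count() {
    awk '{
        peer = $4
        sub(/:[0-9]+$/, "", peer)   # strip the trailing :port
        n[peer]++
    }
    END { for (p in n) print n[p], p }' | sort -rn
}

# Live usage on the database host: sockets whose local port is 5432.
if command -v ss >/dev/null 2>&1; then
    ss -Htn state established '( sport = :5432 )' | peers_by_count
fi
```

A single client address with far more connections than its pool size allows is a likely candidate for the leak; compare it with client_addr in pg_stat_activity.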
Following the Logs
journalctl -u postgresql -f --since "10 min ago"
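tail -f /var/lib/postgresql/*/main/log/postgresql-*.log # when logging_collector writes to log/ in the data directory
tail -f /var/log/postgresql/postgresql-*.log # Debian/Ubuntu package default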
From Manual Checks to Continuous Visibility
These commands are perfect for a live incident, but running them by hand means you only look after something breaks. PG Monitoring collects the same host signals — CPU, memory, disk I/O, and OOM events — alongside database metrics and correlates them, so when a query slows down you immediately see whether the cause is the database or the box underneath it.