
Linux for Database Admins: The Commands You Reach For at 3 AM

PG Monitoring Team, May 12, 2026

The Database Server Is Still a Linux Box

Half of "the database is slow" tickets are really "the server it runs on is starved." Before you tune a single Postgres parameter, rule out the operating system. Here is the minimal toolkit.

CPU and Load

uptime          # load average: 1, 5, 15 min. Compare to core count (nproc)
top -o %CPU     # live, sorted by CPU. Press '1' to see per-core
mpstat 1 5      # per-core utilization, spot a single saturated core

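A load average above your core count usually means tasks are queuing for CPU. On Linux the load average also counts tasks in uninterruptible sleep (typically waiting on disk), so check the wa (iowait) value in top: if it is high, the bottleneck is disk, not CPU.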
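The comparison above can be scripted. This is a minimal sketch, assuming a Linux host where /proc/loadavg and nproc are available; load_per_core is a hypothetical helper name, not a standard tool.

```shell
# Hypothetical helper: divide a load average by a core count.
load_per_core() {
    # $1 = load average, $2 = number of cores
    awk -v l="$1" -v c="$2" 'BEGIN { printf "%.2f\n", l / c }'
}

# Read the 1-minute load average and core count from the running system.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    load1=$(cut -d ' ' -f1 /proc/loadavg)
    cores=$(nproc)
    echo "load1=$load1 cores=$cores per_core=$(load_per_core "$load1" "$cores")"
fi
```

A per-core value that stays above 1.00 suggests work is queuing; whether it is queuing for CPU or for disk is what the wa figure in top helps decide.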

Memory and the OOM Killer

free -h         # used / free / available — watch 'available', not 'free'
dmesg -T | grep -i 'oom\|killed process'   # OOM-killer events with timestamps; may need sudo

That dmesg line is the one that solves mysteries. If the kernel's OOM killer terminated a postgres backend, you will see it here: the database "crash" was actually the kernel killing a process to reclaim memory. The fix is usually to reduce memory demand (roughly work_mem × connections), not to hunt for a bug in the database itself.
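To make the work_mem × connections arithmetic concrete, here is a rough sketch. The helper name work_mem_budget_mb and the values 64 MB and 200 connections are illustrative assumptions, not recommendations. A single query can use several work_mem allocations (one per sort or hash node), so treat the result as a floor rather than a ceiling.

```shell
# Hypothetical helper: memory that sorts and hashes could claim if every
# connection used exactly one work_mem allocation at the same time.
work_mem_budget_mb() {
    # $1 = work_mem in MB, $2 = number of connections
    echo $(( $1 * $2 ))
}

# Illustrative values only. Read the real ones in psql with
# SHOW work_mem; and SHOW max_connections;
work_mem_mb=64
connections=200
budget=$(work_mem_budget_mb "$work_mem_mb" "$connections")
echo "work_mem budget: ${budget} MB"

# Compare against physical RAM where /proc/meminfo is available (Linux).
if [ -r /proc/meminfo ]; then
    total_mb=$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 }' /proc/meminfo)
    echo "physical RAM:    ${total_mb} MB"
    if [ "$budget" -gt "$total_mb" ]; then
        echo "warning: work_mem x connections exceeds RAM"
    fi
fi
```

With the sample values the budget is 12800 MB, which already exceeds the RAM of many database hosts before shared memory is counted.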

Disk I/O — Usually the Real Culprit

iostat -xz 1    # %util near 100 = saturated disk; await (r_await/w_await on newer sysstat) = avg latency per I/O in ms
df -h           # is the data/WAL partition full?
du -sh /var/lib/postgresql/*/main/pg_wal   # is WAL ballooning?

A nearly full or 100%-utilized disk explains more "database" incidents than any query. A growing pg_wal directory most often points to an inactive replication slot; a failing archive_command is the other common cause.
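Both checks can be wrapped in small helpers. This is a sketch under stated assumptions: disk_used_pct and list_replication_slots are hypothetical names, the 90% threshold is a placeholder, and the slot query assumes PostgreSQL 10 or later, run on the primary, with psql able to connect using its defaults.

```shell
# Hypothetical helper: print the used percentage (0-100) of the filesystem
# that holds the given path. df -P is the POSIX single-line output format.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: list replication slots with the WAL each one retains.
# Assumes PostgreSQL 10+, run on the primary, psql connecting with defaults.
list_replication_slots() {
    psql -X -At -F ' | ' -c "
        SELECT slot_name, active,
               pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
        FROM pg_replication_slots
        ORDER BY restart_lsn;"
}

# Example: warn when the filesystem holding / is more than 90% full.
# Point the path at the data or WAL mount on a real server.
pct=$(disk_used_pct /)
if [ -n "$pct" ] && [ "$pct" -gt 90 ]; then
    echo "warning: filesystem is ${pct}% full"
fi
```

Calling list_replication_slots on the primary prints one line per slot. A slot that shows active = f with a large retained size is the usual suspect when pg_wal keeps growing.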

Connections and Ports

ss -Htn state established '( sport = :5432 )' | wc -l   # inbound TCP connections; -H drops the header so the count is exact
ss -s           # socket summary, total established connections

Counting established TCP connections to port 5432 from the OS side cross-checks what pg_stat_activity tells you, which is useful when an app leaks connections. Local clients that connect over the Unix socket do not appear in this count.
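When the count looks too high, grouping it by client address usually shows which app is leaking. A minimal sketch, assuming ss is run without -p so the peer address is the last column; count_by_peer is a hypothetical helper name.

```shell
# Hypothetical helper: read ss output on stdin and print
# "<count> <peer address>" sorted by count, highest first.
count_by_peer() {
    awk 'NF { peer = $NF; sub(/:[0-9]+$/, "", peer); n[peer]++ }
         END { for (p in n) print n[p], p }' | sort -rn
}

# Live usage on the database server (skipped when ss is not installed).
if command -v ss >/dev/null 2>&1; then
    ss -Htn state established '( sport = :5432 )' 2>/dev/null | count_by_peer || true
fi
```

Fed three sample lines, two of them from 10.0.0.5, the helper prints `2 10.0.0.5` first, so the noisiest client is always at the top.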

Following the Logs

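journalctl -u postgresql -f --since "10 min ago"   # unit name varies, e.g. postgresql@16-main on Debian/Ubuntu
tail -f /var/lib/postgresql/*/main/log/postgresql-*.log   # logging_collector path; Debian/Ubuntu default is /var/log/postgresql/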

From Manual Checks to Continuous Visibility

These commands are perfect for a live incident, but running them by hand means you only look after something breaks. PG Monitoring collects the same host signals — CPU, memory, disk I/O, and OOM events — alongside database metrics and correlates them, so when a query slows down you immediately see whether the cause is the database or the box underneath it.
