Linux for Database Admins: The Commands You Reach For at 3 AM

When a database goes slow at 3 AM, the fastest diagnosis often comes not from SQL but from a handful of Linux commands. This is the focused toolkit every DBA should have in muscle memory — enough to tell, in two minutes, whether the problem is the database or the server underneath it.

The Database Server Is Still a Linux Box

Half of "the database is slow" tickets are really "the server it runs on is starved." Before you tune a single Postgres parameter, rule out the operating system. Here is the minimal toolkit.

CPU and Load

uptime          # load average: 1, 5, 15 min. Compare to core count (nproc)
top -o %CPU     # live, sorted by CPU. Press '1' to see per-core
mpstat -P ALL 1 5   # per-core utilization, spot a single saturated core (without -P ALL you only get the average)

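A load average above your core count means tasks are queuing. On Linux the figure also counts tasks blocked in uninterruptible I/O, so a high number alone does not prove CPU starvation. If the wa (iowait) value in top's %Cpu(s) line is high, the bottleneck is disk, not CPU.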
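To make the core-count comparison concrete, here is a minimal sketch that divides the 1-minute load average by the number of cores. The helper name load_per_core is made up for illustration, and the live check assumes /proc/loadavg and nproc are available, as they are on typical Linux systems.

```shell
# load_per_core is a hypothetical helper, not a standard command.
# $1 = load average, $2 = core count; prints the ratio with two decimals.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Live check, skipped when /proc/loadavg or nproc is unavailable.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    load=$(cut -d ' ' -f 1 /proc/loadavg)
    echo "1-min load per core: $(load_per_core "$load" "$(nproc)")"
fi
```

A sustained ratio above 1.00 means tasks are waiting; check iowait before blaming the CPU.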

Memory and the OOM Killer

free -h         # used / free / available — watch 'available', not 'free'
dmesg -T | grep -i 'oom\|killed process'

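That dmesg line is the one that solves mysteries. If the kernel's OOM killer terminated a postgres backend, you will see it here: the database "crash" was actually the kernel killing a process to reclaim memory. The fix is usually to cut memory demand, for example by lowering work_mem or the number of connections (worst-case usage scales with work_mem × connections), rather than to repair the database itself.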
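The work_mem × connections arithmetic is easy to run before changing anything. The sketch below gives a rough floor, not a precise model: a single query can use several work_mem allocations (one per sort or hash node), and shared_buffers is not included. The function name and the sample figures are illustrative assumptions.

```shell
# work_mem_floor_mb is a hypothetical helper.
# $1 = work_mem in MB, $2 = connection count; prints the product in MB.
work_mem_floor_mb() {
    echo $(( $1 * $2 ))
}

# Example figures (assumed, not from a real server): 64 MB work_mem, 200 connections.
need_mb=$(work_mem_floor_mb 64 200)
echo "work_mem x connections: ${need_mb} MB"

# Compare against what the kernel reports as available, when /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    avail_mb=$(awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo)
    echo "MemAvailable: ${avail_mb} MB"
fi
```

If the first number is anywhere near the second, a burst of concurrent sorts could plausibly wake the OOM killer.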

Disk I/O — Usually the Real Culprit

iostat -xz 1    # %util near 100 = device busy; r_await / w_await = avg ms per read / write (older sysstat: await)
df -h           # is the data/WAL partition full?
du -sh /var/lib/postgresql/*/main/pg_wal   # is WAL ballooning?

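A nearly full or saturated disk explains more "database" incidents than any query. On SSDs and RAID arrays that serve requests in parallel, judge saturation by latency rather than by %util alone. A growing pg_wal directory often points at an inactive replication slot; a failing archive_command is the other usual suspect.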
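Checking for a nearly full partition can be scripted in a few lines. This sketch filters df -P output for filesystems at or above a threshold. The helper name mounts_over and the 90% threshold are assumptions, and mount points containing spaces are not handled.

```shell
# mounts_over is a hypothetical helper.
# Reads `df -P` output on stdin; $1 = threshold in percent.
# Prints "mountpoint usage%" for every filesystem at or above the threshold.
mounts_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "95%" -> "95"
        if (use + 0 >= limit) print $6, use "%"
    }'
}

# Live check: list filesystems that are at least 90% full.
df -P | mounts_over 90
```

No output means nothing is above the threshold, which rules out the simplest disk explanation.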

Connections and Ports

ss -Htn state established '( dport = :5432 or sport = :5432 )' | wc -l   # -H drops the header so it is not counted
ss -s           # socket summary, total established connections

Counting real established connections to port 5432 from the OS side cross-checks what pg_stat_activity tells you — useful when an app leaks connections.
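When a leak is suspected, grouping those connections by client address usually shows which host is responsible. The following is a sketch under a few assumptions: it runs on the database server, the peer address is the last column of ss output (true when -p is not used), and the helper name conns_per_client is invented for this example.

```shell
# conns_per_client is a hypothetical helper.
# Reads `ss -Htn` output on stdin and assumes the peer address:port is the
# last column. Prints "count address", busiest client first.
conns_per_client() {
    awk '{
        peer = $NF
        sub(/:[0-9]+$/, "", peer)      # strip the trailing :port
        count[peer]++
    }
    END { for (p in count) print count[p], p }' | sort -rn
}

# Live check on the database server, skipped when ss is not installed.
if command -v ss >/dev/null 2>&1; then
    ss -Htn state established '( sport = :5432 )' | conns_per_client
fi
```

A single address holding most of the connections is a strong hint about which application to look at first.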

Following the Logs

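journalctl -u 'postgresql*' -f --since "10 min ago"   # pattern also matches per-cluster units such as postgresql@<version>-main
tail -f /var/lib/postgresql/*/main/log/postgresql-*.log   # only with logging_collector on; Debian/Ubuntu packages log to /var/log/postgresql/ by default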
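During an incident it often helps to summarize the log before reading it line by line. The sketch below counts lines by severity, relying on the ERROR:, FATAL:, and PANIC: tags that PostgreSQL writes after the log line prefix. The helper name severity_counts is hypothetical, and the unit pattern in the live check is an assumption about how the service is named on your system.

```shell
# severity_counts is a hypothetical helper.
# Reads PostgreSQL log lines on stdin and prints "SEVERITY count",
# most severe first, for the severities that actually occur.
severity_counts() {
    awk '
        match($0, /(ERROR|FATAL|PANIC):/) {
            sev = substr($0, RSTART, RLENGTH - 1)   # drop the trailing colon
            count[sev]++
        }
        END {
            n = split("PANIC FATAL ERROR", order, " ")
            for (i = 1; i <= n; i++)
                if (order[i] in count) print order[i], count[order[i]]
        }'
}

# Live check against the journal, skipped when journalctl is not installed.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -u 'postgresql*' --since "10 min ago" --no-pager 2>/dev/null | severity_counts
fi
```

The same function works on a log file: redirect the file into severity_counts instead of piping from journalctl.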

From Manual Checks to Continuous Visibility

These commands are perfect for a live incident, but running them by hand means you only look after something breaks. PG Monitoring collects the same host signals — CPU, memory, disk I/O, and OOM events — alongside database metrics and correlates them, so when a query slows down you immediately see whether the cause is the database or the box underneath it.
