SQL

ROW_NUMBER() in PostgreSQL: Pagination, Deduplication and Top-N per Group

PG Monitoring Team May 28, 2026 7 min read

If you have ever written a subquery that fetches "the latest row per group" and felt it was harder than it should be, ROW_NUMBER() is the tool you were missing. It is the most-used window function in PostgreSQL, and once it clicks, a whole class of reporting and data-cleanup problems becomes a single readable query.

What ROW_NUMBER() Does

ROW_NUMBER() assigns a unique, gap-free integer to every row in the result set, in the order defined by the ORDER BY inside the OVER() clause. The crucial difference from GROUP BY: a window function does not collapse rows. Aggregation with GROUP BY turns ten order rows for a customer into one summary row. A window function keeps all ten rows and adds a column — here, a sequence number — computed over a "window" of related rows. You keep the detail and get the calculation.

Syntax

ROW_NUMBER() OVER (
  [PARTITION BY col_or_expr]   -- optional: restart numbering per group
  ORDER BY col_or_expr [ASC | DESC]   -- required: defines the order
)

Two clauses live inside OVER():

  • ORDER BY — mandatory for ROW_NUMBER(). It defines the order in which rows are numbered. Without it the numbering would be non-deterministic.
  • PARTITION BY — optional. It splits the rows into groups and restarts the counter at 1 for each group. Think of it as "GROUP BY for the window," but the rows survive.

Gotcha: if two rows tie on the ORDER BY value, ROW_NUMBER() still gives them different numbers — but which row gets the lower number is arbitrary and can change between runs. If you need a stable, repeatable result, add a tiebreaker column (usually the primary key): ORDER BY created_at DESC, id DESC.

A First Example

SELECT
  id,
  customer_id,
  created_at,
  ROW_NUMBER() OVER (ORDER BY created_at DESC) AS rn
FROM orders;

Every order gets a number from 1 (most recent) to N (oldest). One window, whole table.

PARTITION BY: Numbering Within Groups

The real power shows up with PARTITION BY. The counter resets to 1 at the start of every partition, so you number rows independently per customer, per day, per category:

SELECT
  customer_id,
  id AS order_id,
  created_at,
  ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY created_at DESC
  ) AS order_rank
FROM orders;

Now order_rank = 1 is the most recent order for each customer. Customer 7's newest order and customer 42's newest order both get rank 1, in their own partitions.

Use Case 1: Latest Record Per Group (the "greatest-N-per-group" problem)

"Get the most recent order for every customer" is one of the most common — and historically awkward — reporting requests. The classic alternatives are a correlated subquery (slow) or a DISTINCT ON (Postgres-specific). ROW_NUMBER() is the portable, readable answer: number rows within each customer, then keep number 1.

SELECT *
FROM (
  SELECT
    o.*,
    ROW_NUMBER() OVER (
      PARTITION BY customer_id
      ORDER BY created_at DESC
    ) AS rn
  FROM orders o
) ranked
WHERE rn = 1;

Change WHERE rn = 1 to WHERE rn <= 3 and you have "the 3 most recent orders per customer" — a top-N-per-group query — with no extra machinery.

Tip: for the single-latest-row case specifically, PostgreSQL's DISTINCT ON (customer_id) ... ORDER BY customer_id, created_at DESC is often slightly faster and shorter. Reach for ROW_NUMBER() when you need top-N (N > 1), need the rank value itself, or want SQL that ports to other databases.

Use Case 2: Deduplication

Found duplicate rows that should have been unique — say, two user records with the same email? Number the duplicates within each email group and delete everything except the first:

DELETE FROM users u
USING (
  SELECT id,
         ROW_NUMBER() OVER (
           PARTITION BY lower(email)
           ORDER BY created_at   -- keep the oldest; flip to DESC to keep newest
         ) AS rn
  FROM users
) d
WHERE u.id = d.id AND d.rn > 1;

The ORDER BY decides which duplicate survives. Ordering by created_at keeps the original account; using DESC keeps the most recent. Always run the inner SELECT first and eyeball the rows flagged rn > 1 before you commit the DELETE.

Use Case 3: Pagination — and Why OFFSET Hurts

ROW_NUMBER() can drive page-based pagination (WHERE rn BETWEEN 41 AND 60), and it makes the ordering explicit. But on large tables, both ROW_NUMBER() windows and plain OFFSET 100000 LIMIT 20 share the same flaw: the database must produce and walk past the first 100,000 rows just to throw them away. Page 5,000 gets dramatically slower than page 1.

For deep pagination, prefer keyset (cursor) pagination — remember the last value you showed and ask for the next slice directly:

-- Instead of OFFSET, carry the last seen sort key:
SELECT *
FROM orders
WHERE created_at < :last_seen_created_at
ORDER BY created_at DESC
LIMIT 20;

Backed by an index on created_at, every page costs the same — the database jumps straight to the cursor instead of counting from the top.

Performance: Make the Window Use an Index

A window function's PARTITION BY + ORDER BY requires the rows to arrive sorted. PostgreSQL can satisfy that from an index instead of sorting at runtime. A composite index that matches the window lets it skip the Sort entirely:

CREATE INDEX idx_orders_cust_created
  ON orders (customer_id, created_at DESC);

Now confirm it worked with EXPLAIN (ANALYZE, BUFFERS): you want to see a WindowAgg node fed directly by an Index Scan, with no Sort above it. If a Sort appears — especially one that reports Sort Method: external merge Disk — the window is sorting on disk and the query will get slower as the table grows.

From Query to Production

Window functions like ROW_NUMBER() are cheap on paper but degrade silently: the table grows, the supporting index stops being used, and the WindowAgg quietly starts spilling its sort to disk. Nothing errors — the report just gets slower week by week. PG Monitoring tracks each query's execution plan over time and alerts you the moment that Sort goes to disk or the plan flips off the index, so you fix it before users notice.

Related Articles

Ready to experience better PostgreSQL monitoring?

Join thousands of teams who switched from traditional tools to PG Monitoring's AI-powered platform.

Talk to us