Postgres Connection Pool Exhausted Under Load

Postgres throws 'remaining connection slots reserved' under traffic. Fix it by sizing the pool, putting PgBouncer in transaction mode in front, and killing idle-in-transaction connections.

Published: May 23, 2026 Updated: Jun 18, 2026 Author: AI Productivity Guide Team

Your app handles 200 RPS comfortably, then traffic doubles and you start seeing 'FATAL: remaining connection slots are reserved for non-replication superuser connections' (worded 'reserved for roles with the SUPERUSER attribute' on Postgres 16 and newer) or 'FATAL: sorry, too many clients already'. Postgres has hit max_connections. New requests fail or hang waiting for a free slot, and existing requests stall when their connection turns out to be dead.

Fastest fix: put a pooler (PgBouncer in transaction mode, or your provider’s managed equivalent) in front of Postgres so a small number of real backends serves thousands of app connections, then shrink each app instance’s pool so instances * pool_size stays under ~70 percent of max_connections. The rest of this guide finds the specific bucket you’re in and stops it from recurring.

Which bucket are you in?

Run this once and the result usually points straight at the cause:

SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;

What you see	Most likely cause	Jump to
Total near `max_connections`, mostly `idle`	App pools too large / no pooler	Step 1, Step 2
High `idle in transaction`	Missing commit/rollback (connection leak)	Step 3
Several `active` running for minutes	Long reporting queries holding slots	Step 4
Many short-lived connections, random source ports	Serverless without pooling	Step 6
Read-heavy `active` queries, replicas near idle	Reads hitting the primary	Step 5

Common causes

Ordered by hit rate.

1. Each app instance opens its own large pool

10 app instances, each with pool size 20, equals 200 connections. The Postgres default max_connections is 100, so you overflow as soon as all instances ramp up.

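How to spot it: compare SELECT count(*) FROM pg_stat_activity WHERE backend_type = 'client backend' against SHOW max_connections. The filter leaves out background processes, which appear in the view but do not occupy client slots. A count near max means overflow is imminent.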

2. Connections leak from missing transaction commits

A code path opens a transaction, throws an error, and never commits or rolls back. The connection stays in idle in transaction indefinitely, blocking the pool.

How to spot it: SELECT state, count(*) FROM pg_stat_activity GROUP BY state. A high count of idle in transaction means a leak.

3. Long-running queries hold connections

A reporting query that runs for 5 minutes holds a connection the whole time. 20 such queries fill the pool.

How to spot it: SELECT pid, query_start, query FROM pg_stat_activity WHERE state = 'active' ORDER BY query_start LIMIT 10. Queries running over 30s are stragglers.

4. No pooler in front of Postgres

App pools connect directly to Postgres, so every app instance owns its own real backend connections with no multiplexing.

How to spot it: the connection origin in pg_stat_activity (client_addr) is your app servers, not a pooler host.

5. Serverless functions create new connections per invocation

Vercel / Lambda functions without a pooling-aware driver open a fresh connection per cold start. Burst traffic becomes a burst of new connections.

How to spot it: pg_stat_activity shows many short-lived connections from random source ports.

6. Read traffic hitting the primary

All read queries go to the primary when they could be served by replicas, which would free primary connections for writes.

How to spot it: read-heavy queries dominate pg_stat_activity while replica usage sits near zero.

Before you start

Confirm the exact symptom string: too many clients or remaining connection slots reserved.
Note current max_connections and total connections in use.
Identify recent app deploys that may have changed pool size or query patterns.
Check that Postgres itself is healthy: CPU, memory, replication lag.
Plan for a safe change: connection draining strategy and a brief window in case you need to terminate backends.

Information to collect

SHOW max_connections, SHOW shared_buffers, and server flavor (RDS / Aurora, Cloud SQL, Supabase, self-hosted).
SELECT state, count(*) FROM pg_stat_activity GROUP BY state.
App-side pool config: pool size, idle timeout, max lifetime.
Pooler config if present (PgBouncer, Supavisor, RDS Proxy).
Slow query log over the last hour.

Step-by-step fix

Step 1: Size the pool correctly

The constraint to satisfy: app_instances * pool_size_per_instance <= 0.7 * max_connections. Leave headroom: superuser_reserved_connections (3 by default) is carved out of max_connections, and migrations, monitoring agents, and ad hoc admin sessions need slots too.

// pg Pool config (node-postgres)
import { Pool } from 'pg';

const pool = new Pool({
  max: 10,                       // per app instance
  idleTimeoutMillis: 30000,      // drop idle clients after 30s
  connectionTimeoutMillis: 2000, // fail fast on saturation instead of hanging
});

With 10 app instances and max_connections = 100, the constraint allows a pool size of at most 7 (10 × 7 = 70), so the max: 10 shown above would need to drop to 7. Most apps need far fewer connections than they assume: a single backend can serve many short requests one after another, so raise the size only with evidence of contention, not preemptively.
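The sizing constraint can be sketched as a small helper. The function name and the headroomPercent parameter are illustrative assumptions, not part of node-postgres; the 70 percent default is the target used in this guide.

```typescript
// Largest per-instance pool size that keeps instances * pool_size within the
// chosen share of max_connections. Integer math avoids float rounding surprises.
function maxPoolSizePerInstance(
  maxConnections: number,
  appInstances: number,
  headroomPercent: number = 70, // the 70 percent target used in this guide
): number {
  if (!Number.isInteger(maxConnections) || maxConnections <= 0) {
    throw new RangeError('maxConnections must be a positive integer');
  }
  if (!Number.isInteger(appInstances) || appInstances <= 0) {
    throw new RangeError('appInstances must be a positive integer');
  }
  if (!Number.isInteger(headroomPercent) || headroomPercent <= 0 || headroomPercent > 100) {
    throw new RangeError('headroomPercent must be an integer from 1 to 100');
  }
  return Math.floor((maxConnections * headroomPercent) / (100 * appInstances));
}

// 10 instances against max_connections = 100
console.log(maxPoolSizePerInstance(100, 10)); // prints 7
```

A result of 0 means the instance count alone exceeds the budget, which is the signal that direct connections cannot work and a pooler is required.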

Step 2: Put a pooler in front of Postgres

Self-hosted PgBouncer:

# pgbouncer.ini
[databases]
mydb = host=postgres-primary port=5432 dbname=app

[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = scram-sha-256
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
server_idle_timeout = 600
max_prepared_statements = 200

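The app connects to PgBouncer on port 6432. PgBouncer holds up to 25 real Postgres connections per database/user pair (plus up to 5 reserve connections when clients have been waiting too long) and multiplexes up to 1000 app connections over them.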

transaction pool mode is the right default: the server connection is returned to the pool after every transaction, which is what gives you the multiplexing win. Avoid session mode unless you truly need session-scoped state (advisory locks, LISTEN/NOTIFY, temp tables across statements), because it pins one real connection per client and gives up most of the benefit.

Note on auth: auth_type = scram-sha-256 is the current recommended setting as of June 2026. The older md5 still works but is being phased out across Postgres deployments.

Managed equivalents (no PgBouncer to run yourself):

Supabase (Supavisor): connect to the transaction-mode pooler on port 6543; port 5432 is the session-mode/direct route. Since February 2025, port 6543 serves transaction mode only.
AWS RDS / Aurora: enable RDS Proxy in front of the instance. Watch for connection pinning — when a client changes session state the proxy can’t safely share, it pins one backend per client and multiplexing drops. On PostgreSQL, a client-side SET is enough to pin. Check the proxy’s DatabaseConnectionsCurrentlySessionPinned CloudWatch metric; if it’s high, move those SET values into the proxy’s initialization query (applied once per new backend) or onto the database side via ALTER ROLE/ALTER DATABASE, then delete the per-session SET from app code.
Cloud SQL: use the Cloud SQL Auth Proxy plus an app-side pool, or a PgBouncer sidecar.

Step 3: Kill idle-in-transaction connections and prevent new ones

-- See offenders
SELECT pid, now() - state_change AS idle_for, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < now() - interval '5 minutes'
ORDER BY state_change;

-- Terminate them
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < now() - interval '5 minutes';

Then set a server-side guard so leaks self-heal:

ALTER SYSTEM SET idle_in_transaction_session_timeout = '60s';
SELECT pg_reload_conf();

Terminating backends only buys time. Fix the leak itself: wrap every transaction in a try/finally (or your ORM’s managed-transaction helper) so the connection is always committed or rolled back, even on the error path.
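A minimal sketch of the try/finally pattern described above. TxClient and TxPool are assumed minimal shapes modeled on what node-postgres exposes (pool.connect(), client.query(), client.release()); withTransaction is a hypothetical helper name, not a library function.

```typescript
// Minimal shapes assumed for this sketch; node-postgres clients and pools
// expose methods with these names.
interface TxClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
  release(): void;
}
interface TxPool {
  connect(): Promise<TxClient>;
}

// Runs `work` inside a transaction and always ends it: COMMIT on success,
// ROLLBACK on error, and the client goes back to the pool either way.
async function withTransaction<T>(
  pool: TxPool,
  work: (client: TxClient) => Promise<T>,
): Promise<T> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    const result = await work(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    try {
      await client.query('ROLLBACK');
    } catch {
      // The connection may already be broken; keep the original error.
    }
    throw err;
  } finally {
    client.release();
  }
}
```

Routing every transactional code path through one helper like this means no call site can forget the rollback, which is what leaves connections stuck in idle in transaction.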

Step 4: Add timeouts so no query holds a slot forever

-- Cap any single statement, per database
ALTER DATABASE app SET statement_timeout = '30s';

-- Give the reporting role a longer leash
ALTER ROLE reporting SET statement_timeout = '5min';

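Per-query in app code, before a known-expensive query. Use SET LOCAL inside a transaction on a checked-out client: a bare SET sent through pool.query() lands on whichever pooled connection happens to be free, stays set on that connection for later callers, and is not guaranteed to cover your next query.

// client comes from pool.connect(); run this after BEGIN so the limit ends with the transaction
await client.query("SET LOCAL statement_timeout = '10s'");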
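When the limit is computed at runtime, one option is Postgres's set_config() function with is_local = true, which behaves like SET LOCAL but accepts a bind parameter. The sketch below assumes a client already checked out and inside a transaction; QueryRunner and setLocalStatementTimeout are hypothetical names for illustration.

```typescript
// Minimal shape assumed for a checked-out client.
interface QueryRunner {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Scopes statement_timeout to the current transaction. A value without units
// is interpreted by Postgres as milliseconds.
async function setLocalStatementTimeout(
  client: QueryRunner,
  timeoutMs: number,
): Promise<void> {
  if (!Number.isInteger(timeoutMs) || timeoutMs <= 0) {
    throw new RangeError('timeoutMs must be a positive integer');
  }
  // Third argument true = local to the transaction, like SET LOCAL.
  await client.query("SELECT set_config('statement_timeout', $1, true)", [String(timeoutMs)]);
}

// Intended call order (expensiveSql is a placeholder):
//   await client.query('BEGIN');
//   await setLocalStatementTimeout(client, 10000);
//   await client.query(expensiveSql);
//   await client.query('COMMIT');
```

Because the setting disappears at COMMIT or ROLLBACK, the connection returns to the pool with its defaults intact.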

On Postgres 17 and newer (Postgres 18 has been the current stable major since September 2025; as of June 2026 the patch line is 18.4), also consider transaction_timeout. It caps the total duration of a transaction regardless of how short the individual statements are, closing the gap that statement_timeout and idle_in_transaction_session_timeout leave open: a transaction made of many short statements with short pauses can otherwise run indefinitely. One caveat: if transaction_timeout is shorter than or equal to statement_timeout or idle_in_transaction_session_timeout, that longer timeout is ignored, so set transaction_timeout as the outer bound and keep the others tighter. This affects the reporting role above: under the database-wide 2min transaction_timeout set below, its 5min statement_timeout never takes effect, so give that role its own longer limit (for example ALTER ROLE reporting SET transaction_timeout = '10min') if its queries really need the time.

ALTER DATABASE app SET transaction_timeout = '2min';
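The interaction rule is easy to get wrong when settings are spread across database and role levels, so a small check can help. This is a sketch under the rule stated above; TimeoutSettings and findShadowedTimeouts are hypothetical names, and 0 is treated as "disabled", as in Postgres.

```typescript
// Effective settings for one role, in milliseconds. 0 means disabled.
interface TimeoutSettings {
  statementTimeoutMs: number;
  idleInTransactionTimeoutMs: number;
  transactionTimeoutMs: number;
}

// Returns the settings that transaction_timeout renders ineffective:
// any enabled timeout that is greater than or equal to it.
function findShadowedTimeouts(s: TimeoutSettings): string[] {
  const shadowed: string[] = [];
  if (s.transactionTimeoutMs <= 0) {
    return shadowed; // transaction_timeout disabled: nothing is overridden
  }
  if (s.statementTimeoutMs > 0 && s.transactionTimeoutMs <= s.statementTimeoutMs) {
    shadowed.push('statement_timeout');
  }
  if (
    s.idleInTransactionTimeoutMs > 0 &&
    s.transactionTimeoutMs <= s.idleInTransactionTimeoutMs
  ) {
    shadowed.push('idle_in_transaction_session_timeout');
  }
  return shadowed;
}

// Database defaults from this guide: 30s statement, 60s idle, 2min transaction.
console.log(JSON.stringify(findShadowedTimeouts({
  statementTimeoutMs: 30000,
  idleInTransactionTimeoutMs: 60000,
  transactionTimeoutMs: 120000,
}))); // prints []

// A role with a 5min statement_timeout under the same 2min transaction cap.
console.log(JSON.stringify(findShadowedTimeouts({
  statementTimeoutMs: 300000,
  idleInTransactionTimeoutMs: 60000,
  transactionTimeoutMs: 120000,
}))); // prints ["statement_timeout"]
```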

Step 5: Route reads to replicas

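import { Pool } from 'pg';

const primary = new Pool({ host: 'postgres-primary' });
const replica = new Pool({ host: 'postgres-replica' });

// Row-locking clauses (FOR UPDATE, FOR NO KEY UPDATE, FOR SHARE, FOR KEY SHARE)
// are rejected on a hot standby, so those SELECTs must go to the primary.
const LOCKING_CLAUSE = /\bFOR\s+(NO\s+KEY\s+UPDATE|UPDATE|KEY\s+SHARE|SHARE)\b/i;

function getPool(query: string) {
  // Only plain SELECTs are routed away. EXPLAIN stays on the primary because
  // EXPLAIN ANALYZE executes the statement it wraps, which may be a write.
  if (/^\s*SELECT\b/i.test(query) && !LOCKING_CLAUSE.test(query)) {
    return replica;
  }
  return primary;
}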

Tag queries explicitly when routing is ambiguous (for example, a SELECT that must read your own just-written row should stay on the primary to avoid replica lag). Moving reads to replicas frees the primary’s slots for writes.
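One way to tag queries explicitly is an optional override that wins over the text-based heuristic. This sketch returns a target name rather than a pool so it stays self-contained; routeQuery, RouteOptions, and the target option are hypothetical names for illustration.

```typescript
type Target = 'primary' | 'replica';

interface RouteOptions {
  // Explicit tag from the caller; overrides the heuristic when present.
  target?: Target;
}

function routeQuery(sql: string, options: RouteOptions = {}): Target {
  if (options.target !== undefined) {
    return options.target;
  }
  const isSelect = /^\s*SELECT\b/i.test(sql);
  const takesRowLock = /\bFOR\s+(NO\s+KEY\s+UPDATE|UPDATE|KEY\s+SHARE|SHARE)\b/i.test(sql);
  return isSelect && !takesRowLock ? 'replica' : 'primary';
}

// Untagged read: safe to serve from a replica.
console.log(routeQuery('SELECT * FROM orders WHERE id = 1')); // prints replica

// Read-your-own-write: the caller pins it to the primary to avoid replica lag.
console.log(routeQuery('SELECT * FROM orders WHERE id = 1', { target: 'primary' })); // prints primary
```

The heuristic errs toward the primary: anything it does not recognize as a plain SELECT stays there, so a misclassification costs a primary slot rather than a failed or stale query.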

Step 6: Use serverless-friendly drivers

For Vercel / Lambda, where each invocation may be a fresh runtime, avoid opening native Postgres connections per call:

// HTTP-based driver — no persistent TCP connection per invocation
import { neon } from '@neondatabase/serverless';
const sql = neon(process.env.DATABASE_URL);

// or Supabase's client, which talks to the pooler over the transaction port
import { createClient } from '@supabase/supabase-js';
const supabase = createClient(url, key);

If you must use a native driver in serverless, point it at a pooler endpoint (Supavisor transaction port 6543, RDS Proxy, or PgBouncer) rather than the database directly, and keep the per-function pool size at 1.
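A common way to keep one pool per function runtime is to create it lazily at module scope, so warm invocations reuse it instead of opening a new one. The helper below is a generic sketch; oncePerRuntime is a hypothetical name, and the commented usage assumes node-postgres and a POOLER_URL environment variable that points at your pooler endpoint.

```typescript
// Wraps a factory so the value is created on first use and reused afterwards
// for as long as the runtime (and therefore the module scope) stays alive.
function oncePerRuntime<P>(create: () => P): () => P {
  let instance: P | undefined;
  let created = false;
  return (): P => {
    if (!created) {
      instance = create();
      created = true;
    }
    return instance as P;
  };
}

// Hypothetical usage with node-postgres (POOLER_URL is an assumed env var):
//   import { Pool } from 'pg';
//   const getPool = oncePerRuntime(
//     () => new Pool({ connectionString: process.env.POOLER_URL, max: 1 }),
//   );
//   export async function handler() {
//     return getPool().query('SELECT 1');
//   }

// Demonstration with a stand-in object instead of a real pool.
let createdCount = 0;
const getHandle = oncePerRuntime(() => ({ id: ++createdCount }));
getHandle();
getHandle();
console.log(createdCount); // prints 1
```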

Step 7: Monitor and alert

-- Connection-state snapshot, run every 30s
SELECT
  count(*) FILTER (WHERE state = 'active') AS active,
  count(*) FILTER (WHERE state = 'idle') AS idle,
  count(*) FILTER (WHERE state = 'idle in transaction') AS idle_tx,
  count(*) FILTER (WHERE wait_event_type = 'Lock') AS waiting
FROM pg_stat_activity;

Alert when total connections reach 80 percent of max_connections, or when idle in transaction exceeds 5. Both are early-warning signs that catch the problem before requests start failing.
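The two thresholds can be sketched as a function over one snapshot row. The field names mirror the query's column aliases; evaluateSnapshot and the alert codes are hypothetical, and the total is approximated as the sum of the three states the query counts.

```typescript
// One row from the snapshot query above.
interface ConnectionSnapshot {
  active: number;
  idle: number;
  idleTx: number; // the idle_tx column
  waiting: number; // lock waiters usually overlap with active, so not added to the total
}

type ConnectionAlert = 'connections_near_max' | 'idle_in_transaction_high';

function evaluateSnapshot(
  snapshot: ConnectionSnapshot,
  maxConnections: number,
): ConnectionAlert[] {
  const alerts: ConnectionAlert[] = [];
  const total = snapshot.active + snapshot.idle + snapshot.idleTx;
  // 80 percent of max_connections, compared in integers.
  if (total * 100 >= maxConnections * 80) {
    alerts.push('connections_near_max');
  }
  if (snapshot.idleTx > 5) {
    alerts.push('idle_in_transaction_high');
  }
  return alerts;
}

console.log(JSON.stringify(
  evaluateSnapshot({ active: 50, idle: 30, idleTx: 2, waiting: 0 }, 100),
)); // prints ["connections_near_max"]
```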

How to confirm it’s fixed

Run a load test at 2x typical peak; the pg_stat_activity count should stay under 70 percent of max_connections.
Confirm a high multiplexing ratio at the pooler: many client connections served by far fewer server connections (PgBouncer: SHOW POOLS; on the admin console shows cl_active vs sv_active).
After 10 minutes of normal traffic, the idle in transaction count should be near zero.
The replica receives a meaningful share of read traffic (verify via pg_stat_statements).
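To put a number on the multiplexing check, the ratio of client to server connections can be computed from SHOW POOLS rows. This sketch assumes the rows have already been fetched and their counters parsed to numbers; PoolRow lists only the four columns used here, and multiplexingRatio is a hypothetical helper.

```typescript
// Subset of the columns PgBouncer reports in SHOW POOLS.
interface PoolRow {
  cl_active: number;
  cl_waiting: number;
  sv_active: number;
  sv_idle: number;
}

// Client connections per server connection across all pools.
// Returns null when there are no server connections to divide by.
function multiplexingRatio(rows: PoolRow[]): number | null {
  let clients = 0;
  let servers = 0;
  for (const row of rows) {
    clients += row.cl_active + row.cl_waiting;
    servers += row.sv_active + row.sv_idle;
  }
  return servers === 0 ? null : clients / servers;
}

// 400 client connections served by 25 server connections.
console.log(multiplexingRatio([
  { cl_active: 400, cl_waiting: 0, sv_active: 20, sv_idle: 5 },
])); // prints 16
```

A ratio near 1 suggests the pooler is not multiplexing, which is what session mode or heavy pinning looks like; a persistently non-zero cl_waiting suggests the server-side pool is too small.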

Long-term prevention

Standardize pool config across all app instances and document the instances * pool_size math next to it.
Make a pooler (PgBouncer, Supavisor, or RDS Proxy) the default for production, not an afterthought.
Set idle_in_transaction_session_timeout, statement_timeout, and (on Postgres 17 and newer) transaction_timeout at the database level.
Keep a read replica from day one and route reporting and analytics there.
Review pool-exhaustion alerts monthly and tighten whatever trips most.

Common pitfalls

Raising max_connections to 500 to “fix” the problem. An idle backend costs on the order of 10 MB of RAM in practical measurements (the true shared-memory-adjusted overhead is lower, but ~10 MB is a safe planning number), and an active connection running queries or temp tables grows past that, so this trades a connection ceiling for a memory cliff. Above ~200 connections you almost always want a pooler instead.
Using PgBouncer in session mode when transaction would work — it gives up most of the pooling benefit.
Using prepared statements in transaction mode while max_prepared_statements is 0, which disables prepared-statement support and triggers errors like prepared statement "..." does not exist. That was the default when the feature shipped in PgBouncer 1.21, so confirm the value your version is actually running with (SHOW CONFIG; on the admin console) rather than assuming, and set it larger than the number of distinct prepared statements your app uses.
Setting statement_timeout so aggressively that it breaks legitimate long-running migrations — scope the long leash to a migration role instead.
Terminating connections without finding the leak source. It will just recur.

FAQ

What pool size should each app instance have? Start at 5 to 10. Most apps need fewer connections than they think because a single backend can serve many short requests per second, one after another. Raise only with evidence of contention (requests queuing for a free connection).

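Does PgBouncer work with prepared statements? Yes, in transaction mode with PgBouncer 1.21 or newer, as long as max_prepared_statements is non-zero. This covers protocol-level prepared statements (what drivers issue), not SQL-level PREPARE/EXECUTE. The default was 0 when the feature shipped in 1.21, which disables the support in transaction/statement pooling, so check the value your version is running with (SHOW CONFIG; on the admin console). Set it larger than the count of distinct prepared statements your app uses for the best hit rate.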

Should I use connection pooling in serverless? Yes. Use a serverless-friendly or HTTP-based driver (Neon, Supabase client, or point a native driver at RDS Proxy / Supavisor), and keep the per-function pool size at 1. Opening native pools per invocation is what exhausts the database under burst traffic.

What’s the difference between statement_timeout and transaction_timeout? statement_timeout caps a single query; transaction_timeout (added in Postgres 17) caps the whole transaction, so a long transaction made of many short statements still gets terminated. They interact: if transaction_timeout is set shorter than or equal to statement_timeout (or idle_in_transaction_session_timeout), the longer one is ignored. Set transaction_timeout as the outer ceiling and keep the per-statement and idle limits tighter.

Will raising max_connections require a restart? Yes. max_connections is not reloadable — changing it needs a Postgres restart, which is one more reason to reach for a pooler before raising the ceiling.

External references: the PostgreSQL pg_stat_activity view docs, the client connection defaults (where statement_timeout, idle_in_transaction_session_timeout, and transaction_timeout are defined), the PgBouncer configuration reference, and the RDS Proxy pinning guide are the authoritative sources for the fields and settings above.

Tags: #Backend #Troubleshooting #postgres