DB Connection Pool Exhaustion — Why Your App Hangs at Peak Load

Introduction

Everything runs fine in staging. Then production traffic picks up, and your app starts hanging. No errors in the logs — requests just stop responding and time out. Your database is alive and healthy. But your app is frozen.

The culprit? Database connection pool exhaustion.

What is a Connection Pool?

Opening a new database connection is expensive — it involves TCP handshakes, authentication, and memory allocation on both sides. Connection pools pre-open a set of connections and reuse them:

Your App
  ├── Connection Pool (max: 10 connections)
  │     ├── conn-1  [IN USE by Request A]
  │     ├── conn-2  [IN USE by Request B]
  │     ├── conn-3  [IDLE — available]
  │     └── ...
  └── PostgreSQL Server

When all connections are in use and a new request arrives, it waits in a queue. If the wait exceeds the timeout, the request fails. If the queue is unbounded, requests pile up forever.
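The acquire/wait/timeout cycle can be sketched as a tiny in-memory model. This is illustrative only (not how node-postgres actually implements its pool, and `MiniPool` is a made-up name for this sketch), but it captures the queue behavior described above:

```typescript
// Minimal pool model: `max` permits; acquirers beyond that wait in a FIFO
// queue; a waiter not served within `timeoutMs` is rejected.
class MiniPool {
  private inUse = 0
  private waiters: Array<{ serve: () => void; timer: ReturnType<typeof setTimeout> }> = []

  constructor(private max: number, private timeoutMs: number) {}

  acquire(): Promise<void> {
    if (this.inUse < this.max) {
      this.inUse++
      return Promise.resolve()        // a connection was free
    }
    // All connections busy: join the wait queue
    return new Promise((resolve, reject) => {
      const waiter = {
        serve: () => { this.inUse++; resolve() },
        timer: setTimeout(() => {
          this.waiters = this.waiters.filter(w => w !== waiter)
          reject(new Error('timeout acquiring connection'))
        }, this.timeoutMs),
      }
      this.waiters.push(waiter)
    })
  }

  release(): void {
    this.inUse--
    const next = this.waiters.shift() // hand the freed slot to the next waiter
    if (next) { clearTimeout(next.timer); next.serve() }
  }
}
```

Note that an unbounded `waiters` array is exactly the "requests pile up forever" failure mode: nothing here caps the queue, only the timeout drains it.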

How Exhaustion Happens

Max pool size: 10 connections
Traffic: 200 concurrent requests

200 requests arrive → first 10 grab connections → 190 wait
10 slow queries (3s each) → 190 requests wait 3s+
Meanwhile, 200 MORE requests arrive → 390 waiting
Pool never frees up → timeout cascade → app freeze

The tragedy: this can happen with a perfectly healthy database running fast queries.
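The back-of-envelope math for the scenario above can be written down directly. This is a simplified model (it assumes uniform query times and FIFO serving; `worstCaseWaitMs` is a made-up helper for illustration):

```typescript
// With maxConns connections and uniform queries of queryMs each, the i-th
// concurrent request (0-indexed) waits one full "batch" of queryMs for
// every group of maxConns requests ahead of it.
function worstCaseWaitMs(requestIndex: number, maxConns: number, queryMs: number): number {
  return Math.floor(requestIndex / maxConns) * queryMs
}

// 200 concurrent requests, pool of 10, 3s queries:
// requests 1-10 start immediately; request #200 waits 19 batches = 57s
console.log(worstCaseWaitMs(199, 10, 3000))  // 57000
```

With a 5-second acquisition timeout, every request past the second batch (index 20 and up in this model) times out before it ever reaches the database.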

Diagnosis

Signs of pool exhaustion:

  • Requests hang for exactly connectionTimeoutMillis then fail
  • DB CPU is low (it's not even getting the queries)
  • Logs show "timeout acquiring connection" errors
  • pg_stat_activity shows connections in idle state — held but not querying

-- Check active connections on PostgreSQL
SELECT count(*), state, wait_event_type, wait_event
FROM pg_stat_activity
WHERE datname = 'your_db'
GROUP BY state, wait_event_type, wait_event;

-- Check max connections allowed
SHOW max_connections;

-- See who's holding connections
SELECT pid, usename, application_name, state, query_start, query
FROM pg_stat_activity
WHERE datname = 'your_db'
ORDER BY query_start;

// Add pool event logging (node-postgres)
const pool = new Pool({ ... })

pool.on('connect', () => console.log('New DB connection established'))
pool.on('acquire', () => console.log('Connection acquired from pool'))
pool.on('remove', () => console.log('Connection removed from pool'))

// Monitor pool size
setInterval(() => {
  console.log({
    total: pool.totalCount,
    idle: pool.idleCount,
    waiting: pool.waitingCount,
  })
}, 5000)

Fix 1: Right-Size Your Pool

The most common mistake — setting pool size too high OR too low:

import { Pool } from 'pg'

// ❌ Common mistake — setting max too high
const pool = new Pool({ max: 100 })
// PostgreSQL's default max_connections is 100.
// 10 Node.js instances × 100 connections = 1000 connections requested
// PostgreSQL rejects connections beyond its limit → crash

// ✅ Right-sizing formula:
// pool_size = (num_cpu_cores × 2) + effective_spindle_count
// For PostgreSQL on a 4-core server: (4 × 2) + 1 ≈ 10
// If you have 5 Node.js instances: 10 / 5 = 2 per instance

const pool = new Pool({
  max: 10,                    // Per instance
  min: 2,                     // Keep 2 warm
  connectionTimeoutMillis: 5000,  // Fail fast after 5s wait
  idleTimeoutMillis: 30_000,  // Release idle connections after 30s
  allowExitOnIdle: true,
})

Key insight: More connections ≠ better performance. PostgreSQL uses OS threads per connection — too many connections means more context-switching overhead, not more throughput.

Fix 2: PgBouncer — Connection Pooling Proxy

For high-concurrency apps, put PgBouncer between your app and PostgreSQL:

App instances (each wants 50 connections)
    ↓
PgBouncer (pools to 10 real DB connections)
    ↓
PostgreSQL (max_connections: 100)

PgBouncer multiplexes hundreds of application "connections" into a small number of real DB connections:

# pgbouncer.ini
[databases]
mydb = host=localhost port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction    # Best for web apps (connection per transaction)
max_client_conn = 1000    # App can request up to 1000 connections
default_pool_size = 20    # But only 20 real DB connections
min_pool_size = 5
server_idle_timeout = 600
log_connections = 1

With transaction pool mode, a connection is only held during an actual transaction — then returned immediately. Perfect for web API patterns.
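On the application side, the only change is usually pointing the pool at PgBouncer's listen port instead of PostgreSQL's. Host and port below are illustrative; match them to your pgbouncer.ini:

```typescript
import { Pool } from 'pg'

// Connect to PgBouncer instead of PostgreSQL directly
const pool = new Pool({
  host: 'localhost',
  port: 6432,       // PgBouncer's conventional listen port, not PostgreSQL's 5432
  database: 'mydb',
  max: 50,          // cheap client connections; PgBouncer maps them onto default_pool_size real ones
})
```

One caveat with transaction pooling: session-level state (SET commands, prepared statements, advisory locks) does not carry across transactions, because each transaction may land on a different server connection.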

Fix 3: Release Connections Properly

Every client.connect() must be followed by client.release() — even on error:

// ❌ Connection leak — error path never releases
async function getUser(id: number) {
  const client = await pool.connect()
  const result = await client.query('SELECT * FROM users WHERE id = $1', [id])
  // If query throws, client is never released! Pool leaks.
  client.release()
  return result.rows[0]
}

// ✅ Always use try/finally
async function getUser(id: number) {
  const client = await pool.connect()
  try {
    const result = await client.query('SELECT * FROM users WHERE id = $1', [id])
    return result.rows[0]
  } finally {
    client.release()  // Always runs, even if query throws
  }
}

// ✅ Even better — use pool.query() for single queries (auto-releases)
async function getUser(id: number) {
  const result = await pool.query('SELECT * FROM users WHERE id = $1', [id])
  return result.rows[0]
}

Fix 4: Query Timeouts — Don't Hold Connections Forever

const pool = new Pool({
  max: 10,
  statement_timeout: 5000,     // Kill queries running longer than 5s
  query_timeout: 10_000,       // Timeout waiting for query result
  connectionTimeoutMillis: 3000, // Timeout waiting for a connection
})

// Per-query timeout (node-postgres reads query_timeout from the query config)
await pool.query({
  text: 'SELECT * FROM reports WHERE date > $1',
  values: [startDate],
  query_timeout: 2000,  // This query gets its own 2s timeout
})

// Set timeout in SQL itself: check out a dedicated client, because
// pool.query() may route each statement to a different connection
const client = await pool.connect()
try {
  await client.query('SET statement_timeout = 3000')
  await client.query('SELECT pg_sleep(10)')  // Killed after 3s
} finally {
  client.release()
}

Fix 5: Use Connection per Request Middleware

Ensure connections are always released at the end of each HTTP request:

import { Request, Response, NextFunction } from 'express'
import { PoolClient } from 'pg'

declare global {
  namespace Express {
    interface Request {
      db: PoolClient
    }
  }
}

// Middleware: acquire connection at start, release at end
export async function dbMiddleware(
  req: Request,
  res: Response,
  next: NextFunction
) {
  req.db = await pool.connect()

  // Release exactly once when the response finishes or the socket closes.
  // Both events can fire, and double-releasing throws in node-postgres.
  let released = false
  const releaseOnce = () => {
    if (!released) {
      released = true
      req.db.release()
    }
  }
  res.on('finish', releaseOnce)
  res.on('close', releaseOnce)

  next()
}

// In routes — connection always auto-released
app.get('/users/:id', dbMiddleware, async (req, res) => {
  const user = await req.db.query('SELECT * FROM users WHERE id = $1', [req.params.id])
  res.json(user.rows[0])
})

Fix 6: Async Queue for Controlled Concurrency

Limit how many DB operations run simultaneously:

import PQueue from 'p-queue'

// Max 10 concurrent DB operations — matches pool size
const dbQueue = new PQueue({ concurrency: 10 })

async function queuedQuery<T>(fn: () => Promise<T>): Promise<T> {
  // p-queue v7+ types add() as Promise<T | void>, so narrow the result
  return dbQueue.add(fn) as Promise<T>
}

// Usage — won't exceed pool size
await Promise.all(
  userIds.map(id =>
    queuedQuery(() => pool.query('SELECT * FROM users WHERE id = $1', [id]))
  )
)

Monitoring Dashboard

// Expose pool metrics as Prometheus-style endpoint
app.get('/metrics', (req, res) => {
  res.json({
    db_pool_total: pool.totalCount,
    db_pool_idle: pool.idleCount,
    db_pool_waiting: pool.waitingCount,
    db_pool_utilization: pool.totalCount
      ? (pool.totalCount - pool.idleCount) / pool.totalCount
      : 0,  // guard against division by zero before the pool warms up
  })
})

// Alert if waiting queue grows
setInterval(() => {
  if (pool.waitingCount > 5) {
    logger.warn(`DB pool pressure: ${pool.waitingCount} requests waiting`)
  }
  if (pool.waitingCount > 20) {
    logger.alert(`DB POOL EXHAUSTED: ${pool.waitingCount} requests blocked`)
  }
}, 1000)

Conclusion

Connection pool exhaustion is a silent killer — your database is fine, but your app freezes. The fixes are layered: right-size your pool, add PgBouncer for high concurrency, always release connections in finally blocks, set query timeouts, and monitor pool pressure continuously. Get these right and your app will handle 10x the traffic without breaking a sweat.