Memory Leak in Production — How to Find and Fix It

Introduction

Your Node.js service starts fine. After 6 hours, response times creep up. After 12 hours, it's OOM-killed. You restart it. Repeat. Every restart buys a few more hours.

This is a memory leak — and it will keep happening until you find the root cause.

How Memory Leaks Happen in Node.js

Node.js uses V8's garbage collector to automatically free unused memory. A memory leak occurs when objects are still referenced (intentionally or not) even though you're done with them. GC can't collect what it can't see as garbage.

Common causes:

  1. Unbounded caches — Maps/objects that grow forever without eviction
  2. Event listener leaks — on('event', handler) never removed
  3. Closures holding references — Variables captured in closures that outlive their use
  4. Global variables — Accidentally assigned without let/const
  5. Timers not cleared — setInterval that references large objects
  6. Circular references with external resources — cycles that keep native handles (sockets, C++ objects) alive
  7. Streams not consumed or closed — buffered chunks and open handles are never released

Step 1: Confirm You Have a Leak

// Monitor heap usage over time
import v8 from 'v8'
import process from 'process'

setInterval(() => {
  const heapStats = v8.getHeapStatistics()
  const used = Math.round(heapStats.used_heap_size / 1024 / 1024)
  const total = Math.round(heapStats.total_heap_size / 1024 / 1024)

  console.log(`Heap: ${used}MB / ${total}MB`)
}, 30_000)

// Expose as metrics endpoint
app.get('/metrics', (req, res) => {
  const mem = process.memoryUsage()
  res.json({
    rss: Math.round(mem.rss / 1024 / 1024) + 'MB',       // Total process memory
    heapUsed: Math.round(mem.heapUsed / 1024 / 1024) + 'MB',
    heapTotal: Math.round(mem.heapTotal / 1024 / 1024) + 'MB',
    external: Math.round(mem.external / 1024 / 1024) + 'MB',  // C++ objects
  })
})

If heapUsed trends upward without leveling off — specifically, if the low point after each GC keeps rising — you have a leak. A sawtooth that returns to the same baseline is normal GC behavior, not a leak.
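The "grows steadily" check can be automated. A minimal sketch of trend detection — the window size and the helper names are my choices, not a standard API:

```typescript
// Returns true when every sample in the trailing window is higher
// than the one before it, i.e. the heap has grown monotonically.
function isTrendingUp(samples: number[], window = 6): boolean {
  if (samples.length < window) return false
  const recent = samples.slice(-window)
  return recent.every((v, i) => i === 0 || v > recent[i - 1])
}

// Collect a sample every 30s and warn on sustained growth
const samples: number[] = []
function sampleHeap(): void {
  samples.push(process.memoryUsage().heapUsed)
  if (samples.length > 100) samples.shift()  // keep the history bounded
  if (isTrendingUp(samples)) {
    console.warn('Heap has grown for 6 consecutive samples — possible leak')
  }
}
// setInterval(sampleHeap, 30_000)
```

A window of six 30-second samples means three minutes of uninterrupted growth before the warning fires, which filters out ordinary GC sawtooth noise.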

Step 2: Heap Snapshot Analysis

import v8 from 'v8'

// Trigger a heap snapshot via API endpoint
app.get('/debug/heap-snapshot', (req, res) => {
  const filename = `heap-${Date.now()}.heapsnapshot`
  // writeHeapSnapshot is synchronous and returns the filename it wrote;
  // it blocks the event loop while the snapshot is written
  v8.writeHeapSnapshot(filename)
  res.json({ file: filename })
})

  1. Take snapshot when memory is baseline (say 200MB)
  2. Run traffic for 1 hour
  3. Take another snapshot (now 600MB)
  4. Open both in Chrome DevTools → Memory tab → Compare
  5. Objects with large "Retained Size" delta are your leak
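If adding an HTTP endpoint isn't an option, a snapshot can also be triggered from a signal handler. SIGUSR2 is an arbitrary choice here; any catchable signal works:

```typescript
import v8 from 'v8'

// Write a heap snapshot to the working directory when the process
// receives SIGUSR2:  kill -USR2 <pid>
process.on('SIGUSR2', () => {
  // With no argument, writeHeapSnapshot generates a timestamped filename
  const file = v8.writeHeapSnapshot()
  console.log(`Heap snapshot written to ${file}`)
})
```

Recent Node versions can also do this without any code via the `--heapsnapshot-signal` flag, e.g. `node --heapsnapshot-signal=SIGUSR2 app.js`.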

Step 3: Common Leaks and How to Fix Them

Leak 1: Unbounded Cache / Map

// ❌ LEAK — cache grows forever
const cache = new Map()

app.get('/user/:id', async (req, res) => {
  const { id } = req.params

  if (!cache.has(id)) {
    cache.set(id, await db.user.findById(id))
  }
  // Cache is never evicted — millions of users = GB of memory

  res.json(cache.get(id))
})

// ✅ FIX — Use LRU cache with size limit
import { LRUCache } from 'lru-cache'

const cache = new LRUCache<string, User>({
  max: 1000,         // Max 1000 entries
  ttl: 5 * 60_000,  // 5 minute TTL
})

app.get('/user/:id', async (req, res) => {
  const cached = cache.get(req.params.id)
  if (cached) return res.json(cached)

  const user = await db.user.findById(req.params.id)
  cache.set(req.params.id, user)
  res.json(user)
})
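If pulling in a dependency isn't warranted, a Map's insertion order is enough for a minimal LRU. A sketch with no TTL support — `SimpleLRU` is my name, not a library class:

```typescript
// Minimal LRU: Map iteration order is insertion order, so the first
// key is always the least recently used.
class SimpleLRU<K, V> {
  private map = new Map<K, V>()
  constructor(private max: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key)
    if (value !== undefined) {
      // Refresh recency: re-inserting moves the key to the end
      this.map.delete(key)
      this.map.set(key, value)
    }
    return value
  }

  set(key: K, value: V): void {
    this.map.delete(key)
    this.map.set(key, value)
    if (this.map.size > this.max) {
      // Evict the least recently used entry (first key in iteration order)
      this.map.delete(this.map.keys().next().value as K)
    }
  }
}
```

This caps memory the same way the library does, at the cost of TTL expiry and the size-calculation options `lru-cache` provides.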

Leak 2: Event Listener Accumulation

// ❌ LEAK — new listener added on every request, never removed
app.get('/subscribe', (req, res) => {
  emitter.on('data', (data) => {
    res.write(data)  // This handler is added each time!
  })
  // Handler is never removed when request ends
  // After 10,000 requests → 10,000 listeners on 'data'
})

// ✅ FIX — Remove listener on cleanup
app.get('/subscribe', (req, res) => {
  const handler = (data: string) => res.write(data)

  emitter.on('data', handler)

  // Remove listener when request closes
  req.on('close', () => {
    emitter.off('data', handler)
  })
})

// ✅ Check listener count
emitter.on('data', handler)
if (emitter.listenerCount('data') > 10) {
  console.warn(`High listener count: ${emitter.listenerCount('data')}`)
}
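For one-shot subscriptions there's nothing to clean up at all: `once()` from node:events awaits a single emission and detaches the listener automatically. A small sketch (`readOne` is my name):

```typescript
import { once, EventEmitter } from 'node:events'

const emitter = new EventEmitter()

async function readOne(): Promise<string> {
  // once() resolves with the event's arguments and removes the
  // listener as soon as the event fires — no manual emitter.off()
  const [data] = await once(emitter, 'data')
  return data as string
}
```

Note that `once()` also rejects the promise if the emitter fires an 'error' event first, so failures don't leave the listener dangling either.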

Leak 3: Closure Holding Large Objects

// ❌ LEAK — closure keeps large array in memory
function processReport() {
  const hugeDataset = loadMillionRecords()  // 500MB in memory

  return function getStats() {
    // This function captures hugeDataset in its closure
    // Even if we only need the count, the whole dataset is retained
    return { count: hugeDataset.length }
  }
}

const getStats = processReport()
// hugeDataset is NEVER GC'd because getStats still references it!

// ✅ FIX — Extract what you need, release the rest
function processReport() {
  const hugeDataset = loadMillionRecords()
  const count = hugeDataset.length  // Extract what we need

  return function getStats() {
    return { count }  // Only 'count' is captured, not hugeDataset
  }
  // hugeDataset goes out of scope and is GC'd
}

Leak 4: setInterval Without Cleanup

// ❌ LEAK — interval references large object forever
export class ReportService {
  private reports: Report[] = []

  startPolling() {
    setInterval(() => {
      this.reports.push(this.generateReport())
      // reports array grows indefinitely!
    }, 5000)
  }
}

// ✅ FIX — Store interval ID and clean up
export class ReportService {
  private reports: Report[] = []
  private intervalId: NodeJS.Timeout | null = null

  startPolling() {
    this.intervalId = setInterval(() => {
      const report = this.generateReport()
      this.reports.push(report)

      // Keep only last 100 reports
      if (this.reports.length > 100) {
        this.reports = this.reports.slice(-100)
      }
    }, 5000)
  }

  stopPolling() {
    if (this.intervalId) {
      clearInterval(this.intervalId)
      this.intervalId = null
    }
  }
}

Leak 5: Unhandled Stream

// ❌ LEAK — stream opened but never consumed or destroyed
const stream = fs.createReadStream('huge-file.csv')
// Nothing reads it: buffered chunks and the open file handle are never released

// ✅ FIX — Always consume or close streams
stream.on('data', chunk => processChunk(chunk))
stream.on('end', () => console.log('Done'))
stream.on('error', err => {
  console.error(err)
  stream.destroy()  // Always destroy on error
})

// Or pipe to a destination
stream.pipe(processStream).pipe(outputStream)
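One gotcha with .pipe(): it doesn't forward errors or destroy the other streams in the chain on failure. stream.pipeline (shown here in its promise form) wires up error propagation and cleanup for you:

```typescript
import { pipeline } from 'node:stream/promises'
import { Readable, Writable } from 'node:stream'

// pipeline destroys every stream in the chain on completion OR error,
// so nothing is left holding buffers or file handles
async function copyStream(source: Readable, sink: Writable): Promise<void> {
  await pipeline(source, sink)
}
```

The same call accepts any number of intermediate transform streams, e.g. `pipeline(source, gunzip, parser, sink)`, and tears all of them down if any one fails.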

Step 4: WeakMap and WeakRef for Caches

Use WeakMap for metadata caches — GC can reclaim values when keys are collected:

// ❌ Regular Map holds strong reference
const metadata = new Map<Request, RequestMeta>()
// req objects are never GC'd as long as they're in the Map

// ✅ WeakMap — values are GC'd when key is GC'd
const metadata = new WeakMap<Request, RequestMeta>()
// req is GC'd normally, metadata goes with it

// WeakRef — reference without preventing GC
const weakRef = new WeakRef(heavyObject)

// Later...
const obj = weakRef.deref()
if (obj) {
  // Object is still alive
  obj.doSomething()
} else {
  // Object was GC'd — recreate it
}

Production Memory Monitoring

// Alert before OOM, not after
const MEMORY_ALERT_MB = 512
const MEMORY_CRITICAL_MB = 768

setInterval(() => {
  const heapMB = process.memoryUsage().heapUsed / 1024 / 1024

  if (heapMB > MEMORY_CRITICAL_MB) {
    logger.alert(`CRITICAL: Heap at ${heapMB.toFixed(0)}MB — possible OOM imminent`)
    // Optional: trigger graceful shutdown via a real signal
    process.kill(process.pid, 'SIGTERM')
  } else if (heapMB > MEMORY_ALERT_MB) {
    logger.warn(`Memory warning: Heap at ${heapMB.toFixed(0)}MB`)
  }
}, 10_000)

Conclusion

Memory leaks in Node.js are almost always the same root causes: unbounded caches, unremoved event listeners, closures holding large objects, and uncleared intervals. The debugging process is: confirm the leak with heap monitoring, snapshot the heap to find growing objects, identify the retention path, and fix the root cause. Don't just add more RAM or restart on a schedule — that's treating symptoms. Find and fix the actual leak.