Cold Start Latency — Why Your Serverless Function Is Slow on First Request

Introduction

You move your API to AWS Lambda. Response times are fantastic — 50ms average. But every few minutes, one request takes 4 seconds, seemingly at random. Users report occasional slow loads. Your p99 latency is awful.

This is cold start latency — and it's the most important trade-off in serverless architecture.

What is a Cold Start?

A serverless function doesn't run on a persistent server. When invoked, the cloud provider must:

  1. Provision a container — download runtime, set up environment
  2. Initialize your code — execute module-level code, import statements
  3. Run your handler — the actual function logic

Steps 1-2 only happen on a cold start (new container). Once warm, only step 3 runs.

Cold start:
  Container provision:  500ms
  Node.js runtime init:  200ms
  Your imports (AWS SDK, Prisma, etc.): 800ms
  DB connection:  300ms
  Handler code:    50ms
  ─────────────────────
  Total:          1850ms 😱

Warm start:
  Handler code:    50ms
  Total:           50ms ✅
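The split shows up directly in code: module scope runs once per container, while the handler runs on every invocation. A minimal sketch:

```typescript
// Module scope: runs ONCE per container (this is the cold start work)
let initCount = 0
initCount++  // imports, config parsing, client construction all happen here
const initializedAt = Date.now()

// Handler: runs on EVERY invocation
export const handler = async () => ({
  initCount,                                   // stays 1 for the container's lifetime
  containerAgeMs: Date.now() - initializedAt,  // grows across warm invocations
})
```

Invoke it twice in the same container and `initCount` is still 1 on the second call: the module-level work didn't run again.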

What Makes Cold Starts Slow

1. Heavy imports

// ❌ Importing everything at module level
import express from 'express'
import { PrismaClient } from '@prisma/client'
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'
import { SESClient, SendEmailCommand } from '@aws-sdk/client-ses'
import { Configuration, OpenAIApi } from 'openai'
import sharp from 'sharp'  // Native module — very slow to load!
import pdfkit from 'pdfkit'  // Another heavy module

// All of these run during cold start initialization
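To find out which imports actually hurt, you can time them one at a time. A quick sketch (the module names are just examples; run in a fresh process, since modules are cached after the first load):

```typescript
// Time how long a dynamic import takes.
async function timeImport(specifier: string): Promise<number> {
  const start = performance.now()
  await import(specifier)
  return performance.now() - start
}

// Example: time a built-in (cheap) module
timeImport('node:crypto').then(ms => console.log(`node:crypto: ${ms.toFixed(1)}ms`))
```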

2. Synchronous initialization

// ❌ Heavy work at module level
import fs from 'node:fs'

const config = JSON.parse(fs.readFileSync('/etc/config.json', 'utf8'))
const certs = loadAllSSLCerts()      // Expensive!
const schema = buildGraphQLSchema()  // Expensive!

export const handler = async (event) => {
  // By the time the handler runs, 500ms have already been spent above
}

3. Large deployment package

Every MB of code adds to cold start time. Node modules are the usual suspect.

Fix 1: Lazy Loading

Load heavy modules only when actually needed:

// ❌ Always loaded — adds to every cold start
import { S3Client } from '@aws-sdk/client-s3'
import sharp from 'sharp'

export const handler = async (event) => {
  if (event.type === 'resize-image') {
    // Uses S3 and sharp
  }
  // But most requests don't need them!
}

// ✅ Lazy load — only loaded when needed
import type { S3Client } from '@aws-sdk/client-s3'  // Type-only: erased at compile time

let s3Client: S3Client | null = null
let sharpLib: typeof import('sharp') | null = null

export const handler = async (event) => {
  if (event.type === 'resize-image') {
    // Load only on the first image request in this container
    if (!s3Client) {
      const { S3Client } = await import('@aws-sdk/client-s3')
      s3Client = new S3Client({})
    }
    if (!sharpLib) {
      sharpLib = (await import('sharp')).default
    }
    // Now use s3Client and sharpLib
  }
}

Fix 2: Reuse Connections Across Invocations

Database connections persist between warm invocations — initialize once, reuse:

// ✅ Connection initialized once per container, reused across warm invocations
import { PrismaClient } from '@prisma/client'

let db: PrismaClient | null = null

function getDB(): PrismaClient {
  if (!db) {
    db = new PrismaClient({
      datasources: { db: { url: process.env.DATABASE_URL } }
    })
  }
  return db
}

export const handler = async (event) => {
  const client = getDB()  // Returns the existing connection on warm starts
  const users = await client.user.findMany()
  return users
}

// ✅ HTTP client reused across invocations
import axios from 'axios'

let httpClient: ReturnType<typeof axios.create> | null = null

function getHttpClient() {
  if (!httpClient) {
    httpClient = axios.create({
      baseURL: 'https://api.example.com',
      timeout: 5000,
    })
  }
  return httpClient
}

Fix 3: Reduce Bundle Size

# Analyze your Lambda bundle size
npm install -g source-map-explorer
npx source-map-explorer dist/bundle.js

# Use esbuild for tree-shaking + minification.
# The AWS SDK is marked external because it's available in the Lambda runtime.
npx esbuild handler.ts \
  --bundle \
  --minify \
  --platform=node \
  --target=node20 \
  '--external:@aws-sdk/*' \
  --outfile=dist/handler.js

// ❌ Import the entire AWS SDK v2 monolith
import AWS from 'aws-sdk'  // tens of MB of code

// ✅ Import only what you need (v3 modular SDK)
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3'
// Much smaller bundle

// ✅ Mark the AWS SDK as external when targeting the Lambda runtime
// Node.js 18+ runtimes include AWS SDK v3 natively; don't bundle it!

Fix 4: Provisioned Concurrency

Keep containers pre-warmed — eliminates cold starts entirely for critical functions:

# AWS SAM
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    AutoPublishAlias: live  # Required: provisioned concurrency attaches to a version/alias
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 5  # 5 containers always warm

// CDK
const fn = new lambda.Function(this, 'MyFunction', {
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: 'handler.main',
  code: lambda.Code.fromAsset('dist'),
})

const version = fn.currentVersion
const alias = new lambda.Alias(this, 'LiveAlias', {
  aliasName: 'live',
  version,
  provisionedConcurrentExecutions: 5,
})

Cost note: Provisioned concurrency charges for idle time. Use it only for latency-sensitive functions.
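The trade-off is easy to ballpark. A rough sketch; the per-GB-second rate below is illustrative (roughly the published us-east-1 rate), so check current AWS pricing for your region:

```typescript
// Rough monthly cost of keeping N instances provisioned 24/7.
// Rate is illustrative; verify against current AWS pricing.
const RATE_PER_GB_SECOND = 0.0000041667

function provisionedConcurrencyMonthlyUSD(
  instances: number,
  memoryGB: number,
  hoursPerMonth = 730,
): number {
  const gbSeconds = instances * memoryGB * hoursPerMonth * 3600
  return gbSeconds * RATE_PER_GB_SECOND
}

// 5 warm instances at 1 GB: roughly $55/month, before invocation charges
console.log(provisionedConcurrencyMonthlyUSD(5, 1).toFixed(2))
```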

Fix 5: Scheduled Warmer (Free Alternative)

// Lambda warmer function: pings your function on a schedule
// (triggered by an EventBridge / CloudWatch Events rule, e.g. every 5 minutes)
import { LambdaClient, InvokeCommand } from '@aws-sdk/client-lambda'

export const warmer = async () => {
  const lambda = new LambdaClient({})

  await lambda.send(new InvokeCommand({
    FunctionName: 'my-api-function',
    InvocationType: 'Event',  // Async invoke
    Payload: JSON.stringify({ source: 'warmer' }),
  }))
}

// In your main handler, ignore warmer pings
export const handler = async (event) => {
  if (event.source === 'warmer') return { statusCode: 200 }

  // Normal handler logic
}
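The five-minute schedule can be attached directly in SAM. A sketch, assuming the resource and handler names are placeholders for your own:

```yaml
# SAM: invoke the warmer every 5 minutes via an EventBridge schedule
WarmerFunction:
  Type: AWS::Serverless::Function
  Properties:
    Handler: warmer.warmer
    Runtime: nodejs20.x
    Events:
      WarmSchedule:
        Type: Schedule
        Properties:
          Schedule: rate(5 minutes)
```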

Fix 6: Lambda SnapStart (Java, Python, .NET)

For Java Lambdas (and, more recently, Python and .NET), SnapStart takes a snapshot of the initialized execution environment and restores it on cold starts, cutting cold start time from around 10 seconds to under 1 second.
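Enabling SnapStart is a single property on the function. A SAM sketch for a Java function (names are placeholders):

```yaml
MyJavaFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java21
    Handler: com.example.App::handleRequest
    AutoPublishAlias: live        # SnapStart applies to published versions
    SnapStart:
      ApplyOn: PublishedVersions  # Snapshot is taken when a version is published
```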

For Node.js, optimize module initialization order:

// Keep cheap, always-needed imports at module level
import { createServer } from 'node:http'  // Built-in: fast to load

// Defer heavy modules until they're first needed
// (lazy load pattern from Fix 1)

Measure Your Cold Starts

// Log cold start vs warm start
const isFirstInvocation = (() => {
  let coldStart = true
  return () => {
    const result = coldStart
    coldStart = false
    return result
  }
})()

export const handler = async (event) => {
  const cold = isFirstInvocation()
  const start = Date.now()

  const result = await handleRequest(event)

  console.log(JSON.stringify({
    coldStart: cold,
    duration: Date.now() - start,
    memoryUsed: process.memoryUsage().heapUsed / 1024 / 1024,
  }))

  return result
}

Cold Start Optimization Checklist

  • ✅ Lazy load heavy modules (sharp, PDF libs, OpenAI SDK)
  • ✅ Reuse DB connections across warm invocations
  • ✅ Use AWS SDK v3 + mark it external in bundle
  • ✅ Tree-shake and minify with esbuild
  • ✅ Set memory to at least 1GB (more memory = more CPU = faster init)
  • ✅ Use Provisioned Concurrency for critical endpoints
  • ✅ Avoid synchronous heavy work at module level

Conclusion

Cold start latency is the fundamental trade-off of serverless. It's not a bug — it's the cost of infinite scalability and zero idle cost. The good news: with lazy loading, connection reuse, bundle optimization, and provisioned concurrency, you can reduce cold starts from 3-4 seconds to under 200ms. For truly latency-sensitive endpoints, provisioned concurrency eliminates cold starts entirely at a modest cost.