DataLoader and the N+1 Problem — Batching Database Queries in Node.js

By Sanjeev Sharma (@webcoderspeed1)
Introduction
The N+1 query problem is one of the most common performance issues in Node.js applications. When fetching a user and their posts, you might execute one query for the user, then one query per post—turning a simple operation into hundreds of queries. DataLoader solves this elegantly through intelligent batching and caching within a single request lifecycle.
- Understanding the N+1 Problem
- Per-Request DataLoader Instances
- Implementing Batch Functions for Prisma
- Cache Behavior and Deduplication
- Custom Batch Key Normalization
- Priming the Cache
- Composing DataLoaders for Nested Relationships
- Checklist
- Conclusion
Understanding the N+1 Problem
The N+1 problem manifests differently in REST and GraphQL contexts, but the root cause is identical: missing or inefficient batching.
// PROBLEMATIC: N+1 queries
async function getUserWithPosts(userId: string) {
  // Query 1: fetch the user
  const user = await db.user.findUnique({ where: { id: userId } });
  if (!user) throw new Error(`User ${userId} not found`);

  // Queries 2..N+1: fetch each post individually
  const posts = await Promise.all(
    user.postIds.map(id => db.post.findUnique({ where: { id } }))
  );
  return { ...user, posts };
}
// For 1 user with 100 posts: 101 database round-trips
With DataLoader, we batch these queries intelligently:
import DataLoader from 'dataloader';
import { db } from './db';

// The shape every batch function must satisfy: one value (or Error) per key
type BatchLoadFn<K, V> = (keys: readonly K[]) => Promise<(V | Error)[]>;

const postLoader = new DataLoader<string, Post>(
  async (postIds: readonly string[]) => {
    // Single query for ALL posts, regardless of input size
    const posts = await db.post.findMany({
      where: { id: { in: [...postIds] } },
    });
    // Return results in the same order as the input keys
    return postIds.map(
      id => posts.find(p => p.id === id) ?? new Error(`Post ${id} not found`)
    );
  },
  { cache: true } // Enable within-request deduplication
);
async function getUserWithPosts(userId: string) {
  const user = await db.user.findUnique({ where: { id: userId } });
  if (!user) throw new Error(`User ${userId} not found`);

  // Still looks like individual loads, but they batch automatically
  const posts = await Promise.all(
    user.postIds.map(id => postLoader.load(id))
  );
  return { ...user, posts };
}
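Why do `load()` calls made in the same tick end up in one batch? A minimal sketch of the mechanism (illustrative only, not the real library) collects keys in a queue and schedules a single flush on the microtask queue:

```typescript
// Illustrative sketch of DataLoader's batching mechanism (not the real
// library): load() queues the key, and the first load in a tick schedules
// one flush on the microtask queue, so all loads in that tick share a batch.
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

class MiniLoader<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise<V>(resolve => {
      this.queue.push({ key, resolve });
      // Only the first load of the tick schedules a flush
      if (this.queue.length === 1) {
        queueMicrotask(() => void this.flush());
      }
    });
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    const values = await this.batchFn(batch.map(item => item.key));
    batch.forEach((item, i) => item.resolve(values[i]));
  }
}

// Usage: two loads issued in the same tick trigger exactly one batch call
async function demo() {
  let batchCalls = 0;
  const loader = new MiniLoader<string, string>(async keys => {
    batchCalls++;
    return keys.map(k => k.toUpperCase());
  });
  const results = await Promise.all([loader.load('a'), loader.load('b')]);
  // batchCalls === 1, results === ['A', 'B']
  return { batchCalls, results };
}
demo();
```

The real library adds caching, error propagation, and scheduling options on top, but the core trick is exactly this: defer the fetch by one microtask so concurrent loads can coalesce.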
Per-Request DataLoader Instances
The critical mistake is creating DataLoader as a singleton. The cache must be scoped to a single request, otherwise one request serves another request's stale (or private) data.

// ❌ WRONG: Singleton persists its cache across requests
export const globalPostLoader = new DataLoader(/* ... */);

// ✓ CORRECT: Fresh instances per request
async function handleRequest(req: Request) {
  // Create in middleware or the request handler
  const dataloaders = {
    post: new DataLoader<string, Post | null>(async (ids: readonly string[]) => {
      const posts = await db.post.findMany({ where: { id: { in: [...ids] } } });
      // Map back to input order; findMany guarantees neither order nor count
      return ids.map(id => posts.find(p => p.id === id) ?? null);
    }),
    author: new DataLoader<string, User | null>(async (ids: readonly string[]) => {
      const users = await db.user.findMany({ where: { id: { in: [...ids] } } });
      return ids.map(id => users.find(u => u.id === id) ?? null);
    }),
  };

  // Pass to resolvers via context
  const result = await resolver(args, { dataloaders });
  return result;
}
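The staleness hazard a singleton causes can be demonstrated without DataLoader at all. In this sketch, a plain memoizing cache stands in for DataLoader's cache and an in-memory `Map` stands in for the database (both are stand-ins, not real APIs):

```typescript
// Sketch of the singleton hazard. A plain memoizing cache stands in for
// DataLoader's cache, and an in-memory Map stands in for the database.
type User = { id: string; name: string };

const fakeDb = new Map<string, User>([['u1', { id: 'u1', name: 'Ada' }]]);

function createUserCache(): (id: string) => User {
  const cache = new Map<string, User>();
  return (id: string) => {
    if (!cache.has(id)) cache.set(id, { ...fakeDb.get(id)! });
    return cache.get(id)!;
  };
}

// ❌ Singleton: one cache shared by every request
const singletonLoad = createUserCache();

// Request 1 reads the user
const before = singletonLoad('u1').name; // 'Ada'

// The row changes between requests
fakeDb.set('u1', { id: 'u1', name: 'Grace' });

// Request 2 through the singleton still sees the stale value...
const stale = singletonLoad('u1').name; // still 'Ada'

// ...while a fresh per-request cache sees the update
const fresh = createUserCache()('u1').name; // 'Grace'
```

A per-request instance makes the cache lifetime equal to the request lifetime, which is exactly the consistency window most applications want.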
Implementing Batch Functions for Prisma
Prisma and DataLoader work beautifully together. Here's a production pattern:
interface DataloaderContext {
  user: DataLoader<string, User>;
  post: DataLoader<string, Post>;
  comment: DataLoader<string, Comment>;
}

function createDataloaders(): DataloaderContext {
  return {
    user: new DataLoader(
      async (userIds: readonly string[]) => {
        const users = await db.user.findMany({
          where: { id: { in: [...userIds] } },
        });
        // Critical: maintain input order; surface missing rows as Errors
        return userIds.map(
          id => users.find(u => u.id === id) ?? new Error(`User ${id} not found`)
        );
      },
      { cache: true }
    ),
    post: new DataLoader(
      async (postIds: readonly string[]) => {
        const posts = await db.post.findMany({
          where: { id: { in: [...postIds] } },
          include: { author: true, tags: true },
        });
        return postIds.map(
          id => posts.find(p => p.id === id) ?? new Error(`Post ${id} not found`)
        );
      },
      { cache: true }
    ),
    comment: new DataLoader(
      async (commentIds: readonly string[]) => {
        const comments = await db.comment.findMany({
          where: { id: { in: [...commentIds] } },
        });
        // Same rule here: never return findMany results directly
        return commentIds.map(
          id => comments.find(c => c.id === id) ?? new Error(`Comment ${id} not found`)
        );
      },
      { cache: true }
    ),
  };
}
Cache Behavior and Deduplication
DataLoader's cache is request-scoped. Caching is on by default; passing cache: true simply makes the within-request deduplication intent explicit:
const loader = new DataLoader<string, Item | null>(
  async (ids: readonly string[]) => {
    console.log(`Batch loading: ${ids.join(', ')}`);
    const items = await db.item.findMany({ where: { id: { in: [...ids] } } });
    return ids.map(id => items.find(i => i.id === id) ?? null);
  },
  { cache: true }
);

// Loads issued in the same tick coalesce into one batch:
await Promise.all([
  loader.load('user-1'), // queued
  loader.load('user-2'), // queued into the same batch
  loader.load('user-1'), // deduplicated against the first load
]);
// Output: "Batch loading: user-1, user-2" (one batch, one query)

// A later load of an already-seen key is a cache hit: no new batch
await loader.load('user-1');

With cache: false, per-tick batching still happens, but nothing is memoized: duplicate keys are passed through to the batch function, and repeated loads hit the database again.
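The deduplication itself boils down to caching the promise rather than the resolved value, so concurrent loads for the same key share one in-flight fetch. A self-contained sketch (the fetch counter and stub fetcher are stand-ins, not part of DataLoader):

```typescript
// Dedup sketch: cache the promise by key so that concurrent loads of the
// same key share a single in-flight fetch. A counter stands in for the db.
let fetchCount = 0;
async function fetchItem(id: string): Promise<string> {
  fetchCount++;
  return `item:${id}`;
}

const inFlight = new Map<string, Promise<string>>();
function load(id: string): Promise<string> {
  let p = inFlight.get(id);
  if (!p) {
    p = fetchItem(id);
    inFlight.set(id, p); // cached before it resolves: later loads reuse it
  }
  return p;
}

async function demo() {
  const [a, b] = await Promise.all([load('x'), load('x')]);
  // Both values resolved from the single shared fetch for 'x'
  return [a, b];
}
demo();
```

Caching the promise (not the value) is what prevents the thundering-herd case where two resolvers request the same key before either fetch has finished.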
Custom Batch Key Normalization
When batch keys need transformation or normalization, normalize in two places: inside the batch function before querying, and in the cacheKeyFn option so that variants of the same key share one cache entry:

const userByEmailLoader = new DataLoader<string, User | null>(
  async (emails: readonly string[]) => {
    // The batch function receives the original keys: normalize before querying
    const normalizedEmails = emails.map(e => e.toLowerCase().trim());
    const users = await db.user.findMany({
      where: { email: { in: normalizedEmails } },
    });
    // Return in input order
    return emails.map(email => {
      const normalized = email.toLowerCase().trim();
      return users.find(u => u.email === normalized) ?? null;
    });
  },
  {
    cache: true,
    // Without this, 'User@Example.com' and 'user@example.com' would be
    // distinct cache keys and the second load below would miss the cache
    cacheKeyFn: email => email.toLowerCase().trim(),
  }
);

// Works despite case differences
await userByEmailLoader.load('User@Example.com');
await userByEmailLoader.load('user@example.com'); // Cache hit: same normalized key
Priming the Cache
Pre-populate the cache with known data to avoid redundant queries:
function createUserLoader(context: RequestContext): DataLoader<string, User> {
  const loader = new DataLoader(
    async (userIds: readonly string[]) => {
      const users = await db.user.findMany({ where: { id: { in: [...userIds] } } });
      // Map back to input order, as always
      return userIds.map(
        id => users.find(u => u.id === id) ?? new Error(`User ${id} not found`)
      );
    },
    { cache: true }
  );

  // Prime with data from a previous query (prime() never overwrites an
  // existing cache entry, so it is safe to call opportunistically)
  if (context.currentUser) {
    loader.prime(context.currentUser.id, context.currentUser);
  }

  // Prime with a preloaded batch
  context.preloadedUsers?.forEach(user => {
    loader.prime(user.id, user);
  });

  return loader;
}
Composing DataLoaders for Nested Relationships
Complex nested relationships benefit from composition:
class DataloaderFactory {
  private userLoader: DataLoader<string, User>;
  private postLoader: DataLoader<string, Post>;
  private commentsByPostLoader: DataLoader<string, Comment[]>;

  constructor() {
    this.userLoader = new DataLoader(async (ids: readonly string[]) => {
      const users = await db.user.findMany({ where: { id: { in: [...ids] } } });
      return ids.map(id => users.find(u => u.id === id) ?? new Error(`User ${id} not found`));
    }, { cache: true });

    this.postLoader = new DataLoader(async (ids: readonly string[]) => {
      const posts = await db.post.findMany({
        where: { id: { in: [...ids] } },
      });
      return ids.map(id => posts.find(p => p.id === id) ?? new Error(`Post ${id} not found`));
    }, { cache: true });

    this.commentsByPostLoader = new DataLoader(async (postIds: readonly string[]) => {
      const comments = await db.comment.findMany({
        where: { postId: { in: [...postIds] } },
      });
      // One-to-many: each key maps to a (possibly empty) array, in input order
      return postIds.map(id =>
        comments.filter(c => c.postId === id)
      );
    }, { cache: true });
  }

  async resolvePost(id: string) {
    const post = await this.postLoader.load(id);
    const author = await this.userLoader.load(post.authorId);
    const comments = await this.commentsByPostLoader.load(id);
    return { ...post, author, comments };
  }
}
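The one-to-many shape used by commentsByPostLoader (fetch once, then group rows back into input key order) can be exercised without a database; the rows below are stand-in data:

```typescript
// One-to-many batch function: one fetch for all postIds, then group the
// rows back into the same order as the input keys. Stand-in data below.
type Comment = { id: string; postId: string; body: string };

const commentRows: Comment[] = [
  { id: 'c1', postId: 'p1', body: 'nice' },
  { id: 'c2', postId: 'p1', body: 'agreed' },
  { id: 'c3', postId: 'p2', body: 'hmm' },
];

async function batchCommentsByPost(
  postIds: readonly string[]
): Promise<Comment[][]> {
  // Stand-in for: db.comment.findMany({ where: { postId: { in: postIds } } })
  const fetched = commentRows.filter(c => postIds.includes(c.postId));
  // Every input key gets a (possibly empty) array, in input order
  return postIds.map(id => fetched.filter(c => c.postId === id));
}

// A post with no comments maps to [], never to undefined:
// batchCommentsByPost(['p2', 'p3']) resolves to [[c3], []]
```

Returning an empty array for unknown keys keeps the "one result per input key" contract intact for one-to-many loaders, where a missing group is normal rather than an error.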
Checklist
- Identify N+1 query patterns in your resolver/handler code
- Create per-request DataLoader instances, not singletons
- Verify the batch function returns results in input key order
- Enable `cache: true` for request-scoped deduplication
- Use `.prime()` to avoid redundant queries when data is already known
- Monitor batch size with logging to ensure batching is working
- Consider a timeout for batch execution if your system has strict latency requirements
- Test that concurrent requests have isolated caches
- Document which loaders are available in your context object
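For the batch-size monitoring item, one approach (a hypothetical helper, not part of DataLoader) is to wrap the batch function before handing it to the loader:

```typescript
// Hypothetical logging wrapper (not part of DataLoader): it decorates any
// batch function and records how many keys each batch receives.
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

function withBatchLogging<K, V>(name: string, fn: BatchFn<K, V>): BatchFn<K, V> {
  return async keys => {
    // A batch size stuck at 1 usually means batching is not happening
    console.log(`[loader:${name}] batch size = ${keys.length}`);
    return fn(keys);
  };
}

// Usage with a stub batch function standing in for a database call
const loggedBatch = withBatchLogging<string, string>('post', async keys =>
  keys.map(k => `post:${k}`)
);
// new DataLoader(loggedBatch) would then log every batch it executes
```

In production you would likely emit a metric instead of a log line, but the wrapper shape is the same either way.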
Conclusion
DataLoader transforms the N+1 problem from a performance catastrophe into automatic optimization. By batching queries and maintaining request-scoped caches, you achieve efficient database access without complicating your resolver code. The key is understanding that DataLoader instances must be fresh per request and that proper key ordering in batch functions is non-negotiable for correctness.