WebSockets at Scale in 2026 — Beyond Socket.io to Production-Grade Real-Time
- Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
Socket.io got you this far, but your server is melting. Memory bloats, sticky sessions break load balancers, and reconnections cause cache stampedes. At scale, you need lower-level control. Raw WebSockets with ws, Redis pub/sub for broadcasts, and stateful platforms like Cloudflare Durable Objects give you the performance and flexibility production demands.
This post reveals what Socket.io hides: the plumbing underneath. Understand it, and you control everything.
- Socket.io Limitations at Scale
- Raw WebSocket with ws Library
- Horizontal Scaling with Redis Pub/Sub
- Cloudflare Durable Objects for Stateful WebSocket
- Ably, Pusher, and Soketi as Managed Alternatives
- WebSocket Auth with JWT on Connection
- Room and Channel Management Pattern
- Reconnection with Exponential Backoff and Message Replay
- WebSocket Health Checks
- Rate Limiting WebSocket Messages
- Checklist
- Conclusion
Socket.io Limitations at Scale
Socket.io provides convenience: automatic reconnection, fallbacks to HTTP long-polling, built-in rooms, and namespaces. But convenience costs performance.
Memory overhead: Each connected client holds state in memory. With 10,000 concurrent clients, Socket.io instances bloat. You're storing user data, room membership, and event queues in RAM.
Sticky sessions: Load balancers must route a client to the same server on reconnect. If that server dies, clients lose all state. You need session replication (expensive) or a shared data layer.
Broadcast inefficiency: When one server broadcasts to 100,000 clients, it iterates over all of its in-memory connection state in a single process. Horizontal scaling alone doesn't fix this: without a shared coordination layer, each instance still fans out from its own memory.
Socket.io shines for small deployments (<1000 concurrent). For millions, its abstraction becomes a burden.
Raw WebSocket with ws Library
The ws npm package is the thin wrapper between you and the WebSocket protocol. It's fast, lightweight, and puts you in control.
```ts
import { WebSocketServer } from 'ws';
import http from 'http';

const server = http.createServer();
const wss = new WebSocketServer({ server });

wss.on('connection', (ws) => {
  console.log('Client connected');

  ws.on('message', (data) => {
    console.log('Received:', data.toString());
    ws.send(JSON.stringify({ echo: data.toString() }));
  });

  ws.on('error', (error) => {
    console.error('WebSocket error:', error);
  });

  ws.on('close', () => {
    console.log('Client disconnected');
  });
});

server.listen(3000);
```
That's it. No fancy features, no memory leaks, no sticky session complexity. Just events flowing.
For production, add graceful shutdown, error handling, and validation:
```ts
ws.on('message', (data) => {
  try {
    const message = JSON.parse(data.toString());
    if (!message.type) {
      ws.send(JSON.stringify({ error: 'Missing type' }));
      return;
    }
    handleMessage(ws, message);
  } catch (e) {
    ws.send(JSON.stringify({ error: 'Invalid JSON' }));
  }
});
```
Horizontal Scaling with Redis Pub/Sub
A single Node.js server comfortably handles on the order of 10,000 concurrent WebSocket connections (more with tuning). Beyond that, split clients across servers. But then a message arriving on one server can't reach a client connected to another.
Redis pub/sub solves this. When a client on Server A sends a message intended for a client on Server B, Server A publishes to a Redis channel. Server B subscribes, receives the message, and broadcasts to its client.
```ts
import { createClient } from 'redis';
import { WebSocketServer, WebSocket } from 'ws';
import { randomUUID } from 'node:crypto';

const redisPub = createClient();
const redisSub = createClient();
await redisPub.connect();
await redisSub.connect();

const wss = new WebSocketServer({ port: 3000 });
const clientMap = new Map<string, WebSocket>();

// Every server subscribes; each one delivers to its own local clients
await redisSub.subscribe('broadcast', (message) => {
  for (const [, ws] of clientMap) {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(message);
    }
  }
});

wss.on('connection', (ws) => {
  const clientId = randomUUID();
  clientMap.set(clientId, ws);

  ws.on('message', async (data) => {
    // Publish so all servers (including this one) deliver it
    await redisPub.publish('broadcast', data.toString());
  });

  ws.on('close', () => {
    clientMap.delete(clientId);
  });
});
```
Now 10 servers handle 100,000 concurrent clients. Each server maintains its own connections; Redis coordinates broadcasts. No sticky sessions needed.
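The snippet above broadcasts everything to everyone. For directed messages (the Server A to Server B case described earlier), each server can simply check whether the target client is local when a pub/sub message arrives. A sketch of that local-dispatch step, with a minimal `SocketLike` interface standing in for ws and a `{to, payload}` envelope shape that is our assumption, not a standard:

```ts
interface SocketLike {
  readyState: number;
  send(data: string): void;
}
const OPEN = 1; // value of WebSocket.OPEN

// Called from the Redis subscription handler on every server.
// Only the server that actually holds the target connection delivers.
function dispatchLocal(
  clientMap: Map<string, SocketLike>,
  envelope: { to: string; payload: unknown }
): boolean {
  const ws = clientMap.get(envelope.to);
  if (ws && ws.readyState === OPEN) {
    ws.send(JSON.stringify(envelope.payload));
    return true; // delivered on this server
  }
  return false; // target lives on another server (or is offline)
}
```

Every server runs the same check; exactly the one holding the connection sends, and the rest ignore the message.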
Cloudflare Durable Objects for Stateful WebSocket
Cloudflare Durable Objects are single-threaded, persistent objects that live at the edge. They're perfect for WebSocket coordination when you need strong consistency or state that survives restarts.
Use them as coordinators: when a client connects, a Durable Object routes messages and manages room state.
```ts
export class RoomCoordinator implements DurableObject {
  private clients = new Map<string, WebSocket>();
  private messages: any[] = [];

  async fetch(request: Request): Promise<Response> {
    if (request.headers.get('Upgrade') === 'websocket') {
      const pair = new WebSocketPair();
      const [client, server] = Object.values(pair);
      const clientId = crypto.randomUUID();

      server.accept(); // take ownership of the server side of the pair
      this.clients.set(clientId, server);

      server.addEventListener('message', (event) => {
        this.messages.push(JSON.parse(event.data as string));
        // Relay the raw frame to everyone else in the room
        for (const [id, ws] of this.clients) {
          if (id !== clientId) {
            ws.send(event.data as string);
          }
        }
      });

      server.addEventListener('close', () => {
        this.clients.delete(clientId);
      });

      return new Response(null, { status: 101, webSocket: client });
    }
    return new Response('Expected a WebSocket upgrade', { status: 426 });
  }
}
```
Durable Objects scale differently: you pay per-object, not per-connection. They're ideal for chat rooms, collaborative editing, or gaming lobbies where strong consistency matters.
Ably, Pusher, and Soketi as Managed Alternatives
Self-hosting WebSocket infrastructure costs engineering time. Managed platforms handle scaling, persistence, and reliability:
Pusher: Fully managed, global edge network. Pay per message. Best for developers who want it to just work.
Ably: Similar to Pusher with stronger guarantees (delivery guarantees, ordering, message history). Higher cost but better SLAs.
Soketi: Open-source, Pusher-protocol-compatible server (a drop-in target for Pusher SDKs). Self-host or use a managed offering. Cheaper than Pusher, less feature-rich.
Pick managed services when your time is more valuable than the cost. Pick self-hosted when you need deep customization or cost optimization.
WebSocket Auth with JWT on Connection
When a client connects, validate a JWT token before accepting the connection.
```ts
import jwt from 'jsonwebtoken';

wss.on('connection', (ws, request) => {
  const url = new URL(request.url || '', 'http://localhost');
  const token = url.searchParams.get('token');

  if (!token) {
    ws.close(1008, 'Missing auth token'); // 1008 = policy violation
    return;
  }

  let decoded;
  try {
    decoded = jwt.verify(token, process.env.JWT_SECRET!);
  } catch (e) {
    ws.close(1008, 'Invalid token');
    return;
  }

  const userId = (decoded as any).sub;
  (ws as any).userId = userId;
  console.log(`User ${userId} connected`);
});
```
The client passes the token in the query string:
```ts
const token = await getAuthToken();
const ws = new WebSocket(`ws://localhost:3000?token=${token}`);
```
Query strings can end up in proxy and server logs, so keep these tokens short-lived. Tokens expire; refresh before expiry by fetching a new token, closing the connection, and reconnecting with it.
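One way to schedule that refresh is to read the token's `exp` claim client-side. A sketch, with caveats: `decodeExp` and `msUntilRefresh` are our own hypothetical helpers, the 30-second margin is arbitrary, the decode is deliberately unverified (the server still verifies the signature), and `Buffer` assumes Node — a browser client would use `atob` instead:

```ts
// Unverified decode of a JWT's exp claim (seconds since epoch).
// Safe here because the client only uses it for scheduling.
function decodeExp(token: string): number {
  const payload = token.split('.')[1];
  const json = Buffer.from(payload, 'base64url').toString('utf8');
  return JSON.parse(json).exp;
}

// Milliseconds until we should reconnect with a fresh token.
function msUntilRefresh(token: string, marginMs = 30_000, now = Date.now()): number {
  return Math.max(0, decodeExp(token) * 1000 - now - marginMs);
}

// On the client:
// setTimeout(reconnectWithFreshToken, msUntilRefresh(token));
```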
Room and Channel Management Pattern
Not all messages go to all clients. Implement rooms (groups of clients) and channels (topics).
```ts
type Room = Map<string, WebSocket>;
const rooms = new Map<string, Room>();

function joinRoom(roomId: string, clientId: string, ws: WebSocket) {
  if (!rooms.has(roomId)) {
    rooms.set(roomId, new Map());
  }
  rooms.get(roomId)!.set(clientId, ws);
}

function leaveRoom(roomId: string, clientId: string) {
  rooms.get(roomId)?.delete(clientId);
  if (rooms.get(roomId)?.size === 0) {
    rooms.delete(roomId); // drop empty rooms so the map stays bounded
  }
}

function broadcastToRoom(roomId: string, message: any) {
  const room = rooms.get(roomId);
  if (!room) return;
  for (const ws of room.values()) {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(JSON.stringify(message));
    }
  }
}

// Inside the connection handler, where clientId and ws are in scope:
ws.on('message', (data) => {
  const message = JSON.parse(data.toString());
  if (message.action === 'join') {
    joinRoom(message.roomId, clientId, ws);
    broadcastToRoom(message.roomId, {
      type: 'user-joined',
      userId: (ws as any).userId
    });
  }
});
```
Reconnection with Exponential Backoff and Message Replay
WebSocket connections drop. Clients must reconnect without losing messages.
On the client:
```ts
class ReconnectingWebSocket {
  private ws: WebSocket | null = null;
  private backoff = 1000;
  private maxBackoff = 30000;
  private queue: any[] = [];

  connect(url: string) {
    this.ws = new WebSocket(url);

    this.ws.onopen = () => {
      console.log('Connected');
      this.backoff = 1000; // reset after a successful connect
      this.drainQueue();
    };

    this.ws.onclose = () => {
      // Jitter spreads reconnects out so thousands of clients
      // don't stampede the server in lockstep
      const delay = this.backoff + Math.random() * 1000;
      console.log('Disconnected, reconnecting in', delay);
      setTimeout(() => this.connect(url), delay);
      this.backoff = Math.min(this.backoff * 2, this.maxBackoff);
    };

    this.ws.onmessage = (event) => {
      this.onMessage?.(JSON.parse(event.data));
    };
  }

  send(message: any) {
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(message));
    } else {
      this.queue.push(message); // buffer while offline
    }
  }

  private drainQueue() {
    while (this.queue.length > 0 && this.ws?.readyState === WebSocket.OPEN) {
      const message = this.queue.shift();
      this.ws.send(JSON.stringify(message));
    }
  }

  onMessage?: (message: any) => void;
}
```
On the server, track client sequence numbers. When a client reconnects, replay messages they missed.
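A sketch of that server-side replay buffer. The `ReplayBuffer` class, its capacity, and the resume protocol are our assumptions, not a standard:

```ts
// Per-room buffer of recent messages, each tagged with a sequence number.
// On reconnect, a client sends the last seq it saw and gets everything newer.
class ReplayBuffer {
  private seq = 0;
  private buffer: { seq: number; payload: unknown }[] = [];

  constructor(private capacity = 1000) {}

  append(payload: unknown): number {
    const entry = { seq: ++this.seq, payload };
    this.buffer.push(entry);
    if (this.buffer.length > this.capacity) {
      this.buffer.shift(); // drop the oldest once over capacity
    }
    return entry.seq;
  }

  // Messages the client missed; empty if it's fully caught up.
  since(lastSeen: number): { seq: number; payload: unknown }[] {
    return this.buffer.filter((m) => m.seq > lastSeen);
  }
}
```

On reconnect the client might send `{ action: 'resume', lastSeen }`; the server replays `buffer.since(lastSeen)` before resuming live traffic. Clients that fall further behind than the buffer's capacity need a full state resync instead.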
WebSocket Health Checks
Proxies and load balancers silently drop idle connections, and a dead peer never sends a close frame. Send periodic pings, and terminate any connection that fails to pong back:

```ts
wss.on('connection', (ws) => {
  (ws as any).isAlive = true;
  ws.on('pong', () => { (ws as any).isAlive = true; }); // peer answered
});

const pingInterval = setInterval(() => {
  wss.clients.forEach((ws) => {
    if (!(ws as any).isAlive) return ws.terminate(); // no pong since last sweep
    (ws as any).isAlive = false;
    ws.ping();
  });
}, 30000);

wss.on('close', () => clearInterval(pingInterval));
```

Browsers and the ws client answer pings with pongs automatically, so a healthy connection resets its flag every sweep; a dead one gets terminated and cleaned up.
Rate Limiting WebSocket Messages
Clients can spam messages. Rate-limit per user:
```ts
const rateLimits = new Map<string, { count: number; reset: number }>();

function isRateLimited(userId: string): boolean {
  const limit = rateLimits.get(userId);
  const now = Date.now();
  if (!limit || now > limit.reset) {
    rateLimits.set(userId, { count: 1, reset: now + 1000 }); // new 1s window
    return false;
  }
  if (limit.count >= 100) {
    return true; // over 100 messages in the current window
  }
  limit.count++;
  return false;
}

ws.on('message', (data) => {
  if (isRateLimited((ws as any).userId)) {
    ws.send(JSON.stringify({ error: 'Rate limited' }));
    return;
  }
  // Process message
});
```
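One wrinkle: a fixed-window map like this grows with every user who ever sends a message. A periodic sweep keeps it bounded. A sketch, with the `sweepExpired` helper and the sweep interval being our own choices:

```ts
type Limit = { count: number; reset: number };

// Remove entries whose rate-limit window has already expired.
function sweepExpired(limits: Map<string, Limit>, now = Date.now()): number {
  let removed = 0;
  for (const [userId, limit] of limits) {
    if (now > limit.reset) {
      limits.delete(userId);
      removed++;
    }
  }
  return removed;
}

// In production: setInterval(() => sweepExpired(rateLimits), 60_000);
```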
Checklist
- Benchmark Socket.io vs raw ws at your scale
- Design your message schema and validation
- Implement JWT auth on WebSocket upgrade
- Set up Redis pub/sub for horizontal scaling
- Build room/channel management
- Add exponential backoff reconnection on client
- Implement ping/pong health checks
- Add rate limiting per user
- Monitor connection churn and latency
- Load test at your target scale
Conclusion
WebSockets at scale demand low-level control. Raw ws gives it to you. Redis pub/sub distributes traffic. Cloudflare Durable Objects provide stateful coordination. Managed platforms like Ably handle the hard parts if cost justifies it.
For most teams, ws plus Redis pub/sub is the right starting point. Evaluate Durable Objects when you need strong consistency. Only adopt managed platforms when you've outgrown what you can reasonably maintain. The WebSocket landscape in 2026 offers options; choose based on your scale, budget, and engineering capacity.