ClickHouse for Backend Engineers — Real-Time Analytics Without the Data Warehouse Complexity
By Sanjeev Sharma (@webcoderspeed1)
Introduction
ClickHouse is a columnar OLAP database that handles billions of rows with sub-second query latency. Unlike Postgres, which is optimized for OLTP row access, ClickHouse is built for analytics: aggregations, time-series, and complex range queries. This guide covers production ClickHouse architectures, from MergeTree engines to dual-write patterns that keep your OLTP and analytics layers in sync.
- When ClickHouse Beats PostgreSQL for Analytics
- MergeTree Engine: The Foundation of ClickHouse Performance
- Materialized Views: Pre-Aggregation at Insert Time
- ReplacingMergeTree for Deduplication
- Partitioning Strategy: Date-Based Partitioning
- Dual-Write Pattern: Postgres + ClickHouse
- Query Optimization: PREWHERE vs WHERE
- Node.js ClickHouse Client Integration
- Checklist
- Conclusion
When ClickHouse Beats PostgreSQL for Analytics
ClickHouse can outperform Postgres on analytical workloads by 10-100x. Here is when the switch makes sense.
-- Query: Aggregations over billions of rows (ClickHouse's sweet spot)
-- Postgres: ~60 seconds for 1 billion rows
-- ClickHouse: ~100ms for 1 billion rows
-- Example: Sum events by hour (ClickHouse)
SELECT
toStartOfHour(event_time) as hour,
count() as event_count,
sum(value) as total_value,
avg(value) as avg_value,
quantile(0.95)(latency_ms) as p95_latency
FROM events
WHERE event_time >= NOW() - INTERVAL 30 DAY
GROUP BY hour
ORDER BY hour DESC;
-- Column-oriented compression: 100x better than row storage
-- Before compression (Postgres row-store): 100GB
-- After ClickHouse (column-oriented + LZ4): 1GB
-- Decision matrix:
-- Use PostgreSQL if: OLTP, complex joins, ACID transactions, <1M rows/sec insert
-- Use ClickHouse if: Analytics, time-series, immutable events, >1M rows/sec insert
-- Real-world schema sizes (1 billion event rows):
-- Postgres: 150GB uncompressed
-- ClickHouse: 10GB compressed (15x savings)
MergeTree Engine: The Foundation of ClickHouse Performance
MergeTree is ClickHouse's core table engine. All production tables use MergeTree variants.
-- Basic MergeTree: immutable data, no deletions
CREATE TABLE events (
event_id UUID,
user_id UInt64,
event_type String,
event_time DateTime,
properties JSON,
value Float32
)
ENGINE = MergeTree()
ORDER BY (event_type, user_id, event_time)
PRIMARY KEY (event_type)
PARTITION BY toYYYYMM(event_time)
SETTINGS index_granularity = 8192;
-- ORDER BY determines primary sort (used for compression and queries)
-- PRIMARY KEY is subset of ORDER BY (optional, speeds up WHERE clauses)
-- PARTITION BY splits data into separate parts (date-based is common)
-- index_granularity = rows per index mark (8192 = default, good for most cases)
-- Best practice: ORDER BY (tenant_id, timestamp)
-- Rationale: partition elimination by tenant + time-based compression
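The `index_granularity` setting is why the primary index stays tiny: ClickHouse writes one sparse index mark per granule of rows, not one entry per row. A quick back-of-the-envelope sketch of that math (illustrative only — real parts also split by size):

```typescript
// One sparse index mark is written per index_granularity rows, so even a
// billion-row part has a primary index small enough to live in memory.
function indexMarks(rowsInPart: number, granularity: number = 8192): number {
  return Math.ceil(rowsInPart / granularity);
}

// A point lookup scans whole granules, never individual rows: the worst-case
// rows read for a single-key match is one granule.
function minRowsScanned(granularity: number = 8192): number {
  return granularity;
}
```

At the default granularity, a billion-row part needs only ~122k index marks, which is why WHERE clauses on ORDER BY prefixes stay fast regardless of table size.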
-- Monthly partitions (balance between part count and query speed)
CREATE TABLE metrics (
metric_name String,
timestamp DateTime,
value Float64,
tags Map(String, String)
)
ENGINE = MergeTree()
ORDER BY (metric_name, timestamp)
PARTITION BY toYYYYMM(timestamp)
TTL timestamp + INTERVAL 1 YEAR;
-- TTL: Time-To-Live (auto-delete old data)
-- TTL timestamp + INTERVAL 90 DAY: keep 90 days of data
-- ReplacingMergeTree: last-write-wins updates via background merges (dedup is eventual; use sparingly)
CREATE TABLE user_snapshots (
user_id UInt64,
version UInt64,
snapshot_time DateTime,
profile_data String
)
ENGINE = ReplacingMergeTree(version)
ORDER BY (user_id, snapshot_time)
PARTITION BY toYYYYMM(snapshot_time);
-- SummingMergeTree: sums rows sharing the same ORDER BY key during background merges
-- (merges are eventual, so still use sum() + GROUP BY on read)
CREATE TABLE hourly_stats (
metric_name String,
hour_time DateTime,
event_count UInt64,
total_value Float64
)
ENGINE = SummingMergeTree((event_count, total_value))
ORDER BY (metric_name, hour_time);
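What SummingMergeTree does during a background merge can be sketched as a pure fold: rows sharing the ORDER BY key collapse into one row with summed measures. This is an illustration of the semantics, not ClickHouse's actual merge code:

```typescript
type StatRow = {
  metric_name: string;
  hour_time: string;
  event_count: number;
  total_value: number;
};

// Simulate a SummingMergeTree merge: rows with the same ORDER BY key
// (metric_name, hour_time) collapse into a single row with summed columns.
function mergeSummingParts(rows: StatRow[]): StatRow[] {
  const byKey = new Map<string, StatRow>();
  for (const r of rows) {
    const key = `${r.metric_name}|${r.hour_time}`;
    const acc = byKey.get(key);
    if (acc) {
      acc.event_count += r.event_count;
      acc.total_value += r.total_value;
    } else {
      byKey.set(key, { ...r });
    }
  }
  return Array.from(byKey.values());
}
```

Because this collapse happens asynchronously, a query that runs between merges can see multiple rows per key — which is why reads should always re-aggregate with `sum() ... GROUP BY`.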
Materialized Views: Pre-Aggregation at Insert Time
Materialized views automatically aggregate data as it arrives, enabling instant queries.
-- Raw events table (high volume)
CREATE TABLE events_raw (
event_id UUID,
user_id UInt64,
event_type String,
event_time DateTime,
value Float32
)
ENGINE = MergeTree()
ORDER BY (event_type, event_time)
PARTITION BY toYYYYMM(event_time);
-- Aggregated table (target for materialized view)
-- Note: avg is NOT stored -- averages don't merge correctly across parts.
-- Store count + sum and derive avg at query time (total_value / event_count).
CREATE TABLE events_hourly (
event_type String,
event_hour DateTime,
event_count UInt64,
total_value Float64
)
ENGINE = SummingMergeTree((event_count, total_value))
ORDER BY (event_type, event_hour)
PARTITION BY toYYYYMM(event_hour);
-- Materialized view: auto-aggregates on insert
CREATE MATERIALIZED VIEW events_hourly_mv
TO events_hourly
AS SELECT
event_type,
toStartOfHour(event_time) as event_hour,
count() as event_count,
sum(value) as total_value
FROM events_raw
GROUP BY event_type, event_hour;
-- Insert raw events (automatically aggregated by the view)
INSERT INTO events_raw VALUES
(generateUUIDv4(), 123, 'click', NOW(), 1.0),
(generateUUIDv4(), 124, 'view', NOW(), 2.0);
-- Query pre-aggregated data (instant)
-- Re-aggregate on read: merges are eventual, so unmerged rows may coexist
SELECT
event_type,
event_hour,
sum(event_count) as event_count,
sum(total_value) as total_value
FROM events_hourly
GROUP BY event_type, event_hour
ORDER BY event_hour DESC
LIMIT 100;
-- Multi-level aggregation (hourly → daily)
-- Requires an events_daily SummingMergeTree table with matching columns;
-- the cascade works because the first MV's inserts into events_hourly trigger this one
CREATE MATERIALIZED VIEW events_daily_mv
TO events_daily
AS SELECT
event_type,
toStartOfDay(event_hour) as event_day,
sum(event_count) as event_count,
sum(total_value) as total_value
FROM events_hourly
GROUP BY event_type, event_day;
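Why the hourly table stores count and sum instead of an average: averages don't compose. Averaging two bucket averages gives the wrong answer whenever bucket sizes differ, while merging sum + count always lets you derive the true average. A small illustration:

```typescript
type Bucket = { event_count: number; total_value: number };

// Sum + count buckets merge losslessly; stored averages would not.
function mergeBuckets(a: Bucket, b: Bucket): Bucket {
  return {
    event_count: a.event_count + b.event_count,
    total_value: a.total_value + b.total_value,
  };
}

function avgOf(b: Bucket): number {
  return b.total_value / b.event_count;
}

const a: Bucket = { event_count: 1, total_value: 100 }; // avg 100
const b: Bucket = { event_count: 9, total_value: 0 };   // avg 0
const naive = (avgOf(a) + avgOf(b)) / 2;                // 50 -- wrong
const correct = avgOf(mergeBuckets(a, b));              // 10 -- true average of all 10 events
```

This is the same reason ClickHouse offers `avgState`/`avgMerge` with AggregatingMergeTree: partial averages must carry their counts to merge correctly.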
ReplacingMergeTree for Deduplication
Real-world data often has duplicates from retries or replication. ReplacingMergeTree handles this.
-- Events with potential duplicates (idempotent key)
CREATE TABLE events_dedup (
event_id UUID,
user_id UInt64,
event_type String,
event_time DateTime,
version UInt64, -- version number for deduplication
value Float32
)
ENGINE = ReplacingMergeTree(version)
-- ORDER BY is the dedup key: rows sharing it collapse to the highest version.
-- Do NOT include version in ORDER BY, or each version becomes a distinct key
-- and nothing is ever deduplicated.
ORDER BY (event_id)
PARTITION BY toYYYYMM(event_time);
-- Insert duplicate event (different version)
INSERT INTO events_dedup VALUES
(toUUID('12345678-1234-5678-1234-567812345678'), 100, 'purchase', NOW(), 1, 99.99),
(toUUID('12345678-1234-5678-1234-567812345678'), 100, 'purchase', NOW(), 2, 99.99);
-- FINAL modifier: deduplicates on read (slower -- forces merge-on-read)
SELECT *
FROM events_dedup
FINAL -- ensures latest version returned
WHERE user_id = 100;
-- Non-FINAL query shows both versions (for debugging)
SELECT * FROM events_dedup WHERE user_id = 100;
-- Best practice: application assigns version = insert timestamp (epoch ms),
-- so the most recent write always wins after a merge
INSERT INTO events_dedup (event_id, user_id, event_type, event_time, version, value) VALUES
(toUUID('87654321-4321-8765-4321-876543218765'), 101, 'login', NOW(), 1700000000000, 0.0);
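One wrinkle with timestamp-as-version: two writes for the same key can land in the same millisecond and tie. A hypothetical application-side helper (not part of ClickHouse) that keeps versions strictly increasing:

```typescript
// Hypothetical helper: derive a monotonically increasing ReplacingMergeTree
// version from the wall clock, so the most recent write always wins even
// when two writes for the same key occur in the same millisecond.
let lastVersion = 0;

function nextVersion(nowMs: number = Date.now()): number {
  lastVersion = Math.max(lastVersion + 1, nowMs);
  return lastVersion;
}
```

The clock supplies ordering across processes; the `+1` tie-break only guards against same-process, same-millisecond collisions.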
Partitioning Strategy: Date-Based Partitioning
Partitions are key for managing retention and query performance.
-- Daily partitions (granular, good for retention policies)
CREATE TABLE logs_daily (
timestamp DateTime,
level String,
message String,
service String
)
ENGINE = MergeTree()
ORDER BY (timestamp, service)
PARTITION BY toDate(timestamp) -- new partition each day
TTL timestamp + INTERVAL 30 DAY; -- auto-delete old partitions
-- Monthly partitions (fewer partitions, less overhead)
CREATE TABLE billing_events (
billing_period String,
customer_id UInt64,
amount Float64,
timestamp DateTime
)
ENGINE = MergeTree()
ORDER BY (customer_id, timestamp)
PARTITION BY toYYYYMM(timestamp);
-- Query partition metadata (note: system.parts has no partition_key column)
SELECT
table,
partition,
name,
rows
FROM system.parts
WHERE table = 'events_raw' AND active
ORDER BY partition DESC;
-- Drop old partition (instant, no scan)
ALTER TABLE events_raw DROP PARTITION '202512';
-- Move partition to another disk
ALTER TABLE events_raw MOVE PARTITION '202601' TO DISK 'archive';
-- Part count monitoring (too many parts = slow queries)
SELECT
table,
count() as part_count,
sum(rows) as total_rows
FROM system.parts
WHERE active = 1
GROUP BY table
ORDER BY part_count DESC;
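Retention jobs that issue `DROP PARTITION` need to compute the same `toYYYYMM` value ClickHouse assigns. A minimal application-side sketch (assumes `event_time` is stored as UTC, which matches DateTime semantics without a timezone):

```typescript
// Compute the toYYYYMM(...) partition an event lands in -- useful for a
// retention job that builds ALTER TABLE ... DROP PARTITION statements.
function toYYYYMM(d: Date): string {
  const year = d.getUTCFullYear();
  const month = String(d.getUTCMonth() + 1).padStart(2, '0');
  return `${year}${month}`;
}

// e.g. build the DROP statement for everything older than `cutoff`
function dropPartitionSQL(table: string, partition: string): string {
  return `ALTER TABLE ${table} DROP PARTITION '${partition}'`;
}
```

Dropping a partition removes whole parts from disk without scanning rows, which is why partition-aligned retention is essentially free compared to DELETE.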
Dual-Write Pattern: Postgres + ClickHouse
Keep Postgres and ClickHouse in sync without a dedicated CDC tool.
// Node.js: dual-write to Postgres and ClickHouse
// (uses the official @clickhouse/client package)
import pg from 'pg';
import { createClient } from '@clickhouse/client';
const pgPool = new pg.Pool({
host: 'postgres-host',
database: 'myapp',
});
const clickhouse = createClient({
url: 'http://clickhouse-host:8123',
database: 'analytics',
username: 'default',
password: 'password',
});
interface Event {
eventId: string;
userId: number;
eventType: string;
value: number;
timestamp: Date;
}
async function logEvent(event: Event) {
try {
// Phase 1: Write to Postgres (primary)
await pgPool.query(
`INSERT INTO events (id, user_id, type, value, created_at)
VALUES ($1, $2, $3, $4, $5)`,
[event.eventId, event.userId, event.eventType, event.value, event.timestamp]
);
// Phase 2: Write to ClickHouse (async, allowed to fail)
try {
await clickhouse.insert({
table: 'events_raw',
// map camelCase fields to the ClickHouse column names
values: [{
event_id: event.eventId,
user_id: event.userId,
event_type: event.eventType,
event_time: event.timestamp,
value: event.value,
}],
format: 'JSONEachRow',
});
} catch (chErr) {
console.error('ClickHouse write failed, queued for retry:', chErr);
// Store in retry table for async processor
await pgPool.query(
`INSERT INTO clickhouse_retry_queue (event_data, retry_count)
VALUES ($1, 0)`,
[JSON.stringify(event)]
);
}
return { success: true, eventId: event.eventId };
} catch (pgErr) {
// Postgres failure is fatal
console.error('Postgres write failed:', pgErr);
throw pgErr;
}
}
// Async retry processor (background job)
async function processCHRetries() {
const retries = await pgPool.query(
`SELECT * FROM clickhouse_retry_queue
WHERE retry_count < 5 AND created_at > NOW() - INTERVAL '24 hours'
LIMIT 100`
);
for (const row of retries.rows) {
try {
const event = JSON.parse(row.event_data);
await clickhouse.insert({
table: 'events_raw',
values: [event],
format: 'JSONEachRow',
});
// Mark as complete
await pgPool.query(
`DELETE FROM clickhouse_retry_queue WHERE id = $1`,
[row.id]
);
} catch (err) {
// Increment retry count
await pgPool.query(
`UPDATE clickhouse_retry_queue SET retry_count = retry_count + 1
WHERE id = $1`,
[row.id]
);
}
}
}
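The retry processor above polls on a fixed schedule; spacing retries exponentially avoids hammering ClickHouse while it is down. A small sketch of the backoff schedule (the base and cap are illustrative choices, not values from the source):

```typescript
// Exponential backoff for the ClickHouse retry queue: the delay doubles per
// attempt and is capped so a stuck row never waits more than a minute.
function backoffMs(retryCount: number, baseMs = 1000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** retryCount);
}
```

A background job would read `retry_count` from the queue row and skip rows whose `backoffMs(retry_count)` window has not yet elapsed since the last attempt.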
Query Optimization: PREWHERE vs WHERE
ClickHouse's PREWHERE clause filters before reading other columns, saving I/O.
-- Slower: WHERE may read every selected column before filtering
SELECT *
FROM events
WHERE event_type = 'purchase'
LIMIT 100;
-- Faster: PREWHERE reads only the filter column first, then fetches the rest for matching rows
SELECT *
FROM events
PREWHERE event_type = 'purchase'
LIMIT 100;
-- ClickHouse automatically optimizes simple WHERE conditions
-- But explicit PREWHERE is useful for complex queries
-- Benchmark: event_type is in ORDER BY, so filtering is fast
SELECT
event_type,
count() as cnt,
avg(value) as avg_val
FROM events
PREWHERE event_type IN ('purchase', 'refund') -- filter early
WHERE event_time > NOW() - INTERVAL 7 DAY
GROUP BY event_type;
-- Aggregation query optimization
SELECT
user_id,
count() as purchases,
sum(value) as total_spend,
quantile(0.95)(value) as p95_order_value
FROM events
PREWHERE event_type = 'purchase' AND event_time > NOW() - INTERVAL 90 DAY
GROUP BY user_id
HAVING purchases > 10
ORDER BY total_spend DESC
LIMIT 1000;
-- Array functions (common in log analytics)
SELECT
level,
count() as log_count,
countIf(arrayExists(x -> x = 'error', tags)) as error_count
FROM logs
PREWHERE level IN ('ERROR', 'CRITICAL')
GROUP BY level;
Node.js ClickHouse Client Integration
Practical patterns for application integration.
// Official client: @clickhouse/client
import { createClient } from '@clickhouse/client';
const ch = createClient({
url: 'http://clickhouse.example.com:8123',
database: 'analytics',
username: 'user',
password: 'pass',
});
// Batch insert for throughput
async function batchInsertEvents(events: Event[]) {
if (events.length === 0) return;
await ch.insert({
table: 'events_raw',
values: events,
format: 'JSONEachRow',
});
}
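ClickHouse strongly prefers few large inserts over many small ones (each insert creates a part, and too many parts slows queries and merges). A sketch of a micro-batching buffer that feeds `batchInsertEvents`; the size threshold is an illustrative choice:

```typescript
// Micro-batching buffer: accumulate rows and flush them in one insert once
// the batch is full, so ClickHouse receives a few large inserts instead of
// many tiny ones (each insert creates a new part on disk).
class EventBuffer<T> {
  private buf: T[] = [];

  constructor(
    private flushFn: (rows: T[]) => Promise<void>, // e.g. batchInsertEvents
    private maxSize: number = 1000,
  ) {}

  async add(row: T): Promise<void> {
    this.buf.push(row);
    if (this.buf.length >= this.maxSize) await this.flush();
  }

  async flush(): Promise<void> {
    if (this.buf.length === 0) return;
    const rows = this.buf;
    this.buf = []; // swap before the await so concurrent adds aren't lost
    await this.flushFn(rows);
  }
}
```

In production you would also flush on a timer (e.g. every few seconds) so low-traffic periods don't leave rows stranded in the buffer.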
// Query with aggregation (server-side bound params -- no SQL injection)
async function getUserMetrics(userId: number, days: number = 7) {
const resultSet = await ch.query({
query: `SELECT
user_id,
count() as event_count,
sum(value) as total_value,
avg(value) as avg_value,
quantile(0.95)(value) as p95_value,
uniqExact(event_type) as event_types
FROM events
WHERE user_id = {user_id:UInt64}
AND event_time > now() - interval {days:UInt32} day
GROUP BY user_id`,
query_params: { user_id: userId, days },
format: 'JSONEachRow',
});
const rows = await resultSet.json();
return rows[0];
}
// Stream a large result set without buffering it all in memory
async function streamEventsByType(type: string) {
const resultSet = await ch.query({
query: `SELECT * FROM events WHERE event_type = {type:String}`,
query_params: { type },
format: 'JSONEachRow',
});
const stream = resultSet.stream();
return new Promise((resolve, reject) => {
stream
.on('data', (rows) => {
// Each chunk is an array of rows; decode each lazily
for (const row of rows) processEvent(row.json());
})
.on('end', () => resolve(null))
.on('error', reject);
});
}
// Parameterized query: values bound server-side via {name:Type} placeholders
async function queryWithParams(userId: number, days: number) {
const resultSet = await ch.query({
query: `
SELECT event_id, value FROM events
WHERE user_id = {user_id:UInt64}
AND event_time > now() - interval {days:UInt32} day
`,
query_params: { user_id: userId, days },
format: 'JSONEachRow',
});
return resultSet.json();
}
// Connection pooling: the official client maintains an HTTP connection pool
// internally -- tune max_open_connections instead of juggling client instances
const pooledClient = createClient({
url: 'http://clickhouse.example.com:8123',
database: 'analytics',
max_open_connections: 10,
});
Checklist
- MergeTree engine chosen and ORDER BY optimized for query patterns
- Partitioning strategy implemented (date-based partitions)
- TTL configured for automatic old data deletion
- Materialized views created for common aggregations
- PREWHERE clauses used in analytical queries
- Dual-write pattern implemented (Postgres primary, CH async)
- ClickHouse retry queue for failed writes
- Connection pooling configured for Node.js client
- Query performance validated with sample 1M+ row datasets
- Part count monitoring and merge settings configured
- Backup strategy documented (partition export)
- Replica count and replication lag monitored
Conclusion
ClickHouse transforms analytics from a separate slow system into a fast, queryable data store integrated with your primary database. Use MergeTree with proper ordering and partitioning, implement materialized views for pre-aggregation, and maintain dual-write consistency through a retry queue. With proper configuration, you'll handle billions of rows with sub-second query latency while keeping analytics in sync with your operational data.