Elasticsearch in Production — Index Design, Search Relevance, and Operational Gotchas
By Sanjeev Sharma (@webcoderspeed1)
Introduction
Elasticsearch powers search at massive scale, but production deployments require careful index design and operational discipline. Misconfigured mappings cause relevance problems, over-sharding leads to hot shards and rejections, and naive pagination breaks under load. This guide covers production patterns: designing analyzers for quality search results, using aliases for zero-downtime reindexing, and monitoring shards to prevent cascade failures.
- Index Mapping Design: Keyword vs Text, Nested vs Object
- Analyzer Customization for Search Quality
- Multi-Match with Field Boosting
- Pagination: from/size vs search_after vs scroll
- Index Aliases for Zero-Downtime Reindexing
- Shard Sizing and Over-Sharding Consequences
- Monitoring: Hot Threads, Rejected Queues, Node Pressure
- Checklist
- Conclusion
Index Mapping Design: Keyword vs Text, Nested vs Object
Mapping decisions affect both search relevance and query performance. Choose field types carefully.
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "keyword": { "type": "keyword" }
        },
        "analyzer": "standard"
      },
      "content": {
        "type": "text",
        "analyzer": "english"
      },
      "tags": { "type": "keyword" },
      "published_date": {
        "type": "date",
        "format": "strict_date_optional_time"
      },
      "author": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "text",
            "fields": {
              "keyword": { "type": "keyword" }
            }
          },
          "email": { "type": "keyword" }
        }
      },
      "metadata": {
        "type": "object",
        "enabled": false
      },
      "location": { "type": "geo_point" }
    }
  }
}
Field Type Guidance:
- text: Analyzed for full-text search (searchable, but not aggregatable without fielddata)
- keyword: Not analyzed; exact match, aggregatable (facets, filters, sorting)
- nested: Array of objects indexed as hidden sub-documents so each object can be queried independently (more expensive)
- object: Objects flattened into the parent document; array items lose field correlation (cheaper than nested)
- geo_point: Geographic searches (distance filters, geo aggregations)
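The nested-vs-object tradeoff is easiest to see with a concrete failure case. The sketch below (a hypothetical helper, not an Elasticsearch API) simulates how the `object` type flattens an array of authors into parallel value lists, producing a false-positive match that `nested` fields avoid:

```typescript
// Simulate how the `object` type stores an array of objects: each field
// becomes an independent list of values, losing per-object correlation.
type Author = { name: string; email: string };

function flattenObjectArray(authors: Author[]): { name: string[]; email: string[] } {
  return {
    name: authors.map(a => a.name),
    email: authors.map(a => a.email),
  };
}

const flat = flattenObjectArray([
  { name: 'Alice', email: 'alice@example.com' },
  { name: 'Bob', email: 'bob@example.com' },
]);

// A bool query for name=Alice AND email=bob@example.com matches this
// flattened document, even though no single author has that combination:
const falsePositive =
  flat.name.includes('Alice') && flat.email.includes('bob@example.com');
```

With `nested`, each author is indexed as its own hidden document, so the same query correctly finds no match; that isolation is what you pay the extra cost for.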
// PUT /articles
// Create index with custom mapping
const mapping = {
  settings: {
    number_of_shards: 3,
    number_of_replicas: 1,
    refresh_interval: '30s', // balance between freshness and indexing performance
  },
  mappings: {
    properties: {
      id: { type: 'keyword' },
      title: {
        type: 'text',
        analyzer: 'standard',
        fields: {
          keyword: { type: 'keyword' },
          // 'autocomplete' is a custom analyzer; it must be defined under
          // settings.analysis (see the next section) before this mapping is applied
          autocomplete: { type: 'text', analyzer: 'autocomplete' }
        }
      },
      description: { type: 'text', analyzer: 'english' },
      content: { type: 'text', analyzer: 'english' },
      category: { type: 'keyword' },
      tags: { type: 'keyword' },
      author_id: { type: 'keyword' },
      published_at: { type: 'date' },
      view_count: { type: 'long' },
      rating: { type: 'float' },
      comments: {
        type: 'nested',
        properties: {
          author: { type: 'keyword' },
          text: { type: 'text' },
          timestamp: { type: 'date' }
        }
      }
    }
  }
};
Analyzer Customization for Search Quality
Analyzers tokenize and normalize text. Custom analyzers improve search relevance.
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_english": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stop", "english_stemmer"]
        },
        "autocomplete": {
          "type": "custom",
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        },
        "shingle_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "shingle"]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "shingle": {
          "type": "shingle",
          "max_shingle_size": 2
        }
      }
    }
  }
}
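To build intuition for what the edge_ngram tokenizer above emits (with `min_gram: 2`, `max_gram: 10`), here is a small simulation. It is an illustration of the prefix-token idea, not the actual Lucene tokenizer:

```typescript
// Simulate edge_ngram output for a single term: prefixes anchored at the
// start of the word, from minGram up to maxGram characters long.
function edgeNgrams(term: string, minGram: number, maxGram: number): string[] {
  const grams: string[] = [];
  const max = Math.min(maxGram, term.length);
  for (let len = minGram; len <= max; len++) {
    grams.push(term.slice(0, len));
  }
  return grams;
}

// "boots" → ["bo", "boo", "boot", "boots"]
const tokens = edgeNgrams('boots', 2, 10);
```

Because every prefix of length ≥ 2 is indexed, a user typing "bo" already matches the autocomplete field; the real tokenizer can be inspected with the `_analyze` API.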
// Query with multiple analyzers
const query = {
  query: {
    bool: {
      must: [
        {
          multi_match: {
            query: 'machine learning',
            fields: [
              'title^3',   // boost title 3x
              'content^2', // boost content 2x
              'tags'
            ],
            type: 'best_fields' // score with the single best-matching field
          }
        }
      ],
      filter: [
        {
          range: {
            published_at: { gte: 'now-30d', lte: 'now' }
          }
        }
      ]
    }
  }
};
Multi-Match with Field Boosting
Multi-match queries search across fields with custom relevance weights.
{
  "query": {
    "multi_match": {
      "query": "backend performance",
      "fields": ["title^4", "description^3", "content^2", "tags", "category"],
      "type": "best_fields",
      "operator": "and",
      "fuzziness": "AUTO",
      "prefix_length": 0,
      "tie_breaker": 0.3
    }
  }
}
multi_match types:
- best_fields: Use highest-scoring field (default, good for exact matches)
- most_fields: Combine scores from multiple fields (good for synonyms)
- cross_fields: Treat multiple fields as one big field (good for terms split across fields, e.g. first_name + last_name)
- phrase: Phrase query across fields (exact phrase matching)
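The `tie_breaker` parameter controls how best_fields blends the non-winning fields into the final score: the best field's score, plus tie_breaker times each other field's score. A quick illustrative calculation (not the client API, just the documented dis_max combination rule):

```typescript
// best_fields with tie_breaker: take the highest per-field score and add
// tie_breaker * (sum of the remaining field scores).
function bestFieldsScore(fieldScores: number[], tieBreaker: number): number {
  const best = Math.max(...fieldScores);
  const others = fieldScores.reduce((sum, s) => sum + s, 0) - best;
  return best + tieBreaker * others;
}

// title scores 2.0, content 1.0, tags 0.5 with tie_breaker 0.3:
// 2.0 + 0.3 * (1.0 + 0.5) = 2.45
const score = bestFieldsScore([2.0, 1.0, 0.5], 0.3);
```

At `tie_breaker: 0` this is pure best_fields (winner takes all); at `1.0` it behaves like summing all fields, similar in spirit to most_fields.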
// Real-world multi-match: e-commerce search
const productSearch = {
  query: {
    bool: {
      must: {
        multi_match: {
          query: 'waterproof hiking boots',
          fields: [
            'name^4', // exact product name most relevant
            'brand^3',
            'category^2',
            'description',
            'tags'
          ],
          type: 'best_fields',
          // operator 'and' would override minimum_should_match,
          // so use 'or' and require at least 2 of the 3 terms
          operator: 'or',
          minimum_should_match: 2
        }
      },
      filter: [
        { range: { price: { gte: 50, lte: 300 } } },
        { term: { in_stock: true } },
        { range: { rating: { gte: 4.0 } } }
      ]
    }
  },
  sort: [
    { _score: { order: 'desc' } }, // relevance first ('relevance' is not a field)
    { rating: { order: 'desc' } },
    { price: { order: 'asc' } }
  ],
  size: 20
};
Pagination: from/size vs search_after vs scroll
Different pagination strategies have different tradeoffs.
// PAGINATION STRATEGY 1: from/size (simple, inefficient for deep pages)
async function paginateWithFromSize(query: any, page: number, pageSize: number = 20) {
  const from = (page - 1) * pageSize;
  // INEFFICIENT for deep pages: every shard must still collect its
  // top (from + size) hits before the coordinating node trims them
  const result = await client.search({
    index: 'articles',
    body: {
      query,
      from,
      size: pageSize,
      sort: [{ published_at: { order: 'desc' } }]
    }
  });
  return {
    total: result.body.hits.total.value,
    hits: result.body.hits.hits,
    page,
    pageSize,
    hasMore: from + pageSize < result.body.hits.total.value
  };
}
// Problem: from + size beyond 10,000 is rejected by default (index.max_result_window)
// Each shard buffers its top (from + size) hits in memory, so cost grows with page depth
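A back-of-envelope helper (not an Elasticsearch API) makes the deep-paging cost concrete: each shard returns its top `from + size` hits, and the coordinating node merges all of them before discarding everything but one page.

```typescript
// Hit entries collected cluster-wide for a from/size request:
// every shard contributes its top (from + size) candidates.
function fromSizeCost(shards: number, from: number, size: number): number {
  return shards * (from + size);
}

// Page 1 vs page 500 on a 5-shard index, 20 hits per page:
const page1 = fromSizeCost(5, 0, 20);      // 100 entries
const page500 = fromSizeCost(5, 9980, 20); // 50,000 entries
```

The work grows linearly with page depth even though the user only ever sees 20 documents, which is exactly the cost search_after avoids.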
// PAGINATION STRATEGY 2: search_after (efficient, keyset pagination)
async function paginateWithSearchAfter(
  query: any,
  after?: any[],
  pageSize: number = 20
) {
  const body: any = {
    query,
    size: pageSize,
    sort: [
      { published_at: { order: 'desc' } },
      { id: { order: 'asc' } } // tiebreaker for a stable sort order
    ]
  };
  if (after) {
    body.search_after = after;
  }
  const result = await client.search({
    index: 'articles',
    body
  });
  const hits = result.body.hits.hits;
  const nextAfter = hits.length > 0 ? hits[hits.length - 1].sort : null;
  return {
    hits,
    nextAfter,
    hasMore: hits.length === pageSize
  };
}
// Usage:
let after: any[] | null = null;
for (let i = 0; i < 5; i++) {
  const page = await paginateWithSearchAfter(query, after ?? undefined);
  console.log(`Page ${i + 1}:`, page.hits.length);
  after = page.nextAfter;
  if (!page.hasMore) break;
}
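In a real API you usually don't hand raw sort values to clients; you wrap them in an opaque cursor. A minimal sketch using Node's `Buffer` (hypothetical helper names):

```typescript
// Encode the search_after sort values as an opaque base64url cursor
// so API clients can pass it back without caring about its contents.
function encodeCursor(sortValues: unknown[]): string {
  return Buffer.from(JSON.stringify(sortValues)).toString('base64url');
}

function decodeCursor(cursor: string): unknown[] {
  return JSON.parse(Buffer.from(cursor, 'base64url').toString('utf8'));
}

// The `sort` array of the last hit round-trips through the cursor:
const cursor = encodeCursor([1718000000000, 'article-42']);
const restored = decodeCursor(cursor); // → [1718000000000, 'article-42']
```

Opaque cursors also let you change the underlying sort fields later without breaking API consumers.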
// PAGINATION STRATEGY 3: scroll (holds a search context; legacy for this use case)
async function paginateWithScroll(query: any, pageSize: number = 20) {
  // Not recommended for user-facing pagination - use search_after instead.
  // Scroll keeps a point-in-time snapshot open, which is expensive on the
  // cluster; it is intended for bulk exports of an entire result set.
  let response = await client.search({
    index: 'articles',
    body: { query, size: pageSize },
    scroll: '1m'
  });
  const allHits = [];
  while (response.body.hits.hits.length > 0) {
    allHits.push(...response.body.hits.hits);
    response = await client.scroll({
      scroll_id: response.body._scroll_id,
      scroll: '1m'
    });
  }
  // Release the scroll context explicitly instead of waiting for the timeout
  await client.clearScroll({ scroll_id: response.body._scroll_id });
  return allHits;
}
// BEST PRACTICE: use search_after for cursor-based pagination,
// from/size only for shallow pages (< 100),
// and scroll only for bulk exports
Index Aliases for Zero-Downtime Reindexing
Aliases allow transparent index swapping without code changes.
// REINDEX PATTERN: Blue-Green Deployment
async function reindexWithAlias() {
  // Current index: articles (aliased as articles-read)
  // New index: articles-v2 (staging)

  // Step 1: Create new index with updated mapping
  await client.indices.create({
    index: 'articles-v2',
    body: {
      settings: {
        number_of_shards: 3,
        number_of_replicas: 1
      },
      mappings: {
        // New mapping with custom analyzer
        properties: {
          title: {
            type: 'text',
            analyzer: 'custom_english'
          }
        }
      }
    }
  });

  // Step 2: Copy data from old index to new
  const response = await client.reindex({
    body: {
      source: { index: 'articles' },
      dest: { index: 'articles-v2' },
      script: {
        source: 'ctx._source.migrated_at = params.now',
        params: { now: new Date().toISOString() } // pass a serializable value, not a Date object
      }
    },
    wait_for_completion: false // async reindex
  });
  const taskId = response.body.task;

  // Step 3: Poll reindex progress (repeat until created + updated reaches total)
  const taskStatus = await client.tasks.get({ task_id: taskId });
  const { created, updated, total } = taskStatus.body.task.status;
  console.log(`Reindex progress: ${created + updated}/${total}`);

  // Step 4: Once complete, swap alias (atomic)
  await client.indices.updateAliases({
    body: {
      actions: [
        { remove: { index: 'articles', alias: 'articles-read' } },
        { add: { index: 'articles-v2', alias: 'articles-read' } }
      ]
    }
  });

  // Step 5: Delete old index after verification
  await client.indices.delete({ index: 'articles' });
  // Indices cannot be renamed, so point the old name at the new index via an alias
  await client.indices.updateAliases({
    body: {
      actions: [
        { add: { index: 'articles-v2', alias: 'articles' } }
      ]
    }
  });
}
// Alias-based routing (all queries use the alias)
async function searchArticles(query: any) {
  return client.search({
    index: 'articles-read', // alias, not a concrete index
    body: query
  });
}
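Because the alias swap must be atomic, it helps to build both actions in one payload. A small pure helper (hypothetical name) that produces the `actions` array the `_aliases` API expects:

```typescript
// Build an atomic remove-then-add actions payload for a blue-green swap.
// Applied in a single updateAliases call, readers never see a missing alias.
function aliasSwapActions(oldIndex: string, newIndex: string, alias: string) {
  return [
    { remove: { index: oldIndex, alias } },
    { add: { index: newIndex, alias } },
  ];
}

const actions = aliasSwapActions('articles', 'articles-v2', 'articles-read');
// client.indices.updateAliases({ body: { actions } })
```

Keeping the swap in one call is the whole point: issuing the remove and add as two separate requests opens a window where the alias resolves to nothing.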
Shard Sizing and Over-Sharding Consequences
Incorrect shard counts cause performance and stability issues.
// SHARD SIZING GUIDANCE
// Optimal shard size: roughly 20-50GB per shard
// Monitor: hot shards, unbalanced shards, shard rejections
// OVER-SHARDING (too many shards):
// Each shard costs:
// - File descriptors (at ~100 fds per shard, 100K shards ≈ 10M fds)
// - Heap memory (per-shard overhead for segment metadata)
// - Network overhead (every query fans out to every shard)
// Results: slow queries, high GC pressure, cascading failures
// Example: 100-node cluster
// BAD: 10 shards per node × 100 nodes = 1,000 shards for one index (too many)
// GOOD: 1-2 shards per node × 100 nodes = 100-200 shards (reasonable)
// Calculate optimal shards:
// Total data size = 1TB
// Target shard size = 30GB
// Primary shard count = ceil(1000GB / 30GB) = 34 shards
// With 1 replica: 34 primaries + 34 replicas = 68 total shard copies
const optimalShards = (totalSizeGB: number) => Math.ceil(totalSizeGB / 30);
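Extending the one-liner above, total cluster footprint is primaries times (1 + replica count). A quick worked sketch:

```typescript
// Total shard copies the cluster must host: primaries scaled by replicas.
function totalShardCopies(totalSizeGB: number, replicas: number): number {
  const primaries = Math.ceil(totalSizeGB / 30); // target ~30GB per primary
  return primaries * (1 + replicas);
}

// 1TB of data with 1 replica: 34 primaries → 68 shard copies
const copies = totalShardCopies(1000, 1);
```

This total is what `cluster.max_shards_per_node` guards against, so it is worth computing before choosing `number_of_shards` on a new index.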
// Cluster allocation awareness
const clusterSettings = {
  persistent: {
    'cluster.routing.allocation.awareness.attributes': 'zone',
    'cluster.routing.allocation.awareness.force.zone.values': 'zone-a,zone-b,zone-c',
    'cluster.max_shards_per_node': 1000 // prevent over-allocation
  }
};
// Monitor shard distribution
async function monitorShards() {
  // stats.body.indices is an object keyed by index name, not an array
  const stats = await client.indices.stats({ level: 'shards' });
  const shardStats = Object.entries<any>(stats.body.indices).map(([name, index]) => ({
    name,
    docs: index.primaries.docs.count,
    sizeBytes: index.primaries.store.size_in_bytes,
    shardCopies: Object.values<any>(index.shards).flat().length
  }));
  console.log('Index shard distribution:', shardStats);
}
Monitoring: Hot Threads, Rejected Queues, Node Pressure
Production monitoring prevents cascading failures.
// Monitor hot threads (GC, CPU pressure)
async function checkClusterHealth() {
  const health = await client.cluster.health();
  console.log('Cluster health:', health.body);
  // Output includes: status (green/yellow/red), active_shards,
  // relocating_shards, initializing_shards

  const nodes = await client.nodes.info();
  const stats = await client.nodes.stats();

  // Check each node (stats.body.nodes is keyed by node id)
  for (const [nodeId, nodeStats] of Object.entries<any>(stats.body.nodes)) {
    const node = nodes.body.nodes[nodeId];
    console.log(`Node ${node.name}:`);
    console.log(`  - Heap: ${nodeStats.jvm.mem.heap_used_percent}%`);
    console.log(`  - Young GC time: ${nodeStats.jvm.gc.collectors.young.collection_time_in_millis}ms`);
    console.log(`  - Search queue: ${nodeStats.thread_pool.search.queue}`);
    console.log(`  - Search rejected: ${nodeStats.thread_pool.search.rejected}`);
    console.log(`  - Write queue: ${nodeStats.thread_pool.write.queue}`);
    console.log(`  - Write rejected: ${nodeStats.thread_pool.write.rejected}`);
    // Alert if rejections are occurring
    if (nodeStats.thread_pool.search.rejected > 0) {
      console.warn(`⚠️ Node ${node.name} is rejecting search requests!`);
    }
  }
}
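The per-node checks above can be factored into a pure evaluation step so the thresholds are testable without a live cluster. The thresholds here (85% heap, queue depth 500) are illustrative assumptions to tune, not Elasticsearch defaults:

```typescript
// Turn a node's stats into a list of warnings. Shapes mirror the fields
// read from nodes.stats; thresholds are assumed values, tune per cluster.
interface PoolStats { queue: number; rejected: number }
interface NodeSignals { heapUsedPercent: number; search: PoolStats; write: PoolStats }

function nodeWarnings(name: string, s: NodeSignals): string[] {
  const warnings: string[] = [];
  if (s.heapUsedPercent > 85) warnings.push(`${name}: heap at ${s.heapUsedPercent}% (> 85%)`);
  if (s.search.rejected > 0) warnings.push(`${name}: ${s.search.rejected} search rejections`);
  if (s.write.rejected > 0) warnings.push(`${name}: ${s.write.rejected} write rejections`);
  if (s.search.queue > 500) warnings.push(`${name}: search queue at ${s.search.queue}`);
  return warnings;
}

const w = nodeWarnings('node-1', {
  heapUsedPercent: 91,
  search: { queue: 12, rejected: 3 },
  write: { queue: 0, rejected: 0 },
});
// → two warnings: high heap and search rejections
```

Separating detection from data fetching also makes it easy to feed the same rules from a metrics pipeline instead of polling the stats API directly.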
// Query monitoring (search slow log - these are per-index settings,
// applied with PUT /<index>/_settings, not cluster-level persistent settings)
const slowLogSettings = {
  'index.search.slowlog.threshold.query.warn': '10s',
  'index.search.slowlog.threshold.query.info': '5s',
  'index.search.slowlog.threshold.query.debug': '2s',
  'index.search.slowlog.threshold.query.trace': '1s'
};
// await client.indices.putSettings({ index: 'articles', body: slowLogSettings });
Checklist
- Index mapping designed with correct field types
- Custom analyzer created for specific language/domain
- Multi-match queries with field boosting tested
- Pagination strategy chosen (search_after for large result sets)
- Index aliases implemented for zero-downtime reindexing
- Shard count calculated (20-50GB per shard optimal)
- Cluster awareness configured (zone allocation)
- Monitoring configured (rejected requests, hot nodes)
- Slow query log enabled and reviewed
- Heap size tuned (50% of available RAM, max 31GB)
- refresh_interval optimized (balance freshness vs performance)
- Disk watermarks monitored (shard allocation pauses at the 85% low watermark; writes are blocked at the 95% flood stage)
Conclusion
Elasticsearch delivers fast, relevant search when carefully configured. Design mappings with field types and analyzers that match your domain, use multi-match queries with thoughtful field boosting, and implement search_after for efficient pagination. Maintain cluster health through proper shard sizing, monitor for rejected requests, and use aliases for seamless reindexing. With these patterns, you'll avoid the common pitfalls that trip up production Elasticsearch clusters and deliver fast, relevant search experiences at scale.