Elasticsearch in Production — Index Design, Search Relevance, and Operational Gotchas

Introduction

Elasticsearch powers search at massive scale, but production deployments require careful index design and operational discipline. Misconfigured mappings cause relevance problems, over-sharding leads to hot shards and rejections, and naive pagination breaks under load. This guide covers production patterns: designing analyzers for quality search results, using aliases for zero-downtime reindexing, and monitoring shards to prevent cascade failures.

Index Mapping Design: Keyword vs Text, Nested vs Object

Mapping decisions affect both search relevance and query performance. Choose field types carefully.

{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        },
        "analyzer": "standard"
      },
      "content": {
        "type": "text",
        "analyzer": "english"
      },
      "tags": {
        "type": "keyword"
      },
      "published_date": {
        "type": "date",
        "format": "strict_date_optional_time"
      },
      "author": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "email": {
            "type": "keyword"
          }
        }
      },
      "metadata": {
        "type": "object",
        "enabled": false
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

Field Type Guidance:

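  • text: Analyzed for full-text search; not aggregatable or sortable by default, so add a keyword sub-field when you need that
  • keyword: Not analyzed; exact match, sortable, and aggregatable (facets, filters)
  • nested: Each object in an array is indexed as its own hidden document, so a query can require several fields to match within the same object (more expensive to index and query)
  • object: Fields of all objects in an array are flattened together, so a query can match values that come from different objects (cheaper, but the per-object association is lost)
  • geo_point: Latitude/longitude pairs for geographic searches (distance filters, bounding boxes)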
// PUT /articles
// Create index with custom mapping
const mapping = {
  settings: {
    number_of_shards: 3,
    number_of_replicas: 1,
    refresh_interval: '30s',  // balance between freshness and performance
  },
  mappings: {
    properties: {
      id: { type: 'keyword' },
      title: {
        type: 'text',
        analyzer: 'standard',
        fields: {
          keyword: { type: 'keyword' },
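          // The 'autocomplete' analyzer must be defined under settings.analysis, or index creation fails
          autocomplete: { type: 'text', analyzer: 'autocomplete', search_analyzer: 'standard' }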
        }
      },
      description: { type: 'text', analyzer: 'english' },
      content: { type: 'text', analyzer: 'english' },
      category: { type: 'keyword' },
      tags: { type: 'keyword' },
      author_id: { type: 'keyword' },
      published_at: { type: 'date' },
      view_count: { type: 'long' },
      rating: { type: 'float' },
      comments: {
        type: 'nested',
        properties: {
          author: { type: 'keyword' },
          text: { type: 'text' },
          timestamp: { type: 'date' }
        }
      }
    }
  }
};
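
The difference between object and nested is easiest to see on a small example. The sketch below does not call Elasticsearch; it only simulates the two matching behaviours in plain TypeScript, and the names (ArticleComment, matchesAsObject, matchesAsNested) are hypothetical helpers for illustration.

```typescript
// Hypothetical shape mirroring the `comments` field in the mapping above.
interface ArticleComment {
  author: string;
  text: string;
}

// Case-insensitive whole-word check, standing in for analyzed text matching.
function containsWord(text: string, word: string): boolean {
  return text.toLowerCase().split(/\s+/).indexOf(word.toLowerCase()) !== -1;
}

// `object` behaviour: field values from all array elements are pooled,
// so the author and the word may come from different comments.
function matchesAsObject(comments: ArticleComment[], author: string, word: string): boolean {
  const authorFound = comments.some(c => c.author === author);
  const wordFound = comments.some(c => containsWord(c.text, word));
  return authorFound && wordFound;
}

// `nested` behaviour: both conditions must hold inside the same comment.
function matchesAsNested(comments: ArticleComment[], author: string, word: string): boolean {
  return comments.some(c => c.author === author && containsWord(c.text, word));
}

const sampleComments: ArticleComment[] = [
  { author: 'alice', text: 'Great article' },
  { author: 'bob', text: 'Terrible advice' }
];

// alice never wrote "terrible", but the flattened view cannot tell.
console.log(matchesAsObject(sampleComments, 'alice', 'terrible')); // true (false positive)
console.log(matchesAsNested(sampleComments, 'alice', 'terrible')); // false
```

If queries never need to correlate fields within one array element, the cheaper object type is usually enough.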

Analyzer Customization for Search Quality

Analyzers tokenize and normalize text. Custom analyzers improve search relevance.

{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_english": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        },
        "autocomplete": {
          "type": "custom",
          "tokenizer": "autocomplete_tokenizer",
          "filter": [
            "lowercase"
          ]
        },
        "shingle_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "shingle"
          ]
        }
      },
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      },
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "shingle": {
          "type": "shingle",
          "max_shingle_size": 2
        }
      }
    }
  }
}
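
To make the autocomplete analyzer less abstract, here is a rough simulation of what an edge_ngram tokenizer with min_gram 2, max_gram 10, and letter/digit token_chars produces, followed by lowercasing. It is a sketch, not the Lucene implementation: edgeNgramTokens is a hypothetical helper, and it only handles ASCII letters and digits.

```typescript
// Approximates the autocomplete analyzer: split on anything that is not a
// letter or digit, lowercase, then emit prefixes of each word from
// minGram up to maxGram characters.
function edgeNgramTokens(text: string, minGram: number = 2, maxGram: number = 10): string[] {
  // Simplification: ASCII only. The real tokenizer is Unicode-aware.
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  const tokens: string[] = [];
  for (const word of words) {
    const longest = Math.min(word.length, maxGram);
    // Words shorter than minGram produce no tokens at all.
    for (let len = minGram; len <= longest; len++) {
      tokens.push(word.slice(0, len));
    }
  }
  return tokens;
}

console.log(edgeNgramTokens('Quick fox'));
// [ 'qu', 'qui', 'quic', 'quick', 'fo', 'fox' ]
```

Because every prefix is indexed, a field using this analyzer is typically searched with a plain analyzer (for example by setting search_analyzer on the field) so that the query text itself is not split into prefixes.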
// Query across fields that use different analyzers, filtered to the last 30 days
const query = {
  query: {
    bool: {
      must: [
        {
          multi_match: {
            query: 'machine learning',
            fields: [
              'title^3',      // boost title 3x
              'content^2',    // boost content 2x
              'tags'
            ],
            type: 'best_fields'  // use best field score
          }
        }
      ],
      filter: [
        {
          range: {
            published_at: {
              gte: 'now-30d',
              lte: 'now'
            }
          }
        }
      ]
    }
  }
};

Multi-Match with Field Boosting

Multi-match queries search across fields with custom relevance weights.

{
  "query": {
    "multi_match": {
      "query": "backend performance",
      "fields": [
        "title^4",
        "description^3",
        "content^2",
        "tags",
        "category"
      ],
      "type": "best_fields",
      "operator": "and",
      "fuzziness": "AUTO",
      "prefix_length": 0,
      "tie_breaker": 0.3
    }
  }
}

multi_match types:

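  • best_fields: Score each field separately and use the highest-scoring one (default; good when the terms should appear together in one field)
  • most_fields: Combine scores from all matching fields (good when the same text is indexed several ways, e.g. stemmed and unstemmed sub-fields)
  • cross_fields: Treat fields with the same analyzer as one combined field (good for values split across fields, such as first and last name)
  • phrase: Run a match_phrase query on each field and use the best field's score (exact phrase matching)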
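
The effect of tie_breaker on a best_fields query can be worked through by hand: the final score is the best field's score plus tie_breaker times the sum of the other matching fields' scores. The sketch below shows only that arithmetic; bestFieldsScore is a hypothetical helper, and the per-field scores are made-up inputs rather than real BM25 output.

```typescript
// Combines per-field scores the way a best_fields multi_match does:
// highest score wins, other matching fields contribute tieBreaker * score.
function bestFieldsScore(fieldScores: number[], tieBreaker: number = 0): number {
  if (fieldScores.length === 0) return 0;
  const best = Math.max(...fieldScores);
  const total = fieldScores.reduce((sum, score) => sum + score, 0);
  return best + tieBreaker * (total - best);
}

// Made-up scores for title, description, and content after boosting.
const exampleScores = [4.0, 2.0, 1.0];

console.log(bestFieldsScore(exampleScores, 0));   // 4: only the best field counts
console.log(bestFieldsScore(exampleScores, 1));   // 7: behaves like a plain sum
console.log(bestFieldsScore(exampleScores, 0.3)); // about 4.9
```

A tie_breaker of 0 ignores every field except the best one, while 1 weights all fields equally, so values in between reward documents that match in several fields without letting weak matches dominate.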
// Real-world multi-match: e-commerce search
const productSearch = {
  query: {
    bool: {
      must: {
        multi_match: {
          query: 'waterproof hiking boots',
          fields: [
            'name^4',        // exact product name most relevant
            'brand^3',
            'category^2',
            'description',
            'tags'
          ],
          type: 'best_fields',
          operator: 'and'    // all terms must match within a single field; minimum_should_match would be redundant with 'and'
        }
      },
      filter: [
        {
          range: { price: { gte: 50, lte: 300 } }
        },
        {
          term: { in_stock: true }
        },
        {
          range: { rating: { gte: 4.0 } }
        }
      ]
    }
  },
  sort: [
    { _score: { order: 'desc' } },  // relevance score; there is no built-in 'relevance' sort field
    { rating: { order: 'desc' } },
    { price: { order: 'asc' } }
  ],
  size: 20
};

Pagination: from/size vs search_after vs scroll

Different pagination strategies have different tradeoffs.

// PAGINATION STRATEGY 1: from/size (simple, inefficient for deep pages)
async function paginateWithFromSize(query: any, page: number, pageSize: number = 20) {
  const from = (page - 1) * pageSize;

  // INEFFICIENT for deep pages: every shard collects and sorts from + size hits,
  // and the coordinating node merges them all before discarding the first `from`
  const result = await client.search({
    index: 'articles',
    body: {
      query,
      from,
      size: pageSize,
      sort: [{ published_at: { order: 'desc' } }]
    }
  });

  return {
    total: result.body.hits.total.value,
    hits: result.body.hits.hits,
    page,
    pageSize,
    hasMore: from + pageSize < result.body.hits.total.value
  };
}

// Problem: from + size above index.max_result_window (default 10,000) is rejected with an error
// Below that limit, each shard still builds a priority queue of from + size hits, so cost grows with page depth

// PAGINATION STRATEGY 2: search_after (efficient, keyset pagination)
async function paginateWithSearchAfter(
  query: any,
  after?: any[],
  pageSize: number = 20
) {
  const body: any = {
    query,
    size: pageSize,
    sort: [
      { published_at: { order: 'desc' } },
      { id: { order: 'asc' } }  // tiebreaker for consistent sorting
    ]
  };

  if (after) {
    body.search_after = after;
  }

  const result = await client.search({
    index: 'articles',
    body
  });

  const hits = result.body.hits.hits;
  const nextAfter = hits.length > 0 ? hits[hits.length - 1].sort : null;

  return {
    hits,
    nextAfter,
    hasMore: hits.length === pageSize
  };
}

// Usage (assumes `query` holds a query clause such as { match_all: {} }):
let after: any[] | undefined;
for (let i = 0; i < 5; i++) {
  const page = await paginateWithSearchAfter(query, after);
  console.log(`Page ${i + 1}:`, page.hits.length);
  after = page.nextAfter;
  if (!page.hasMore) break;
}

// PAGINATION STRATEGY 3: scroll (keeps a search context open; no longer recommended for deep pagination)
async function paginateWithScroll(query: any, pageSize: number = 20) {
  // Don't use scroll for user-facing pagination; prefer search_after (ideally with a point in time)
  // Scroll holds a snapshot of the index open on every shard, which is expensive for the cluster

  let response = await client.search({
    index: 'articles',
    body: { query, size: pageSize },
    scroll: '1m'
  });

  const allHits = [];
  while (response.body.hits.hits.length > 0) {
    allHits.push(...response.body.hits.hits);

    response = await client.scroll({
      scroll_id: response.body._scroll_id,
      scroll: '1m'
    });
  }

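  // Release the search context instead of waiting for the 1m timeout
  await client.clearScroll({ body: { scroll_id: response.body._scroll_id } });

  return allHits;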
}

// BEST PRACTICE: Use search_after for cursor-based pagination
// from/size is fine for shallow pages (the first few pages of results)
// Reserve scroll for one-off bulk exports, not user-facing pagination
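
To see why from/size degrades with depth, it helps to count how many hits the cluster has to handle for a given page. The following sketch is an upper-bound estimate under the assumption that every shard has at least from + size matching documents; fromSizeCost and its field names are hypothetical.

```typescript
// Default value of the index.max_result_window setting.
const DEFAULT_MAX_RESULT_WINDOW = 10000;

interface FromSizeCost {
  allowed: boolean;      // false when from + size exceeds the result window
  perShardHits: number;  // hits each shard must collect and sort
  mergedHits: number;    // hits the coordinating node must merge (upper bound)
}

function fromSizeCost(
  page: number,
  pageSize: number,
  shardCount: number,
  maxResultWindow: number = DEFAULT_MAX_RESULT_WINDOW
): FromSizeCost {
  const from = (page - 1) * pageSize;
  const perShardHits = from + pageSize;
  return {
    allowed: perShardHits <= maxResultWindow,
    perShardHits,
    mergedHits: perShardHits * shardCount
  };
}

console.log(fromSizeCost(1, 20, 3));
// { allowed: true, perShardHits: 20, mergedHits: 60 }
console.log(fromSizeCost(500, 20, 3));
// { allowed: true, perShardHits: 10000, mergedHits: 30000 }
console.log(fromSizeCost(501, 20, 3));
// { allowed: false, perShardHits: 10020, mergedHits: 30060 }
```

With search_after, each shard only needs to return the next pageSize hits after the cursor, so the cost stays flat regardless of how deep the client has paged.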

Index Aliases for Zero-Downtime Reindexing

Aliases allow transparent index swapping without code changes.

// REINDEX PATTERN: Blue-Green Deployment
async function reindexWithAlias() {
  // Current index: articles (aliased as articles-read)
  // New index: articles-v2 (staging)

  // Step 1: Create new index with updated mapping
  await client.indices.create({
    index: 'articles-v2',
    body: {
      settings: {
        number_of_shards: 3,
        number_of_replicas: 1
      },
      mappings: {
        // New mapping; the custom_english analyzer must also be defined under settings.analysis (omitted here)
        properties: {
          title: {
            type: 'text',
            analyzer: 'custom_english'
          }
        }
      }
    }
  });

  // Step 2: Copy data from old index to new
  const response = await client.reindex({
    body: {
      source: {
        index: 'articles'
      },
      dest: {
        index: 'articles-v2'
      },
      script: {
        source: 'ctx._source.migrated_at = params.now',
        params: { now: new Date() }
      }
    },
    wait_for_completion: false  // async reindex
  });

  const taskId = response.body.task;

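  // Step 3: Poll the task until the reindex finishes (status.created counts documents written so far)
  let taskStatus = await client.tasks.get({ task_id: taskId });
  while (!taskStatus.body.completed) {
    const { created, total } = taskStatus.body.task.status;
    console.log(`Reindex progress: ${created}/${total}`);
    await new Promise(resolve => setTimeout(resolve, 5000));
    taskStatus = await client.tasks.get({ task_id: taskId });
  }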

  // Step 4: Once complete, swap alias (atomic)
  await client.indices.updateAliases({
    body: {
      actions: [
        { remove: { index: 'articles', alias: 'articles-read' } },
        { add: { index: 'articles-v2', alias: 'articles-read' } }
      ]
    }
  });

  // Step 5: Delete old index after verification
  await client.indices.delete({ index: 'articles' });
  // An index cannot be renamed; expose articles-v2 under the old name with an alias instead
  await client.indices.updateAliases({
    body: {
      actions: [
        { add: { index: 'articles-v2', alias: 'articles' } }
      ]
    }
  });
}

// Alias-based routing (all queries use alias)
async function searchArticles(query: any) {
  return client.search({
    index: 'articles-read',  // alias, not concrete index
    body: query
  });
}
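
The alias swap itself is just a list of actions sent in one request, which is what makes it atomic. A small builder keeps the remove/add pair together and makes rollback the same call with the indices reversed. This is a sketch: buildAliasSwap and AliasAction are hypothetical names, and the result is meant to be passed as the actions array of an update-aliases request like the one above.

```typescript
type AliasAction =
  | { add: { index: string; alias: string } }
  | { remove: { index: string; alias: string } };

// Builds the action pair that moves `alias` from oldIndex to newIndex.
// Both actions go in a single request, so readers never see the alias missing.
function buildAliasSwap(alias: string, oldIndex: string, newIndex: string): AliasAction[] {
  if (oldIndex === newIndex) {
    throw new Error('oldIndex and newIndex must differ');
  }
  return [
    { remove: { index: oldIndex, alias } },
    { add: { index: newIndex, alias } }
  ];
}

// Cut over to the new index.
const cutover = buildAliasSwap('articles-read', 'articles', 'articles-v2');
// Roll back: the same swap in the other direction.
const rollback = buildAliasSwap('articles-read', 'articles-v2', 'articles');

console.log(JSON.stringify(cutover));
console.log(JSON.stringify(rollback));
```

Rollback only works while the old index still exists, which is a reason to delay deleting it until the new index has served production traffic for a while.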

Shard Sizing and Over-Sharding Consequences

Incorrect shard counts cause performance and stability issues.

// SHARD SIZING GUIDANCE
// Optimal shard size: 20-50GB per shard
// Monitor: hot shards, unbalanced shards, shard rejection

// OVER-SHARDING (too many shards):
// Problem: Each shard requires:
// - File descriptors (at roughly 100 per shard, 100K shards would need about 10M fds)
// - Memory (heap overhead per shard)
// - Network overhead (coordination)
// Results: slow queries, high GC pressure, cascading failures

// Example: 100-node cluster
// BAD: 10 shards × 100 nodes = 1000 shards (too many)
// GOOD: 1-2 shards × 100 nodes = 100-200 shards (optimal)

// Calculate optimal shards:
// Total data size = 1TB
// Optimal shard size = 30GB
// Optimal shard count = 1000GB / 30GB ≈ 33.3, rounded up to 34 primary shards
// With 1 replica: 34 primaries + 34 replica copies = 68 total shard copies

const optimalShards = (totalSizeGB: number) => Math.ceil(totalSizeGB / 30);

// Cluster allocation awareness
const clusterSettings = {
  persistent: {
    'cluster.routing.allocation.awareness.attributes': 'zone',
    'cluster.routing.allocation.awareness.force.zone.values': 'zone-a,zone-b,zone-c',
    'cluster.max_shards_per_node': 1000  // prevent over-allocation
  }
};

// Monitor shard distribution
async function monitorShards() {
  const stats = await client.indices.stats();

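  // stats.body.indices is an object keyed by index name, not an array
  const shardStats = Object.entries<any>(stats.body.indices).map(([name, index]) => ({
    name,
    docs: index.primaries.docs.count,
    primarySizeBytes: index.primaries.store.size_in_bytes,
    totalSizeBytes: index.total.store.size_in_bytes  // primaries plus replicas
  }));
  // Shard and replica counts are not part of index stats; read them from the index settings or the cat shards API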

  console.log('Index shard distribution:', shardStats);
}
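
The shard arithmetic above can be folded into one helper that also accounts for replicas and node count. This is a sketch under the same 30GB target used in the comments; planShards and its field names are hypothetical, and the per-node figure assumes shards are spread evenly.

```typescript
interface ShardPlan {
  primaries: number;     // primary shard count for the index
  totalCopies: number;   // primaries plus all replica copies
  copiesPerNode: number; // worst-case copies on one node if spread evenly
}

function planShards(
  totalSizeGB: number,
  targetShardSizeGB: number = 30,
  replicas: number = 1,
  dataNodes: number = 1
): ShardPlan {
  // Round up so no shard exceeds the target size; never go below one shard.
  const primaries = Math.max(1, Math.ceil(totalSizeGB / targetShardSizeGB));
  const totalCopies = primaries * (1 + replicas);
  return {
    primaries,
    totalCopies,
    copiesPerNode: Math.ceil(totalCopies / dataNodes)
  };
}

console.log(planShards(1000, 30, 1, 6));
// { primaries: 34, totalCopies: 68, copiesPerNode: 12 }
```

Note that number_of_shards cannot be changed on an existing index without reindexing or using the split/shrink APIs, so it is worth sizing for expected growth rather than current data.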

Monitoring: Hot Threads, Rejected Queues, Node Pressure

Production monitoring prevents cascading failures.

// Monitor node pressure: heap, GC time, and thread pool queues/rejections
// (for actual hot threads, see GET /_nodes/hot_threads)
async function checkClusterHealth() {
  const health = await client.cluster.health();
  console.log('Cluster health:', health.body);
  // Output: status (green/yellow/red), active_shards, relocating_shards, initializing_shards

  const nodes = await client.nodes.info();
  const stats = await client.nodes.stats();

  // Check each node
  for (const [nodeId, nodeStats] of Object.entries(stats.body.nodes)) {
    const node = nodes.body.nodes[nodeId];

    console.log(`Node ${node.name}:`);
    console.log(`  - Heap: ${nodeStats.jvm.mem.heap_used_percent}%`);
    console.log(`  - GC time: ${nodeStats.jvm.gc.collectors.young.collection_time_in_millis}ms`);
    console.log(`  - Search queue: ${nodeStats.thread_pool.search.queue}`);
    console.log(`  - Search rejected: ${nodeStats.thread_pool.search.rejected}`);
    console.log(`  - Index queue: ${nodeStats.thread_pool.write.queue}`);
    console.log(`  - Index rejected: ${nodeStats.thread_pool.write.rejected}`);

    // Alert on rejections; `rejected` is cumulative since node start, so in practice compare against the previous sample
    if (nodeStats.thread_pool.search.rejected > 0) {
      console.warn(`⚠️  Node ${node.name} rejecting search requests!`);
    }
  }
}

// Query monitoring (slow logs): these are index-level settings, applied with PUT /articles/_settings
const slowLogSettings = {
  'index.search.slowlog.threshold.query.warn': '10s',
  'index.search.slowlog.threshold.query.info': '5s',
  'index.search.slowlog.threshold.query.debug': '2s',
  'index.search.slowlog.threshold.query.trace': '1s'
};
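
Because thread pool rejected counts are cumulative since each node started, an alert is more useful when it fires on the increase between two samples. The sketch below assumes you have already reduced node stats to a map of node name to rejected count; newRejections and RejectedCounts are hypothetical names.

```typescript
// Node name -> cumulative rejected count from one polling round.
interface RejectedCounts {
  [nodeName: string]: number;
}

// Returns only the nodes whose rejected counter grew since the previous sample.
function newRejections(previous: RejectedCounts, current: RejectedCounts): RejectedCounts {
  const increases: RejectedCounts = {};
  for (const node of Object.keys(current)) {
    // No baseline yet for this node: record it next round instead of alerting.
    if (!(node in previous)) continue;
    const before = previous[node];
    const now = current[node];
    // A lower value means the node restarted and the counter reset to zero.
    const delta = now < before ? now : now - before;
    if (delta > 0) {
      increases[node] = delta;
    }
  }
  return increases;
}

const lastSample: RejectedCounts = { 'node-a': 5, 'node-b': 10 };
const thisSample: RejectedCounts = { 'node-a': 5, 'node-b': 12, 'node-c': 3 };

console.log(newRejections(lastSample, thisSample)); // { 'node-b': 2 }
```

A steady counter, even a large one, only says that rejections happened at some point in the past; a growing one says the node is shedding load now.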

Checklist

  • Index mapping designed with correct field types
  • Custom analyzer created for specific language/domain
  • Multi-match queries with field boosting tested
  • Pagination strategy chosen (search_after for large result sets)
  • Index aliases implemented for zero-downtime reindexing
  • Shard count calculated (20-50GB per shard optimal)
  • Cluster awareness configured (zone allocation)
  • Monitoring configured (rejected requests, hot nodes)
  • Slow query log enabled and reviewed
  • Heap size tuned (50% of available RAM, max 31GB)
  • refresh_interval optimized (balance freshness vs performance)
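  • Disk space monitoring configured (by default, no new shards are allocated to a node above 85% disk usage, and indices become read-only at the 95% flood stage)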
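
Two of the checklist items reduce to simple arithmetic that is easy to encode in a pre-deployment check. The sketch below uses the heap rule from the checklist (half of RAM, capped at 31GB) and Elasticsearch's default disk watermarks (85% low, 90% high, 95% flood stage); recommendedHeapGB and diskState are hypothetical helpers, and clusters with customized watermarks would need different thresholds.

```typescript
// Heap rule of thumb from the checklist: 50% of RAM, never more than 31GB.
function recommendedHeapGB(ramGB: number): number {
  return Math.min(Math.floor(ramGB / 2), 31);
}

type DiskState = 'ok' | 'low_watermark' | 'high_watermark' | 'flood_stage';

// Classifies disk usage against the default watermark percentages.
function diskState(usedPercent: number): DiskState {
  if (usedPercent >= 95) return 'flood_stage';    // indices on the node become read-only
  if (usedPercent >= 90) return 'high_watermark'; // shards are relocated away from the node
  if (usedPercent >= 85) return 'low_watermark';  // no new shards are allocated to the node
  return 'ok';
}

console.log(recommendedHeapGB(16));  // 8
console.log(recommendedHeapGB(128)); // 31
console.log(diskState(80));          // ok
console.log(diskState(91));          // high_watermark
```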

Conclusion

Elasticsearch delivers fast, relevant search when carefully configured. Design mappings with field types and analyzers that match your domain, use multi-match queries with thoughtful field boosting, and implement search_after for efficient pagination. Maintain cluster health through proper shard sizing, monitor for rejected requests, and use aliases for seamless reindexing. With these patterns, you'll avoid the common pitfalls that trip up production Elasticsearch clusters.