PostgreSQL JSONB Patterns — Flexible Schema Without Sacrificing Performance

By Sanjeev Sharma (@webcoderspeed1)
Introduction
JSONB enables flexible schema in PostgreSQL—store and query nested data without migrations—while maintaining relational structure and ACID guarantees. Unlike document databases, JSONB supports complex queries, full-text search integration, and transactions. This guide covers production JSONB patterns: choosing index types, writing efficient queries, validating schema with constraints, and avoiding common performance pitfalls.
- JSONB vs JSON: Storage and Query Differences
- GIN Index Types: jsonb_ops vs jsonb_path_ops
- Common Query Patterns
- Partial Indexes on JSONB Fields
- Schema Validation with Check Constraints
- Migrating from Document Stores to JSONB
- Common Pitfalls and Performance Issues
- Checklist
- Conclusion
JSONB vs JSON: Storage and Query Differences
JSONB is almost always the right choice. JSON stores the exact input text and must be reparsed on every access; JSONB stores a decomposed binary representation that is parsed once on write and can be indexed.
-- JSON (text-based, slow queries, no indexing)
CREATE TABLE products_json (
id BIGSERIAL PRIMARY KEY,
data JSON -- parsed on every query
);
-- JSONB (binary, indexed, fast queries)
CREATE TABLE products_jsonb (
id BIGSERIAL PRIMARY KEY,
data JSONB -- parsed once, stored efficiently
);
-- Storage comparison (illustrative: 100K products with ~2KB JSON each)
-- JSON: 250MB (input text stored verbatim, whitespace included)
-- JSONB: 180MB (binary form; whitespace and duplicate keys dropped)
-- Query speed comparison (search product.attributes.color = 'blue')
-- JSON: 4.5 seconds (full scan, re-parse every row)
-- JSONB with index: 15ms (index lookup)
-- JSONB advantages:
-- - Compact binary storage (10-30% smaller)
-- - Native indexing (GIN, BTREE)
-- - Operators: @>, #>, #>>, ?, ?&, ?|
-- - Functions: jsonb_array_elements, jsonb_each, etc.
-- Use JSONB unless you must preserve the exact input text
-- (key order, duplicate keys, whitespace) -- that is the only reason to pick JSON
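JSONB's normalization is visible from the client side: duplicate keys collapse to the last value and whitespace disappears, so the text you INSERT is not necessarily the text you SELECT back. JavaScript's `JSON.parse` happens to collapse duplicates the same way, which makes for a quick sketch of the behavior an application should expect from a JSONB column:

```javascript
// JSONB keeps only the LAST value for a duplicate key; JSON.parse does the same.
const inserted = '{"price": 100, "price": 200}';
const roundTripped = JSON.parse(inserted); // shape a JSONB column would hand back
console.log(roundTripped.price); // 200 -- the first value is gone
// The plain JSON type, by contrast, returns the original text verbatim.
```

If your pipeline relies on duplicate keys or key order surviving a round trip, that is the one case where the JSON type is the correct choice.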
GIN Index Types: jsonb_ops vs jsonb_path_ops
Two index strategies for different query patterns.
-- Sample data
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
doc JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO documents (doc) VALUES
('{
"user_id": 123,
"product_id": 456,
"attributes": {
"color": "blue",
"size": "large",
"brand": "Nike"
},
"reviews": [
{"rating": 5, "text": "Great!"},
{"rating": 4, "text": "Good"}
]
}');
-- INDEX STRATEGY 1: jsonb_ops (the default GIN operator class)
-- Supports containment (@>) plus the key-existence operators (?, ?|, ?&)
CREATE INDEX idx_doc_ops ON documents USING gin(doc jsonb_ops);
-- Queries that benefit from jsonb_ops:
SELECT * FROM documents WHERE doc @> '{"attributes":{"color":"blue"}}';
-- Uses index: efficient
SELECT * FROM documents WHERE doc ? 'user_id';
-- Uses index: fast key existence check
-- INDEX STRATEGY 2: jsonb_path_ops
-- Supports ONLY containment (@>) and jsonpath matching (@?, @@), but it
-- hashes whole paths, so it is smaller and usually faster for containment
CREATE INDEX idx_doc_path ON documents USING gin(doc jsonb_path_ops);
-- Queries that benefit from jsonb_path_ops:
SELECT * FROM documents WHERE doc @> '{"attributes":{"color":"blue"}}';
-- Uses index: nested containment is where jsonb_path_ops shines
-- Note: path extraction (#>, #>>) cannot use a GIN index with EITHER
-- operator class; create a BTREE expression index for those queries:
CREATE INDEX idx_doc_color ON documents ((doc #>> '{attributes,color}'));
SELECT * FROM documents WHERE doc #>> '{attributes,color}' = 'blue';
-- Size and latency below are illustrative for this workload
-- jsonb_ops index size: 50MB (indexes every key and value separately)
-- jsonb_path_ops index size: 30MB (one hash per path)
-- Query latency (@> containment on a nested key):
-- jsonb_ops: 50ms (must combine per-key index entries)
-- jsonb_path_ops: 5ms (single path-hash lookup)
-- Decision matrix:
-- Use jsonb_ops if: you need key-existence checks (?, ?|, ?&) as well as @>
-- Use jsonb_path_ops if: queries are containment-only (smaller, faster)
-- Use both if: mixed workload (more index overhead though)
-- Partial index (index only active JSONB docs)
CREATE INDEX idx_doc_partial ON documents USING gin(doc jsonb_ops)
WHERE doc IS NOT NULL;
-- Multi-key index (scalar column + JSONB column in one GIN index)
-- Requires the btree_gin extension; GIN does not support ASC/DESC ordering
CREATE EXTENSION IF NOT EXISTS btree_gin;
CREATE INDEX idx_doc_user ON documents USING gin(created_at, doc jsonb_ops)
WHERE (doc->>'user_id')::BIGINT > 0;
Common Query Patterns
Real-world JSONB queries.
-- PATTERN 1: Containment check (@>)
-- Does the JSON contain this key-value pair?
SELECT id FROM documents
WHERE doc @> '{"status":"active"}';
-- Roughly equivalent (not identical: ->> extracts the value as text first):
SELECT id FROM documents
WHERE doc->>'status' = 'active';
-- But containment is FASTER because it can use the GIN index
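To build intuition for what `@>` actually matches, here is a small JavaScript mirror of JSONB containment semantics: for objects, every key/value in the pattern must be contained in the document; for arrays, every pattern element must be contained in some document element; scalars compare by equality. This illustrates the operator's semantics, not PostgreSQL's implementation:

```javascript
// Approximate JSONB containment (@>) semantics in plain JavaScript.
function jsonbContains(doc, pattern) {
  if (Array.isArray(pattern)) {
    // Every element of the pattern array must be contained in SOME element of doc
    return Array.isArray(doc) &&
      pattern.every(p => doc.some(d => jsonbContains(d, p)));
  }
  if (pattern !== null && typeof pattern === 'object') {
    // Every key of the pattern must exist in doc with a contained value
    return doc !== null && typeof doc === 'object' && !Array.isArray(doc) &&
      Object.keys(pattern).every(k => k in doc && jsonbContains(doc[k], pattern[k]));
  }
  return doc === pattern; // scalars: strict equality
}

const doc = { status: 'active', attributes: { color: 'blue', size: 'large' } };
console.log(jsonbContains(doc, { status: 'active' }));              // true
console.log(jsonbContains(doc, { attributes: { color: 'blue' } })); // true
console.log(jsonbContains(doc, { attributes: { color: 'red' } }));  // false
```

The array rule explains Pattern 3 below: `{"reviews":[{"rating":5}]}` matches because one element of the pattern array is contained in one element of the document array.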
-- PATTERN 2: Path navigation (#>>)
-- Get value at nested path (returns text)
SELECT
id,
doc #>> '{attributes,color}' as color,
doc #>> '{user_id}' as user_id
FROM documents;
-- PATTERN 3: Array element access
-- Find documents with array containing specific value
SELECT id FROM documents
WHERE doc @> '{"reviews":[{"rating":5}]}';
-- Or less restrictively:
SELECT id FROM documents
WHERE doc->'reviews' @> '[{"rating":5}]';
-- PATTERN 4: Array element functions
-- Unnest array elements into rows
SELECT
d.id,
jsonb_array_elements(d.doc->'reviews') as review
FROM documents d;
-- Output:
-- id | review
-- 1 | {"rating": 5, "text": "Great!"}
-- 1 | {"rating": 4, "text": "Good"}
-- PATTERN 5: Iterate over object keys
-- Get all key-value pairs from JSONB object
SELECT
d.id,
entry.key,
entry.value
FROM documents d,
jsonb_each(d.doc->'attributes') as entry;
-- PATTERN 6: Key existence check
-- Does JSON have a specific key?
SELECT id FROM documents
WHERE doc ? 'user_id'; -- Top-level key
SELECT id FROM documents
WHERE doc->'metadata' ? 'color'; -- Nested key
-- PATTERN 7: Multiple key existence (OR)
-- Does JSON have any of these keys?
SELECT id FROM documents
WHERE doc ?| ARRAY['user_id', 'account_id', 'customer_id'];
-- PATTERN 8: All keys exist (AND)
-- Does JSON have all of these keys?
SELECT id FROM documents
WHERE doc ?& ARRAY['status', 'timestamp', 'user_id'];
-- PATTERN 9: Complex filtering with functions
-- Filter on computed JSONB values
SELECT
id,
doc
FROM documents
WHERE (doc->>'price')::NUMERIC > 100
AND (doc->>'stock')::INT > 0
AND doc #>> '{attributes,color}' IS NOT NULL;
-- Performance tip: Index computed columns
CREATE INDEX idx_price_computed ON documents(((doc->>'price')::NUMERIC));
-- PATTERN 10: Full-text search integration
-- Search JSONB content
SELECT id FROM documents
WHERE to_tsvector('english', doc->>'description') @@ plainto_tsquery('english', 'blue');
-- Create a GIN index for full-text search (preferred over GiST for read-heavy data)
CREATE INDEX idx_doc_fts ON documents
USING gin(to_tsvector('english', doc->>'description'));
Partial Indexes on JSONB Fields
Index only relevant data to reduce index size and improve performance.
-- Partial index: Only active documents
CREATE INDEX idx_active_docs ON documents USING gin(doc jsonb_ops)
WHERE doc->>'status' = 'active';
-- Partial index: Only documents with specific field
CREATE INDEX idx_with_price ON documents USING gin(doc jsonb_ops)
WHERE doc ? 'price';
-- Partial index: Numeric range
CREATE INDEX idx_expensive ON documents USING gin(doc jsonb_ops)
WHERE (doc->>'price')::NUMERIC > 1000;
-- Composite partial index
CREATE INDEX idx_active_user_docs ON documents USING gin(doc jsonb_ops)
WHERE doc->>'status' = 'active'
AND (doc->>'user_id')::BIGINT > 0;
-- Benefits of partial indexes:
-- - Smaller index (20-50% of full index)
-- - Faster index updates (fewer rows to maintain)
-- - Lower memory overhead
-- - More selective, faster queries
-- Query planner must recognize WHERE clause matches index predicate
EXPLAIN SELECT * FROM documents
WHERE doc->>'status' = 'active' AND doc @> '{"verified":true}';
-- Uses partial index if WHERE status='active' is present
-- Maintenance: Monitor partial index growth
SELECT
schemaname,
relname,
indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%_docs'
ORDER BY pg_relation_size(indexrelid) DESC;
Schema Validation with Check Constraints
Enforce structure without rigid schema.
-- CONSTRAINT 1: Require specific keys
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
profile JSONB,
CONSTRAINT profile_required_keys CHECK (
profile ? 'first_name' AND
profile ? 'last_name' AND
profile ? 'email'
)
);
-- CONSTRAINT 2: Validate field types
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
data JSONB,
CONSTRAINT price_is_number CHECK (
jsonb_typeof(data->'price') = 'number'
),
CONSTRAINT quantity_is_number CHECK (
jsonb_typeof(data->'quantity') = 'number'
)
);
-- Valid type names: 'null', 'true', 'false', 'number', 'string', 'array', 'object'
-- CONSTRAINT 3: Validate string format
CREATE TABLE accounts (
id BIGSERIAL PRIMARY KEY,
config JSONB,
CONSTRAINT email_format CHECK (
NOT (config ? 'email') OR
config->>'email' ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
)
);
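The same pattern can be mirrored in application code so bad rows are rejected before they ever hit the constraint. The regex below is copied from the check constraint; treat it as a sanity filter, not full RFC 5322 validation:

```javascript
// Client-side mirror of the email_format check constraint
const EMAIL_RE = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

function validateConfig(config) {
  // Match the constraint: either no email key at all, or a well-formed email
  if ('email' in config && !EMAIL_RE.test(config.email)) {
    throw new Error(`invalid email: ${config.email}`);
  }
  return config; // safe to INSERT
}

console.log(EMAIL_RE.test('user@example.com')); // true
console.log(EMAIL_RE.test('not-an-email'));     // false
```

Validating on both sides gives friendlier application errors while the constraint remains the source of truth against writes from other clients.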
-- CONSTRAINT 4: Enforce enum-like values
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
data JSONB,
CONSTRAINT status_enum CHECK (
data->>'status' IN ('pending', 'shipped', 'delivered', 'cancelled')
)
);
-- CONSTRAINT 5: Numeric ranges
CREATE TABLE listings (
id BIGSERIAL PRIMARY KEY,
data JSONB,
CONSTRAINT price_positive CHECK (
(data->>'price')::NUMERIC > 0
),
CONSTRAINT rating_in_range CHECK (
(data->>'rating')::NUMERIC >= 0 AND
(data->>'rating')::NUMERIC <= 5
)
);
-- CONSTRAINT 6: Complex validation (array length, object size)
CREATE TABLE surveys (
id BIGSERIAL PRIMARY KEY,
responses JSONB,
CONSTRAINT min_responses CHECK (
jsonb_array_length(responses) >= 3
)
);
-- Add NOT VALID constraint (for large tables, validate later)
ALTER TABLE products ADD CONSTRAINT price_valid CHECK (
jsonb_typeof(data->'price') = 'number'
) NOT VALID;
-- Validate in background
ALTER TABLE products VALIDATE CONSTRAINT price_valid;
-- View check constraints (information_schema.check_constraints lacks the
-- definition and table columns, so query pg_constraint directly)
SELECT conname, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'products'::regclass AND contype = 'c';
Migrating from Document Stores to JSONB
Real-world migration patterns.
// Node.js: Migrate from MongoDB to Postgres JSONB
import pg from 'pg';
import { MongoClient } from 'mongodb';
interface MongoDoc {
_id: string;
name: string;
attributes: Record<string, any>;
created_at: Date;
}
async function migrateFromMongo() {
// Step 1: Extract from MongoDB
const mongoClient = new MongoClient('mongodb://localhost:27017');
await mongoClient.connect();
const mongoDb = mongoClient.db('products');
const mongoCollection = mongoDb.collection('products');
const mongoDocs = await mongoCollection.find({}).toArray();
// Step 2: Create Postgres table
const pgPool = new pg.Pool();
await pgPool.query(`
CREATE TABLE IF NOT EXISTS products (
id TEXT PRIMARY KEY, -- Mongo ObjectIds are 24-char hex strings, not UUIDs
name VARCHAR(255) NOT NULL,
doc JSONB NOT NULL,
created_at TIMESTAMP,
CONSTRAINT name_not_empty CHECK (name != ''),
CONSTRAINT doc_has_id CHECK (doc ? '_id')
);
CREATE INDEX idx_products_doc ON products USING gin(doc jsonb_ops);
CREATE INDEX idx_products_name ON products(name);
`);
// Step 3: Batch insert (one multi-row INSERT per batch)
const batchSize = 1000;
let insertedCount = 0;
for (let i = 0; i < mongoDocs.length; i += batchSize) {
const batch = mongoDocs.slice(i, i + batchSize);
// Build "($1, $2, $3, $4), ($5, $6, $7, $8), ..." placeholders
const placeholders = batch
.map((_, j) => `($${j * 4 + 1}, $${j * 4 + 2}, $${j * 4 + 3}, $${j * 4 + 4})`)
.join(', ');
const params = batch.flatMap(doc => [
String(doc._id),
doc.name,
JSON.stringify(doc), // Store entire doc in JSONB
doc.created_at,
]);
await pgPool.query(
`INSERT INTO products (id, name, doc, created_at) VALUES ${placeholders}`,
params
);
insertedCount += batch.length;
console.log(`Inserted ${insertedCount}/${mongoDocs.length}`);
}
// Step 4: Verify counts match
const pgCount = await pgPool.query('SELECT COUNT(*) FROM products');
const mongoCount = mongoDocs.length;
console.log(`MongoDB: ${mongoCount}, Postgres: ${pgCount.rows[0].count}`);
if (Number(pgCount.rows[0].count) !== mongoCount) { // COUNT(*) arrives as a string
throw new Error('Row count mismatch after migration');
}
await pgPool.end();
await mongoClient.close();
}
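The placeholder arithmetic in the batching step is easy to get wrong, so it is worth isolating into a helper. A sketch (the table and column names are just the ones used above):

```javascript
// Build a multi-row INSERT: "($1, $2), ($3, $4), ..." with a flat params array.
// rows is an array of equal-length arrays of column values.
function buildMultiInsert(table, columns, rows) {
  const width = columns.length;
  const placeholders = rows
    .map((_, r) => `(${columns.map((_, c) => `$${r * width + c + 1}`).join(', ')})`)
    .join(', ');
  return {
    text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${placeholders}`,
    values: rows.flat(),
  };
}

const q = buildMultiInsert('products', ['id', 'name'], [['a1', 'x'], ['a2', 'y']]);
console.log(q.text);   // INSERT INTO products (id, name) VALUES ($1, $2), ($3, $4)
console.log(q.values); // [ 'a1', 'x', 'a2', 'y' ]
```

Only interpolate trusted identifiers into `table` and `columns`; the row values always travel as parameters.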
// Step 5: Gradually migrate writes (dual-write pattern)
// (sketch: assumes pgPool and mongoCollection live at module scope during the transition)
async function insertProduct(doc: any) {
// Write to both databases during transition
try {
// Write to Postgres (new)
await pgPool.query(
`INSERT INTO products (id, name, doc, created_at) VALUES ($1, $2, $3, $4)`,
[doc._id, doc.name, JSON.stringify(doc), new Date()]
);
// Write to MongoDB (old, for safety)
await mongoCollection.insertOne(doc);
} catch (err) {
console.error('Dual-write failed:', err);
throw err;
}
}
// Step 6: Migrate reads (use Postgres, fallback to Mongo)
async function getProduct(id: string) {
try {
const result = await pgPool.query(
'SELECT doc FROM products WHERE id = $1',
[id]
);
if (result.rows.length > 0) {
return result.rows[0].doc;
}
} catch (err) {
console.warn('Postgres query failed, falling back to MongoDB');
}
// Fallback to MongoDB (wrap with new ObjectId(id) if _id is not stored as a string)
return mongoCollection.findOne({ _id: id });
}
// Step 7: Verify equivalence before cutover
async function verifyDataParity() {
const sample = 100;
const mongoSample = await mongoCollection.find({}).limit(sample).toArray();
for (const mongoDoc of mongoSample) {
const pgResult = await pgPool.query(
'SELECT doc FROM products WHERE id = $1',
[mongoDoc._id]
);
const pgDoc = pgResult.rows[0]?.doc;
// NOTE: JSONB does not preserve key order, so naive JSON.stringify comparison
// can report false mismatches; prefer an order-insensitive deep-equal
if (JSON.stringify(mongoDoc) !== JSON.stringify(pgDoc)) {
console.error(`Mismatch for ${mongoDoc._id}`);
return false;
}
}
console.log('Data parity verified');
return true;
}
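One gotcha for parity checks like the one above: JSONB reorders object keys internally, so two logically identical documents can stringify differently. An order-insensitive comparison is safer; a minimal sketch:

```javascript
// Deep equality that ignores object key order (JSONB reorders keys internally)
function jsonDeepEqual(a, b) {
  if (a === b) return true;
  if (Array.isArray(a) && Array.isArray(b)) {
    // Array order IS significant in JSON, so compare positionally
    return a.length === b.length && a.every((v, i) => jsonDeepEqual(v, b[i]));
  }
  if (a && b && typeof a === 'object' && typeof b === 'object') {
    const ka = Object.keys(a);
    // Same key count, and every key of a matches the corresponding value in b
    return ka.length === Object.keys(b).length &&
      ka.every(k => jsonDeepEqual(a[k], b[k]));
  }
  return false;
}

console.log(jsonDeepEqual({ a: 1, b: [2, 3] }, { b: [2, 3], a: 1 })); // true
console.log(jsonDeepEqual({ a: 1 }, { a: '1' }));                     // false
```

Note the second example: `1` and `"1"` are different JSON values, which is exactly the type-consistency pitfall covered later.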
Common Pitfalls and Performance Issues
Avoid these mistakes.
-- PITFALL 1: Using ->> (text) in WHERE clause without casting
-- SLOW and WRONG: compares lexicographically ('9' > '100' is true) and
-- cannot use a numeric index
SELECT * FROM documents WHERE doc->>'price' > '100';
-- FAST: Cast to proper type for index usage
SELECT * FROM documents WHERE (doc->>'price')::NUMERIC > 100;
-- Create index: CREATE INDEX ON documents(((doc->>'price')::NUMERIC));
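The text-comparison trap is easy to reproduce outside SQL, since string comparison is lexicographic in most languages. A quick demonstration of why the cast matters:

```javascript
// Lexicographic vs numeric comparison -- same trap as doc->>'price' > '100'
console.log('9' > '100'); // true: '9' beats '1' character-by-character
console.log(9 > 100);     // false: numeric, like (doc->>'price')::NUMERIC
console.log(Number('9') > Number('100')); // false: cast first, then compare
```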
-- PITFALL 2: Deeply nested JSONB bloat
-- BAD: Storing deeply nested structures
{
"user": {
"profile": {
"settings": {
"notifications": {
"email": true
}
}
}
}
}
-- GOOD: Flatten to reasonable depth
{
"user_id": 123,
"email_notifications": true,
"sms_notifications": false
}
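Flattening can be automated when migrating existing documents. A sketch that collapses nested objects into underscore-joined keys, leaving arrays intact (the naming scheme is just one reasonable convention):

```javascript
// Flatten nested objects: {user: {settings: {theme: 'dark'}}}
//   -> {user_settings_theme: 'dark'}
function flatten(obj, prefix = '') {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (value && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, flatten(value, name)); // recurse into nested objects
    } else {
      out[name] = value; // scalars, arrays, and null are kept as-is
    }
  }
  return out;
}

console.log(flatten({ user: { settings: { notifications: { email: true } } } }));
// { user_settings_notifications_email: true }
```

Flat keys also index better: a single top-level containment check replaces a deep path navigation.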
-- PITFALL 3: Over-indexing JSONB
-- BAD: Multiple GIN indexes on same column
CREATE INDEX idx1 ON documents USING gin(doc jsonb_ops);
CREATE INDEX idx2 ON documents USING gin(doc jsonb_path_ops);
-- Index size: 80MB, slower writes
-- GOOD: Single GIN index with strategic partials
CREATE INDEX idx_active ON documents USING gin(doc jsonb_ops)
WHERE doc->>'status' = 'active';
-- Index size: 20MB, faster writes
-- PITFALL 4: Type inconsistency (JSON sometimes number, sometimes string)
-- BAD: Inconsistent types
INSERT INTO products (data) VALUES ('{"price": 100}');
INSERT INTO products (data) VALUES ('{"price": "100"}');
-- Queries become fragile:
SELECT * FROM products WHERE data @> '{"price": 100}';
-- Matches only the first insert: the string "100" is a different JSONB value,
-- and (data->>'price')::NUMERIC errors if any row holds a non-numeric string
-- GOOD: Enforce type consistency with constraints
ALTER TABLE products ADD CONSTRAINT price_numeric CHECK (
jsonb_typeof(data->'price') = 'number'
);
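Constraints catch inconsistent types at write time; normalizing in the application prevents the rejects in the first place. A sketch that coerces numeric-looking strings before insert (the field list is illustrative):

```javascript
// Coerce fields that must be numbers so JSONB stores one consistent type
const NUMERIC_FIELDS = ['price', 'quantity', 'rating']; // illustrative field list

function normalizeDoc(doc) {
  const out = { ...doc };
  for (const field of NUMERIC_FIELDS) {
    const v = out[field];
    // Only coerce strings that parse cleanly; leave "N/A" etc. for the
    // constraint to reject loudly rather than silently storing NaN
    if (typeof v === 'string' && v.trim() !== '' && !Number.isNaN(Number(v))) {
      out[field] = Number(v); // "100" -> 100
    }
  }
  return out;
}

console.log(normalizeDoc({ price: '100', name: 'widget' }));
// { price: 100, name: 'widget' }
```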
-- PITFALL 5: Storing very large JSONB documents
-- BAD: Storing 100MB JSON in single row (TOAST overhead)
-- GOOD: Normalize to relational structure or split documents
-- PITFALL 6: Querying unindexed JSONB paths
-- SLOW: No index on nested path
SELECT * FROM documents WHERE doc->'metadata'->'tags' @> '["featured"]';
-- FAST: Create specific index on common path
CREATE INDEX idx_featured_tags ON documents
USING gin((doc->'metadata'->'tags') jsonb_ops);
Checklist
- JSONB type chosen (plain JSON only when exact input text must be preserved)
- Appropriate GIN index created (jsonb_ops or jsonb_path_ops)
- Partial indexes used to reduce index size
- Check constraints enforce required keys and types
- Expression indexes created for frequently accessed fields
- Query patterns tested with EXPLAIN
- Type casting applied in WHERE clauses (numeric, integer)
- Full-text search indexes created for text fields
- Migration from NoSQL documented and tested
- JSONB document structure normalized (reasonable depth)
- Application code uses parameterized queries (prevent injection)
- Index bloat monitored and REINDEX scheduled
Conclusion
PostgreSQL JSONB provides document flexibility without sacrificing performance or ACID guarantees. Choose GIN indexes strategically (jsonb_ops when you need key-existence checks, jsonb_path_ops for containment-only workloads), enforce structure with check constraints, and normalize document nesting to avoid bloat. Create expression indexes for fast filtering on hot fields, use partial indexes to minimize overhead, and migrate from MongoDB or other document stores using dual-write patterns. With proper indexing and constraint design, JSONB queries perform nearly as fast as relational queries while maintaining the schema flexibility of document databases.