PostgreSQL JSONB Patterns — Flexible Schema Without Sacrificing Performance

By Sanjeev Sharma (@webcoderspeed1)
Introduction
JSONB enables flexible schema in PostgreSQL—store and query nested data without migrations—while maintaining relational structure and ACID guarantees. Unlike document databases, JSONB supports complex queries, full-text search integration, and transactions. This guide covers production JSONB patterns: choosing index types, writing efficient queries, validating schema with constraints, and avoiding common performance pitfalls.
- JSONB vs JSON: Storage and Query Differences
- GIN Index Types: jsonb_ops vs jsonb_path_ops
- Common Query Patterns
- Partial Indexes on JSONB Fields
- Schema Validation with Check Constraints
- Migrating from Document Stores to JSONB
- Common Pitfalls and Performance Issues
- Checklist
- Conclusion
JSONB vs JSON: Storage and Query Differences
JSONB is almost always the right choice. JSON stores the exact input text and must be reparsed on every access; JSONB stores a decomposed binary representation that is parsed once on write and can be indexed.
-- JSON (text-based, slow queries, no indexing)
CREATE TABLE products_json (
id BIGSERIAL PRIMARY KEY,
data JSON -- parsed on every query
);
-- JSONB (binary, indexed, fast queries)
CREATE TABLE products_jsonb (
id BIGSERIAL PRIMARY KEY,
data JSONB -- parsed once, stored efficiently
);
-- Storage comparison (illustrative: 100K products with ~2KB JSON each)
-- JSON: 250MB (input text stored verbatim, whitespace included)
-- JSONB: 180MB (binary form; whitespace and duplicate keys dropped)
-- Query speed comparison (search product.attributes.color = 'blue')
-- JSON: 4.5 seconds (full scan, re-parse every row)
-- JSONB with index: 15ms (index lookup)
-- JSONB advantages:
-- - Compact binary storage (10-30% smaller)
-- - Native indexing (GIN, BTREE)
-- - Operators: @>, #>, #>>, ?, ?&, ?|
-- - Functions: jsonb_array_elements, jsonb_each, etc.
-- Use JSONB unless you must preserve the exact input text
-- (key order, duplicate keys, whitespace) -- that is the only reason to pick JSON
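JSONB's normalization is visible from the client side: duplicate keys collapse to the last value and whitespace disappears, so the text you INSERT is not necessarily the text you SELECT back. JavaScript's `JSON.parse` happens to collapse duplicates the same way, which makes for a quick sketch of the behavior an application should expect from a JSONB column:

```javascript
// JSONB keeps only the LAST value for a duplicate key; JSON.parse does the same.
const inserted = '{"price": 100, "price": 200}';
const roundTripped = JSON.parse(inserted); // shape a JSONB column would hand back
console.log(roundTripped.price); // 200 -- the first value is gone
// The plain JSON type, by contrast, returns the original text verbatim.
```

If your pipeline relies on duplicate keys or key order surviving a round trip, that is the one case where the JSON type is the correct choice.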
GIN Index Types: jsonb_ops vs jsonb_path_ops
Two index strategies for different query patterns.
-- Sample data
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
doc JSONB,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO documents (doc) VALUES
('{
"user_id": 123,
"product_id": 456,
"attributes": {
"color": "blue",
"size": "large",
"brand": "Nike"
},
"reviews": [
{"rating": 5, "text": "Great!"},
{"rating": 4, "text": "Good"}
]
}');
-- INDEX STRATEGY 1: jsonb_ops (the default GIN operator class)
-- Supports containment (@>) plus the key-existence operators (?, ?|, ?&)
CREATE INDEX idx_doc_ops ON documents USING gin(doc jsonb_ops);
-- Queries that benefit from jsonb_ops:
SELECT * FROM documents WHERE doc @> '{"attributes":{"color":"blue"}}';
-- Uses index: efficient
SELECT * FROM documents WHERE doc ? 'user_id';
-- Uses index: fast key existence check
-- INDEX STRATEGY 2: jsonb_path_ops
-- Supports ONLY containment (@>) and jsonpath matching (@?, @@), but it
-- hashes whole paths, so it is smaller and usually faster for containment
CREATE INDEX idx_doc_path ON documents USING gin(doc jsonb_path_ops);
-- Queries that benefit from jsonb_path_ops:
SELECT * FROM documents WHERE doc @> '{"attributes":{"color":"blue"}}';
-- Uses index: nested containment is where jsonb_path_ops shines
-- Note: path extraction (#>, #>>) cannot use a GIN index with EITHER
-- operator class; create a BTREE expression index for those queries:
CREATE INDEX idx_doc_color ON documents ((doc #>> '{attributes,color}'));
SELECT * FROM documents WHERE doc #>> '{attributes,color}' = 'blue';
-- Size and latency below are illustrative for this workload
-- jsonb_ops index size: 50MB (indexes every key and value separately)
-- jsonb_path_ops index size: 30MB (one hash per path)
-- Query latency (@> containment on a nested key):
-- jsonb_ops: 50ms (must combine per-key index entries)
-- jsonb_path_ops: 5ms (single path-hash lookup)
-- Decision matrix:
-- Use jsonb_ops if: you need key-existence checks (?, ?|, ?&) as well as @>
-- Use jsonb_path_ops if: queries are containment-only (smaller, faster)
-- Use both if: mixed workload (more index overhead though)
-- Partial index (index only active JSONB docs)
CREATE INDEX idx_doc_partial ON documents USING gin(doc jsonb_ops)
WHERE doc IS NOT NULL;
-- Multi-key index (scalar column + JSONB column in one GIN index)
-- Requires the btree_gin extension; GIN does not support ASC/DESC ordering
CREATE EXTENSION IF NOT EXISTS btree_gin;
CREATE INDEX idx_doc_user ON documents USING gin(created_at, doc jsonb_ops)
WHERE (doc->>'user_id')::BIGINT > 0;
Common Query Patterns
Real-world JSONB queries.
-- PATTERN 1: Containment check (@>)
-- Does the JSON contain this key-value pair?
SELECT id FROM documents
WHERE doc @> '{"status":"active"}';
-- Roughly equivalent (not identical: ->> extracts the value as text first):
SELECT id FROM documents
WHERE doc->>'status' = 'active';
-- But containment is FASTER because it can use the GIN index
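To build intuition for what `@>` actually matches, here is a small JavaScript mirror of JSONB containment semantics: for objects, every key/value in the pattern must be contained in the document; for arrays, every pattern element must be contained in some document element; scalars compare by equality. This illustrates the operator's semantics, not PostgreSQL's implementation:

```javascript
// Approximate JSONB containment (@>) semantics in plain JavaScript.
function jsonbContains(doc, pattern) {
  if (Array.isArray(pattern)) {
    // Every element of the pattern array must be contained in SOME element of doc
    return Array.isArray(doc) &&
      pattern.every(p => doc.some(d => jsonbContains(d, p)));
  }
  if (pattern !== null && typeof pattern === 'object') {
    // Every key of the pattern must exist in doc with a contained value
    return doc !== null && typeof doc === 'object' && !Array.isArray(doc) &&
      Object.keys(pattern).every(k => k in doc && jsonbContains(doc[k], pattern[k]));
  }
  return doc === pattern; // scalars: strict equality
}

const doc = { status: 'active', attributes: { color: 'blue', size: 'large' } };
console.log(jsonbContains(doc, { status: 'active' }));              // true
console.log(jsonbContains(doc, { attributes: { color: 'blue' } })); // true
console.log(jsonbContains(doc, { attributes: { color: 'red' } }));  // false
```

The array rule explains Pattern 3 below: `{"reviews":[{"rating":5}]}` matches because one element of the pattern array is contained in one element of the document array.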
-- PATTERN 2: Path navigation (#>>)
-- Get value at nested path (returns text)
SELECT
id,
doc #>> '{attributes,color}' as color,
doc #>> '{user_id}' as user_id
FROM documents;
-- PATTERN 3: Array element access
-- Find documents with array containing specific value
SELECT id FROM documents
WHERE doc @> '{"reviews":[{"rating":5}]}';
-- Or less restrictively:
SELECT id FROM documents
WHERE doc->'reviews' @> '[{"rating":5}]';
-- PATTERN 4: Array element functions
-- Unnest array elements into rows
SELECT
d.id,
jsonb_array_elements(d.doc->'reviews') as review
FROM documents d;
-- Output:
-- id | review
-- 1 | {"rating": 5, "text": "Great!"}
-- 1 | {"rating": 4, "text": "Good"}
-- PATTERN 5: Iterate over object keys
-- Get all key-value pairs from JSONB object
SELECT
d.id,
entry.key,
entry.value
FROM documents d,
jsonb_each(d.doc->'attributes') as entry;
-- PATTERN 6: Key existence check
-- Does JSON have a specific key?
SELECT id FROM documents
WHERE doc ? 'user_id'; -- Top-level key
SELECT id FROM documents
WHERE doc->'metadata' ? 'color'; -- Nested key
-- PATTERN 7: Multiple key existence (OR)
-- Does JSON have any of these keys?
SELECT id FROM documents
WHERE doc ?| ARRAY['user_id', 'account_id', 'customer_id'];
-- PATTERN 8: All keys exist (AND)
-- Does JSON have all of these keys?
SELECT id FROM documents
WHERE doc ?& ARRAY['status', 'timestamp', 'user_id'];
-- PATTERN 9: Complex filtering with functions
-- Filter on computed JSONB values
SELECT
id,
doc
FROM documents
WHERE (doc->>'price')::NUMERIC > 100
AND (doc->>'stock')::INT > 0
AND doc #>> '{attributes,color}' IS NOT NULL;
-- Performance tip: Index computed columns
CREATE INDEX idx_price_computed ON documents(((doc->>'price')::NUMERIC));
-- PATTERN 10: Full-text search integration
-- Search JSONB content
SELECT id FROM documents
WHERE to_tsvector('english', doc->>'description') @@ plainto_tsquery('english', 'blue');
-- Create a GIN index for full-text search (preferred over GiST for read-heavy data)
CREATE INDEX idx_doc_fts ON documents
USING gin(to_tsvector('english', doc->>'description'));
Partial Indexes on JSONB Fields
Index only relevant data to reduce index size and improve performance.
-- Partial index: Only active documents
CREATE INDEX idx_active_docs ON documents USING gin(doc jsonb_ops)
WHERE doc->>'status' = 'active';
-- Partial index: Only documents with specific field
CREATE INDEX idx_with_price ON documents USING gin(doc jsonb_ops)
WHERE doc ? 'price';
-- Partial index: Numeric range
CREATE INDEX idx_expensive ON documents USING gin(doc jsonb_ops)
WHERE (doc->>'price')::NUMERIC > 1000;
-- Composite partial index
CREATE INDEX idx_active_user_docs ON documents USING gin(doc jsonb_ops)
WHERE doc->>'status' = 'active'
AND (doc->>'user_id')::BIGINT > 0;
-- Benefits of partial indexes:
-- - Smaller index (20-50% of full index)
-- - Faster index updates (fewer rows to maintain)
-- - Lower memory overhead
-- - More selective, faster queries
-- Query planner must recognize WHERE clause matches index predicate
EXPLAIN SELECT * FROM documents
WHERE doc->>'status' = 'active' AND doc @> '{"verified":true}';
-- Uses partial index if WHERE status='active' is present
-- Maintenance: Monitor partial index growth
SELECT
schemaname,
relname,
indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) as size
FROM pg_stat_user_indexes
WHERE indexrelname LIKE '%_docs'
ORDER BY pg_relation_size(indexrelid) DESC;
Schema Validation with Check Constraints
Enforce structure without rigid schema.
-- CONSTRAINT 1: Require specific keys
CREATE TABLE users (
id BIGSERIAL PRIMARY KEY,
profile JSONB,
CONSTRAINT profile_required_keys CHECK (
profile ? 'first_name' AND
profile ? 'last_name' AND
profile ? 'email'
)
);
-- CONSTRAINT 2: Validate field types
CREATE TABLE products (
id BIGSERIAL PRIMARY KEY,
data JSONB,
CONSTRAINT price_is_number CHECK (
jsonb_typeof(data->'price') = 'number'
),
CONSTRAINT quantity_is_number CHECK (
jsonb_typeof(data->'quantity') = 'number'
)
);
-- Valid type names: 'null', 'true', 'false', 'number', 'string', 'array', 'object'
-- CONSTRAINT 3: Validate string format
CREATE TABLE accounts (
id BIGSERIAL PRIMARY KEY,
config JSONB,
CONSTRAINT email_format CHECK (
NOT (config ? 'email') OR
config->>'email' ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
)
);
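The same pattern can be mirrored in application code so bad rows are rejected before they ever hit the constraint. The regex below is copied from the check constraint; treat it as a sanity filter, not full RFC 5322 validation:

```javascript
// Client-side mirror of the email_format check constraint
const EMAIL_RE = /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/;

function validateConfig(config) {
  // Match the constraint: either no email key at all, or a well-formed email
  if ('email' in config && !EMAIL_RE.test(config.email)) {
    throw new Error(`invalid email: ${config.email}`);
  }
  return config; // safe to INSERT
}

console.log(EMAIL_RE.test('user@example.com')); // true
console.log(EMAIL_RE.test('not-an-email'));     // false
```

Validating on both sides gives friendlier application errors while the constraint remains the source of truth against writes from other clients.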
-- CONSTRAINT 4: Enforce enum-like values
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
data JSONB,
CONSTRAINT status_enum CHECK (
data->>'status' IN ('pending', 'shipped', 'delivered', 'cancelled')
)
);
-- CONSTRAINT 5: Numeric ranges
CREATE TABLE listings (
id BIGSERIAL PRIMARY KEY,
data JSONB,
CONSTRAINT price_positive CHECK (
(data->>'price')::NUMERIC > 0
),
CONSTRAINT rating_in_range CHECK (
(data->>'rating')::NUMERIC >= 0 AND
(data->>'rating')::NUMERIC <= 5
)
);
-- CONSTRAINT 6: Complex validation (array length, object size)
CREATE TABLE surveys (
id BIGSERIAL PRIMARY KEY,
responses JSONB,
CONSTRAINT min_responses CHECK (
jsonb_array_length(responses) >= 3
)
);
-- Add NOT VALID constraint (for large tables, validate later)
ALTER TABLE products ADD CONSTRAINT price_valid CHECK (
jsonb_typeof(data->'price') = 'number'
) NOT VALID;
-- Validate in background
ALTER TABLE products VALIDATE CONSTRAINT price_valid;
-- View check constraints (information_schema.check_constraints lacks the
-- definition and table columns, so query pg_constraint directly)
SELECT conname, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'products'::regclass AND contype = 'c';
Migrating from Document Stores to JSONB
Real-world migration patterns.
// Node.js: Migrate from MongoDB to Postgres JSONB
import pg from 'pg';
import { MongoClient } from 'mongodb';
interface MongoDoc {
_id: string;
name: string;
attributes: Record<string, any>;
created_at: Date;
}
async function migrateFromMongo() {
// Step 1: Extract from MongoDB
const mongoClient = new MongoClient('mongodb://localhost:27017');
await mongoClient.connect();
const mongoDb = mongoClient.db('products');
const mongoCollection = mongoDb.collection('products');
const mongoDocs = await mongoCollection.find({}).toArray();
// Step 2: Create Postgres table
const pgPool = new pg.Pool();
await pgPool.query(`
CREATE TABLE IF NOT EXISTS products (
id TEXT PRIMARY KEY, -- Mongo ObjectIds are 24-char hex strings, not UUIDs
name VARCHAR(255) NOT NULL,
doc JSONB NOT NULL,
created_at TIMESTAMP,
CONSTRAINT name_not_empty CHECK (name != ''),
CONSTRAINT doc_has_id CHECK (doc ? '_id')
);
CREATE INDEX idx_products_doc ON products USING gin(doc jsonb_ops);
CREATE INDEX idx_products_name ON products(name);
`);
// Step 3: Batch insert (one multi-row INSERT per batch)
const batchSize = 1000;
let insertedCount = 0;
for (let i = 0; i < mongoDocs.length; i += batchSize) {
const batch = mongoDocs.slice(i, i + batchSize);
// Build "($1, $2, $3, $4), ($5, $6, $7, $8), ..." placeholders
const placeholders = batch
.map((_, j) => `($${j * 4 + 1}, $${j * 4 + 2}, $${j * 4 + 3}, $${j * 4 + 4})`)
.join(', ');
const params = batch.flatMap(doc => [
String(doc._id),
doc.name,
JSON.stringify(doc), // Store entire doc in JSONB
doc.created_at,
]);
await pgPool.query(
`INSERT INTO products (id, name, doc, created_at) VALUES ${placeholders}`,
params
);
insertedCount += batch.length;
console.log(`Inserted ${insertedCount}/${mongoDocs.length}`);
}
// Step 4: Verify counts match
const pgCount = await pgPool.query('SELECT COUNT(*) FROM products');
const mongoCount = mongoDocs.length;
console.log(`MongoDB: ${mongoCount}, Postgres: ${pgCount.rows[0].count}`);
if (Number(pgCount.rows[0].count) !== mongoCount) { // COUNT(*) arrives as a string
throw new Error('Row count mismatch after migration');
}
await pgPool.end();
await mongoClient.close();
}
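The placeholder arithmetic in the batching step is easy to get wrong, so it is worth isolating into a helper. A sketch (the table and column names are just the ones used above):

```javascript
// Build a multi-row INSERT: "($1, $2), ($3, $4), ..." with a flat params array.
// rows is an array of equal-length arrays of column values.
function buildMultiInsert(table, columns, rows) {
  const width = columns.length;
  const placeholders = rows
    .map((_, r) => `(${columns.map((_, c) => `$${r * width + c + 1}`).join(', ')})`)
    .join(', ');
  return {
    text: `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${placeholders}`,
    values: rows.flat(),
  };
}

const q = buildMultiInsert('products', ['id', 'name'], [['a1', 'x'], ['a2', 'y']]);
console.log(q.text);   // INSERT INTO products (id, name) VALUES ($1, $2), ($3, $4)
console.log(q.values); // [ 'a1', 'x', 'a2', 'y' ]
```

Only interpolate trusted identifiers into `table` and `columns`; the row values always travel as parameters.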
// Step 5: Gradually migrate writes (dual-write pattern)
// (sketch: assumes pgPool and mongoCollection live at module scope during the transition)
async function insertProduct(doc: any) {
// Write to both databases during transition
try {
// Write to Postgres (new)
await pgPool.query(
`INSERT INTO products (id, name, doc, created_at) VALUES ($1, $2, $3, $4)`,
[doc._id, doc.name, JSON.stringify(doc), new Date()]
);
// Write to MongoDB (old, for safety)
await mongoCollection.insertOne(doc);
} catch (err) {
console.error('Dual-write failed:', err);
throw err;
}
}
// Step 6: Migrate reads (use Postgres, fallback to Mongo)
async function getProduct(id: string) {
try {
const result = await pgPool.query(
'SELECT doc FROM products WHERE id = $1',
[id]
);
if (result.rows.length > 0) {
return result.rows[0].doc;
}
} catch (err) {
console.warn('Postgres query failed, falling back to MongoDB');
}
// Fallback to MongoDB (wrap with new ObjectId(id) if _id is not stored as a string)
return mongoCollection.findOne({ _id: id });
}
// Step 7: Verify equivalence before cutover
async function verifyDataParity() {
const sample = 100;
const mongoSample = await mongoCollection.find({}).limit(sample).toArray();
for (const mongoDoc of mongoSample) {
const pgResult = await pgPool.query(
'SELECT doc FROM products WHERE id = $1',
[mongoDoc._id]
);
const pgDoc = pgResult.rows[0]?.doc;
// NOTE: JSONB does not preserve key order, so naive JSON.stringify comparison
// can report false mismatches; prefer an order-insensitive deep-equal
if (JSON.stringify(mongoDoc) !== JSON.stringify(pgDoc)) {
console.error(`Mismatch for ${mongoDoc._id}`);
return false;
}
}
console.log('Data parity verified');
return true;
}
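One gotcha for parity checks like the one above: JSONB reorders object keys internally, so two logically identical documents can stringify differently. An order-insensitive comparison is safer; a minimal sketch:

```javascript
// Deep equality that ignores object key order (JSONB reorders keys internally)
function jsonDeepEqual(a, b) {
  if (a === b) return true;
  if (Array.isArray(a) && Array.isArray(b)) {
    // Array order IS significant in JSON, so compare positionally
    return a.length === b.length && a.every((v, i) => jsonDeepEqual(v, b[i]));
  }
  if (a && b && typeof a === 'object' && typeof b === 'object') {
    const ka = Object.keys(a);
    // Same key count, and every key of a matches the corresponding value in b
    return ka.length === Object.keys(b).length &&
      ka.every(k => jsonDeepEqual(a[k], b[k]));
  }
  return false;
}

console.log(jsonDeepEqual({ a: 1, b: [2, 3] }, { b: [2, 3], a: 1 })); // true
console.log(jsonDeepEqual({ a: 1 }, { a: '1' }));                     // false
```

Note the second example: `1` and `"1"` are different JSON values, which is exactly the type-consistency pitfall covered later.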
Common Pitfalls and Performance Issues
Avoid these mistakes.
-- PITFALL 1: Using ->> (text) in WHERE clause without casting
-- SLOW and WRONG: compares lexicographically ('9' > '100' is true) and
-- cannot use a numeric index
SELECT * FROM documents WHERE doc->>'price' > '100';
-- FAST: Cast to proper type for index usage
SELECT * FROM documents WHERE (doc->>'price')::NUMERIC > 100;
-- Create index: CREATE INDEX ON documents(((doc->>'price')::NUMERIC));
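The text-comparison trap is easy to reproduce outside SQL, since string comparison is lexicographic in most languages. A quick demonstration of why the cast matters:

```javascript
// Lexicographic vs numeric comparison -- same trap as doc->>'price' > '100'
console.log('9' > '100'); // true: '9' beats '1' character-by-character
console.log(9 > 100);     // false: numeric, like (doc->>'price')::NUMERIC
console.log(Number('9') > Number('100')); // false: cast first, then compare
```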
-- PITFALL 2: Deeply nested JSONB bloat
-- BAD: Storing deeply nested structures
{
"user": {
"profile": {
"settings": {
"notifications": {
"email": true
}
}
}
}
}
-- GOOD: Flatten to reasonable depth
{
"user_id": 123,
"email_notifications": true,
"sms_notifications": false
}
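Flattening can be automated when migrating existing documents. A sketch that collapses nested objects into underscore-joined keys, leaving arrays intact (the naming scheme is just one reasonable convention):

```javascript
// Flatten nested objects: {user: {settings: {theme: 'dark'}}}
//   -> {user_settings_theme: 'dark'}
function flatten(obj, prefix = '') {
  const out = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (value && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(out, flatten(value, name)); // recurse into nested objects
    } else {
      out[name] = value; // scalars, arrays, and null are kept as-is
    }
  }
  return out;
}

console.log(flatten({ user: { settings: { notifications: { email: true } } } }));
// { user_settings_notifications_email: true }
```

Flat keys also index better: a single top-level containment check replaces a deep path navigation.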
-- PITFALL 3: Over-indexing JSONB
-- BAD: Multiple GIN indexes on same column
CREATE INDEX idx1 ON documents USING gin(doc jsonb_ops);
CREATE INDEX idx2 ON documents USING gin(doc jsonb_path_ops);
-- Index size: 80MB, slower writes
-- GOOD: Single GIN index with strategic partials
CREATE INDEX idx_active ON documents USING gin(doc jsonb_ops)
WHERE doc->>'status' = 'active';
-- Index size: 20MB, faster writes
-- PITFALL 4: Type inconsistency (JSON sometimes number, sometimes string)
-- BAD: Inconsistent types
INSERT INTO products (data) VALUES ('{"price": 100}');
INSERT INTO products (data) VALUES ('{"price": "100"}');
-- Queries become fragile:
SELECT * FROM products WHERE data @> '{"price": 100}';
-- Matches only the first insert: the string "100" is a different JSONB value,
-- and (data->>'price')::NUMERIC errors if any row holds a non-numeric string
-- GOOD: Enforce type consistency with constraints
ALTER TABLE products ADD CONSTRAINT price_numeric CHECK (
jsonb_typeof(data->'price') = 'number'
);
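Constraints catch inconsistent types at write time; normalizing in the application prevents the rejects in the first place. A sketch that coerces numeric-looking strings before insert (the field list is illustrative):

```javascript
// Coerce fields that must be numbers so JSONB stores one consistent type
const NUMERIC_FIELDS = ['price', 'quantity', 'rating']; // illustrative field list

function normalizeDoc(doc) {
  const out = { ...doc };
  for (const field of NUMERIC_FIELDS) {
    const v = out[field];
    // Only coerce strings that parse cleanly; leave "N/A" etc. for the
    // constraint to reject loudly rather than silently storing NaN
    if (typeof v === 'string' && v.trim() !== '' && !Number.isNaN(Number(v))) {
      out[field] = Number(v); // "100" -> 100
    }
  }
  return out;
}

console.log(normalizeDoc({ price: '100', name: 'widget' }));
// { price: 100, name: 'widget' }
```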
-- PITFALL 5: Storing very large JSONB documents
-- BAD: Storing 100MB JSON in single row (TOAST overhead)
-- GOOD: Normalize to relational structure or split documents
-- PITFALL 6: Querying unindexed JSONB paths
-- SLOW: No index on nested path
SELECT * FROM documents WHERE doc->'metadata'->'tags' @> '["featured"]';
-- FAST: Create specific index on common path
CREATE INDEX idx_featured_tags ON documents
USING gin((doc->'metadata'->'tags') jsonb_ops);
Checklist
- JSONB type chosen (plain JSON only when exact input text must be preserved)
- Appropriate GIN index created (jsonb_ops or jsonb_path_ops)
- Partial indexes used to reduce index size
- Check constraints enforce required keys and types
- Expression indexes created for frequently accessed fields
- Query patterns tested with EXPLAIN
- Type casting applied in WHERE clauses (numeric, integer)
- Full-text search indexes created for text fields
- Migration from NoSQL documented and tested
- JSONB document structure normalized (reasonable depth)
- Application code uses parameterized queries (prevent injection)
- Index bloat monitored and REINDEX scheduled
Conclusion
PostgreSQL JSONB provides document flexibility without sacrificing performance or ACID guarantees. Choose GIN indexes strategically (jsonb_ops when you need key-existence checks, jsonb_path_ops for containment-only workloads), enforce structure with check constraints, and normalize document nesting to avoid bloat. Create expression indexes for fast filtering on hot fields, use partial indexes to minimize overhead, and migrate from MongoDB or other document stores using dual-write patterns. With proper indexing and constraint design, JSONB queries perform nearly as fast as relational queries while maintaining the schema flexibility of document databases.