Accidental Full Table Scan — The Query That Brought Down Production

Author: Sanjeev Sharma (@webcoderspeed1)
Introduction
A full table scan on a 50-million row table doesn't fail — it succeeds, eventually, while consuming all available I/O, blocking other queries, and making your app look like it's down. The worst part: it was fast in staging. The data volume difference between environments hides these bugs completely.
- The Symptoms
- Cause 1: Function Wrapping an Indexed Column
- Cause 2: Implicit Type Cast
- Cause 3: Missing WHERE Clause
- Cause 4: OR Clause Breaking Index
- Diagnosing with EXPLAIN ANALYZE
- Catch Slow Queries in Production
- Full Table Scan Checklist
- Conclusion
The Symptoms
- Database CPU: 100%
- Active queries: 1 query running for 4 minutes
- Other queries: waiting
- App: timing out on all requests
- Cause: SELECT * FROM orders WHERE DATE(created_at) = '2026-03-14'
This query runs fine with 10,000 rows. With 50 million rows it reads every single row because DATE(created_at) wraps the indexed column in a function — making the index on created_at unusable.
Cause 1: Function Wrapping an Indexed Column
-- ❌ Wrapping indexed column in a function — index not used
SELECT * FROM orders WHERE DATE(created_at) = '2026-03-14'
-- PostgreSQL cannot use the index on created_at because DATE() is applied first
-- ❌ Same problem
SELECT * FROM users WHERE LOWER(email) = 'user@example.com'
-- Index on email not used because LOWER() transforms the column
-- ✅ Rewrite to let the index work
SELECT * FROM orders
WHERE created_at >= '2026-03-14 00:00:00'
AND created_at < '2026-03-15 00:00:00'
-- ✅ Or use a functional index that matches the query
CREATE INDEX ON users (LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com'
-- Now the functional index is used
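The safe range rewrite can also be generated in application code. As a sketch, `dayRange` below is a hypothetical helper (not part of any library) that turns a calendar day into a half-open `[start, end)` range for a sargable query:

```typescript
// Hypothetical helper: convert a calendar day (UTC) into a half-open
// [start, end) range so a plain b-tree index on created_at stays usable.
function dayRange(day: string): { start: string; end: string } {
  const start = new Date(`${day}T00:00:00.000Z`)
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000) // next midnight
  return { start: start.toISOString(), end: end.toISOString() }
}

// Assumed usage with a pg-style client:
// const { start, end } = dayRange('2026-03-14')
// await db.query(
//   'SELECT * FROM orders WHERE created_at >= $1 AND created_at < $2',
//   [start, end]
// )
```

The half-open range also avoids off-by-one errors at day boundaries that `BETWEEN` invites with timestamp columns.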
Cause 2: Implicit Type Cast
-- Table: users.phone is VARCHAR
-- ❌ Passing an integer: MySQL implicitly casts the column and skips the
-- index; PostgreSQL typically rejects this comparison outright, but casts
-- introduced by drivers or ORMs can cause the same silent full scan
SELECT * FROM users WHERE phone = 1234567890
-- ✅ Match the column type exactly
SELECT * FROM users WHERE phone = '1234567890'
// In Node.js — the same trap
// ❌ userId is a number, but the column is VARCHAR
const result = await db.query('SELECT * FROM sessions WHERE user_id = $1', [req.user.id])
// If the bound parameter's type doesn't match the VARCHAR column, the
// planner may add a cast that prevents the index from being used
// ✅ Ensure types match
const result = await db.query('SELECT * FROM sessions WHERE user_id = $1', [String(req.user.id)])
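One way to make the type match explicit is a small normalization step at the query boundary. `asTextParam` below is a hypothetical helper, not part of node-postgres:

```typescript
// Hypothetical helper: normalize an id to the column's text type before
// binding, so the comparison stays text = text and the b-tree index on
// the VARCHAR column remains usable. Also rejects obviously bad input.
function asTextParam(value: string | number): string {
  const text = String(value)
  if (!/^\d+$/.test(text)) {
    throw new Error(`expected a numeric id, got: ${text}`)
  }
  return text
}

// Assumed usage with a pg-style client:
// await db.query('SELECT * FROM sessions WHERE user_id = $1',
//   [asTextParam(req.user.id)])
```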
Cause 3: Missing WHERE Clause
// ❌ Bug: filter variable undefined → WHERE user_id = NULL
async function getOrders(userId?: string) {
  return db.query('SELECT * FROM orders WHERE user_id = $1', [userId])
  // If userId is undefined, the driver binds NULL and PostgreSQL sees:
  // WHERE user_id = NULL, which matches nothing. The bug hides as a
  // silent empty result, and without a usable index it still costs a scan
}

// ✅ Guard before querying
async function getOrders(userId: string) {
  if (!userId) throw new Error('userId is required')
  return db.query('SELECT * FROM orders WHERE user_id = $1', [userId])
}
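The guard can be factored into a reusable utility so every query path gets the same check. `requireParam` below is a hypothetical sketch, not a library function:

```typescript
// Hypothetical utility: refuse to run a query when a required filter
// parameter is missing, instead of silently binding NULL.
function requireParam(value: string | null | undefined, name: string): string {
  if (value === null || value === undefined || value === '') {
    throw new Error(`${name} is required`)
  }
  return value
}

// Assumed usage:
// async function getOrders(userId?: string) {
//   return db.query('SELECT * FROM orders WHERE user_id = $1',
//     [requireParam(userId, 'userId')])
// }
```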
Cause 4: OR Clause Breaking Index
-- ❌ OR with a non-indexed column forces full scan
SELECT * FROM orders
WHERE user_id = '123'
OR notes LIKE '%refund%' -- notes has no index, and a leading-wildcard LIKE can't use a b-tree index anyway → full scan
-- ✅ Use UNION instead of OR (if both sides can use indexes separately)
SELECT * FROM orders WHERE user_id = '123'
UNION
SELECT * FROM orders WHERE notes LIKE '%refund%'
-- Or add a GIN index for full-text search on notes:
CREATE INDEX ON orders USING GIN (to_tsvector('english', notes));
SELECT * FROM orders WHERE to_tsvector('english', notes) @@ to_tsquery('refund')
Diagnosing with EXPLAIN ANALYZE
Always run EXPLAIN ANALYZE before deploying queries on large tables. Keep in mind that EXPLAIN ANALYZE actually executes the query, so run it on a replica or inside a transaction you roll back:
EXPLAIN ANALYZE
SELECT * FROM orders WHERE DATE(created_at) = '2026-03-14';
-- Output:
-- Seq Scan on orders (cost=0.00..289432.00 rows=50000 width=156)
-- (actual time=0.043..42134.521 rows=1247 loops=1)
-- Filter: (date(created_at) = '2026-03-14'::date)
-- Rows Removed by Filter: 49998753
-- Planning Time: 0.4 ms
-- Execution Time: 42187.3 ms ← 42 seconds, full scan
-- ✅ After fix:
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE created_at >= '2026-03-14' AND created_at < '2026-03-15';
-- Output:
-- Index Scan using orders_created_at_idx on orders
-- (actual time=0.043..1.521 rows=1247 loops=1)
-- Execution Time: 2.1 ms ← 20,000x faster
Look for: Seq Scan (bad) vs Index Scan / Index Only Scan (good). Rows Removed by Filter tells you how many rows were read and discarded.
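This check can even be automated in CI. `planHasSeqScan` below is a hypothetical sketch, not a pg API; it assumes you fetch the plan lines yourself with plain `EXPLAIN` (which, unlike `EXPLAIN ANALYZE`, does not execute the query):

```typescript
// Hypothetical CI guard: inspect EXPLAIN output for a sequential scan node
// and fail the check before the query ever reaches a large production table.
function planHasSeqScan(planLines: string[]): boolean {
  return planLines.some((line) => line.includes('Seq Scan'))
}

// Assumed usage with node-postgres:
// const { rows } = await client.query("EXPLAIN SELECT * FROM orders WHERE user_id = '123'")
// const lines = rows.map((r) => r['QUERY PLAN'])
// if (planHasSeqScan(lines)) throw new Error('query would full-scan orders')
```

A crude string match like this can flag intentional small-table scans too, so in practice you would scope it to known large tables.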
Catch Slow Queries in Production
-- Enable slow query logging in PostgreSQL
-- postgresql.conf:
-- log_min_duration_statement = 1000 (log queries > 1 second)
-- Find slow queries via pg_stat_statements
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
// Node.js: log slow queries automatically
import { Pool } from 'pg'

const pool = new Pool({ connectionString: process.env.DATABASE_URL })

pool.on('connect', (client) => {
  const originalQuery = client.query.bind(client)
  client.query = (...args: any[]) => {
    const start = Date.now()
    const result = originalQuery(...args)
    if (result && typeof result.then === 'function') {
      const logIfSlow = () => {
        const duration = Date.now() - start
        if (duration > 500) {
          console.warn(`Slow query (${duration}ms):`, args[0])
        }
      }
      // Attach both handlers so failed queries are timed too and the
      // logging chain never produces an unhandled rejection
      result.then(logIfSlow, logIfSlow)
    }
    return result
  }
})
Full Table Scan Checklist
- ✅ Run EXPLAIN ANALYZE on every new query before deploying
- ✅ Never wrap indexed columns in functions (DATE(), LOWER(), etc.)
- ✅ Ensure parameter types match column types exactly
- ✅ Guard against undefined parameters that produce WHERE col = NULL
- ✅ Use pg_stat_statements to find slow queries in production
- ✅ Set log_min_duration_statement = 1000 to log slow queries
- ✅ Add indexes on all frequently filtered columns
Conclusion
Full table scans are a ticking time bomb — hidden by small data volumes in dev, exposed when production data grows. The fix for most cases is one of: rewrite the query to avoid function-wrapping the indexed column, match parameter types exactly, or add the right index. EXPLAIN ANALYZE is your best tool: run it before deploying any query that touches a large table. One minute of query analysis can save hours of production incident response.