
Database Indexing Strategies: Performance Optimization for PostgreSQL and MongoDB

Master database indexing strategies for PostgreSQL and MongoDB. Learn index types, optimization techniques, common pitfalls, and best practices to boost query performance.


Introduction

Database performance is one of the most critical factors in modern application development. As your data grows, queries that once executed in milliseconds can suddenly take seconds or even minutes. The difference between a well-indexed database and a poorly indexed one can be the difference between a responsive application and one that frustrates users.

Indexing is one of the most powerful tools in a database administrator’s toolkit, yet it’s often misunderstood or underutilized. Whether you’re working with PostgreSQL’s relational model or MongoDB’s document-based approach, understanding how to create, maintain, and optimize indexes is essential for building scalable applications.

In this comprehensive guide, we’ll explore indexing strategies for both PostgreSQL and MongoDB, covering everything from basic concepts to advanced optimization techniques. You’ll learn how to identify performance bottlenecks, choose the right index types, avoid common pitfalls, and implement indexing strategies that scale with your application’s growth. By the end, you’ll have the knowledge to transform slow queries into lightning-fast operations.


Understanding Database Indexes

At its core, a database index is a data structure that improves the speed of data retrieval operations. Think of it like an index in a book—instead of scanning every page to find a topic, you can look it up in the index and jump directly to the relevant pages.

How Indexes Work

Indexes work by creating a separate data structure that stores a sorted copy of selected columns from your table (in PostgreSQL) or fields from your documents (in MongoDB). When you query the database, it can use the index to quickly locate the data without scanning the entire table or collection.

-- Without an index, PostgreSQL must scan the entire table
-- This is called a "sequential scan" or "full table scan"
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';
-- Result: Seq Scan on users (cost=0.00..1000.00 rows=1 width=64)
-- Filter: (email = 'user@example.com'::text)
-- Execution time: 45.234 ms
-- With an index on the email column
CREATE INDEX idx_users_email ON users(email);
-- The same query now uses the index
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';
-- Result: Index Scan using idx_users_email on users (cost=0.42..8.44 rows=1 width=64)
-- Index Cond: (email = 'user@example.com'::text)
-- Execution time: 0.123 ms
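The dramatic difference is algorithmic: a sequential scan touches every row, while a B-Tree lookup narrows the search logarithmically. Here's a toy Python sketch of that idea (a simplification for intuition only — PostgreSQL's actual B-Tree is a disk-oriented, paged structure, not a sorted in-memory list):

```python
from bisect import bisect_left

# Toy "users" table: rows stored in insertion order.
rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(10_000)]

def seq_scan(rows, email):
    """Sequential scan: examine every row, O(n)."""
    return [r for r in rows if r["email"] == email]

# A B-Tree-like index: (key, row position) pairs kept in sorted order.
index = sorted((r["email"], pos) for pos, r in enumerate(rows))
keys = [k for k, _ in index]

def index_scan(rows, email):
    """Index scan: binary-search the sorted keys, O(log n), then fetch rows."""
    i = bisect_left(keys, email)
    matches = []
    while i < len(keys) and keys[i] == email:
        matches.append(rows[index[i][1]])
        i += 1
    return matches

# Both functions return the same rows; the index scan just locates
# them in a handful of comparisons instead of ten thousand.
```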

The Trade-off: Storage vs Speed

⚠️ Important: Indexes come with trade-offs. While they dramatically improve read performance, they:

  • Consume storage space: Each index requires additional disk space
  • Slow down writes: Every INSERT, UPDATE, or DELETE must update the index
  • Require maintenance: Indexes need to be maintained as data changes

The key is finding the right balance—creating indexes for frequently queried columns while avoiding over-indexing that slows down write operations.
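The write-side cost can be made concrete with a toy model (plain Python, not tied to either database): every index is one more structure that every write must also update.

```python
# Toy model of the read/write trade-off: each index added speeds up one
# lookup pattern but adds maintenance work to every INSERT.

class Table:
    def __init__(self):
        self.rows = []
        self.indexes = {}       # column name -> {value: [row positions]}
        self.index_writes = 0   # count of index-maintenance operations

    def create_index(self, column):
        idx = {}
        for pos, row in enumerate(self.rows):
            idx.setdefault(row[column], []).append(pos)
        self.indexes[column] = idx

    def insert(self, row):
        pos = len(self.rows)
        self.rows.append(row)
        # Every existing index must be updated on every write.
        for idx_column, idx in self.indexes.items():
            idx.setdefault(row[idx_column], []).append(pos)
            self.index_writes += 1

t = Table()
t.create_index("email")
t.create_index("status")
t.insert({"email": "a@example.com", "status": "active"})
# Two indexes -> two extra index-maintenance writes for this one insert.
```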

When Indexes Are Most Effective

Indexes provide the most benefit when:

  • ✅ High selectivity: The indexed column has many unique values
  • ✅ Frequent queries: The column is used often in WHERE clauses, JOINs, or ORDER BY
  • ✅ Large tables: Tables with thousands or millions of rows benefit most
  • ✅ Read-heavy workloads: Applications with more reads than writes

Indexes provide little benefit when:

  • ❌ Low selectivity: Columns with few unique values (like boolean flags) rarely benefit
  • ❌ Write-heavy workloads: Too many indexes can slow down INSERT/UPDATE operations
  • ❌ Small tables: Tables with few rows may not benefit from indexes


PostgreSQL Indexing Strategies

PostgreSQL offers a rich set of indexing options, each optimized for different use cases. Understanding when to use each type is crucial for optimal performance.

B-Tree Indexes: The Default Choice

B-Tree (Balanced Tree) indexes are PostgreSQL’s default index type and work well for most scenarios. They’re efficient for equality and range queries.

-- Create a B-Tree index (default, so you can omit USING btree)
CREATE INDEX idx_users_created_at ON users(created_at);
-- Efficient for range queries
SELECT * FROM users
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
ORDER BY created_at;
-- Efficient for equality queries
SELECT * FROM users WHERE email = 'user@example.com';

Composite Indexes: Multi-Column Optimization

Composite indexes include multiple columns and are powerful for queries that filter or sort by multiple columns.

-- Create a composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status, created_at);
-- This query can use the index efficiently
SELECT * FROM orders
WHERE user_id = 123 AND status = 'pending'
ORDER BY created_at DESC;
-- ✅ Efficient: Uses the index (leftmost prefix)
SELECT * FROM orders WHERE user_id = 123;
-- ✅ Efficient: Uses the index (leftmost prefix)
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- ❌ Less efficient: Can't use index effectively (status is not leftmost)
SELECT * FROM orders WHERE status = 'pending';

💡 Tip: The order of columns in a composite index matters. Place the most selective column first, and consider the order in which columns appear in your WHERE clauses.
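The leftmost-prefix rule is mechanical enough to sketch in a few lines of Python (a simplified model of planner behavior — real planners also handle range predicates, sorts, and index skips):

```python
# The leftmost-prefix rule: a composite index on (a, b, c) can serve a
# query only for the leading run of index columns the query filters on.

def usable_prefix(index_columns, query_columns):
    """Return the index columns usable for equality filters."""
    prefix = []
    for col in index_columns:
        if col not in query_columns:
            break  # gap in the prefix: remaining columns are unusable
        prefix.append(col)
    return prefix

idx = ["user_id", "status", "created_at"]
assert usable_prefix(idx, {"user_id"}) == ["user_id"]
assert usable_prefix(idx, {"user_id", "status"}) == ["user_id", "status"]
assert usable_prefix(idx, {"status"}) == []  # status alone: index unusable
```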

Partial Indexes: Indexing Subsets

Partial indexes only index rows that meet a specific condition, reducing index size and improving performance for filtered queries.

-- Index only active users (assumes most users are inactive)
CREATE INDEX idx_users_active_email ON users(email)
WHERE is_active = true;
-- This query uses the partial index
SELECT * FROM users WHERE email = 'user@example.com' AND is_active = true;
-- This query cannot use the partial index
SELECT * FROM users WHERE email = 'user@example.com' AND is_active = false;

Expression Indexes: Indexing Computed Values

Expression indexes allow you to index the result of a function or expression, useful for case-insensitive searches or computed columns.

-- Case-insensitive email search
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Efficient case-insensitive query
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Index on computed column
CREATE INDEX idx_orders_total ON orders((quantity * unit_price));
SELECT * FROM orders WHERE (quantity * unit_price) > 1000;

Hash Indexes: Fast Equality Lookups

Hash indexes are optimized for equality comparisons but don’t support range queries or sorting.

-- Create a hash index
CREATE INDEX idx_users_email_hash ON users USING HASH(email);
-- ✅ Efficient: Equality comparison
SELECT * FROM users WHERE email = 'user@example.com';
-- ❌ Cannot use hash index: Range query
SELECT * FROM users WHERE email > 'user@example.com';

⚠️ Note: Hash indexes don’t support range queries or ORDER BY operations. Use B-Tree indexes if you need these features.

GIN and GiST Indexes: Full-Text and Complex Data

GIN (Generalized Inverted Index) and GiST (Generalized Search Tree) indexes are specialized for full-text search, JSONB data, arrays, and geometric data.

-- GIN index for full-text search
CREATE INDEX idx_posts_content_gin ON posts USING GIN(to_tsvector('english', content));
-- Efficient full-text search
SELECT * FROM posts
WHERE to_tsvector('english', content) @@ to_tsquery('english', 'database & performance');
-- GIN index for JSONB
CREATE INDEX idx_users_metadata_gin ON users USING GIN(metadata);
-- Efficient JSONB queries
SELECT * FROM users WHERE metadata @> '{"role": "admin"}'::jsonb;

Covering Indexes: Include Non-Key Columns

Covering indexes (using the INCLUDE clause) store additional columns in the index leaf pages, allowing some queries to be answered entirely from the index without accessing the table.

-- Covering index that includes frequently accessed columns
CREATE INDEX idx_orders_user_covering ON orders(user_id, status)
INCLUDE (total_amount, created_at);
-- This query can be answered entirely from the index
SELECT user_id, status, total_amount, created_at
FROM orders
WHERE user_id = 123 AND status = 'completed';

MongoDB Indexing Strategies

MongoDB’s document-based model requires different indexing strategies, but the core principles remain the same. Understanding MongoDB’s index types and how they interact with queries is essential for optimal performance.

Single Field Indexes: Basic Performance Boost

Single field indexes are the simplest type and work similarly to PostgreSQL’s single-column indexes.

// Create a single field index
db.users.createIndex({ email: 1 });
// Efficient query using the index
db.users.find({ email: "user@example.com" });
// Index direction matters for sorting
db.users.createIndex({ created_at: -1 }); // Descending
db.users.find({}).sort({ created_at: -1 }); // Uses index efficiently

Compound Indexes: Multi-Field Queries

Compound indexes in MongoDB work similarly to PostgreSQL’s composite indexes, with the same leftmost prefix rule.

// Create a compound index
db.orders.createIndex({ user_id: 1, status: 1, created_at: -1 });
// ✅ Efficient: Uses leftmost prefix
db.orders.find({ user_id: 123 });
// ✅ Efficient: Uses leftmost prefix
db.orders.find({ user_id: 123, status: "pending" }).sort({ created_at: -1 });
// ❌ Less efficient: Can't use index effectively
db.orders.find({ status: "pending" });

Multikey Indexes: Indexing Arrays

MongoDB automatically creates multikey indexes when you index an array field, creating an index entry for each array element.

// Create an index on an array field
db.products.createIndex({ tags: 1 });
// Efficient query on array elements
db.products.find({ tags: "electronics" });
// Multikey index on nested arrays
db.users.createIndex({ "addresses.city": 1 });
db.users.find({ "addresses.city": "New York" });

Text Indexes: Full-Text Search

MongoDB’s text indexes provide full-text search capabilities across string content.

// Create a text index
db.articles.createIndex({ title: "text", content: "text" });
// Full-text search query
db.articles.find({ $text: { $search: "database performance" } });
// Text search with relevance scoring
db.articles
.find(
{ $text: { $search: "database performance" } },
{ score: { $meta: "textScore" } },
)
.sort({ score: { $meta: "textScore" } });

Geospatial Indexes: Location-Based Queries

MongoDB supports 2dsphere indexes for geospatial queries, essential for location-based applications.

// Create a 2dsphere index
db.places.createIndex({ location: "2dsphere" });
// Find places near a location
db.places.find({
location: {
$near: {
$geometry: {
type: "Point",
coordinates: [-73.97, 40.77], // Longitude, Latitude
},
$maxDistance: 1000, // meters
},
},
});

Partial Indexes: Conditional Indexing

MongoDB supports partial indexes with a filter expression, similar to PostgreSQL’s partial indexes.

// Index only active users
db.users.createIndex(
{ email: 1 },
{ partialFilterExpression: { is_active: true } },
);
// Efficient query on active users
db.users.find({ email: "user@example.com", is_active: true });
// Cannot use partial index
db.users.find({ email: "user@example.com", is_active: false });

TTL Indexes: Automatic Expiration

TTL (Time To Live) indexes automatically remove documents after a specified time period, useful for session data, logs, or temporary documents.

// Create a TTL index that expires documents after 24 hours
db.sessions.createIndex({ created_at: 1 }, { expireAfterSeconds: 86400 });
// Documents with created_at older than 24 hours are automatically deleted
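The expiry semantics amount to a periodic sweep that removes documents whose indexed date field is older than the TTL. A minimal Python sketch of that rule (note that MongoDB's background TTL thread runs roughly every 60 seconds, so deletion is not instantaneous):

```python
from datetime import datetime, timedelta, timezone

def expire(docs, ttl_seconds, now=None):
    """Keep only documents whose created_at is within the TTL window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(seconds=ttl_seconds)
    return [d for d in docs if d["created_at"] >= cutoff]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
docs = [
    {"_id": 1, "created_at": now - timedelta(hours=30)},  # older than 24h
    {"_id": 2, "created_at": now - timedelta(hours=2)},   # still fresh
]
surviving = expire(docs, 86400, now=now)  # 24-hour TTL keeps only _id 2
```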

Sparse Indexes: Indexing Only Existing Fields

Sparse indexes only include documents that have the indexed field, useful for optional fields.

// Create a sparse index
db.users.createIndex({ phone_number: 1 }, { sparse: true });
// Efficient query (only searches indexed documents)
db.users.find({ phone_number: "+1234567890" });

Index Types and When to Use Them

Choosing the right index type depends on your query patterns, data characteristics, and performance requirements. Here’s a comprehensive guide to help you decide.

PostgreSQL Index Type Selection

| Index Type | Best For | Limitations |
|---|---|---|
| B-Tree | Most common use cases: equality and range queries, sorting | Few; the default choice, suitable for ~90% of scenarios |
| Hash | Fast equality lookups only | No range queries, no ORDER BY; not crash-safe before PostgreSQL 10 |
| GIN | Full-text search, JSONB, arrays, multiple values | Larger than B-Tree, slower updates |
| GiST | Full-text search, geometric data, custom data types | Slower than GIN for full-text, but more flexible |
| BRIN | Very large tables with natural ordering | Less flexible than B-Tree; needs physically ordered data |
| SP-GiST | Specialized data structures (trees, tries) | Niche use cases |

MongoDB Index Type Selection

| Index Type | Best For | Limitations |
|---|---|---|
| Single Field | Simple queries on one field | Basic use case |
| Compound | Multi-field queries and sorting | Field order matters (leftmost prefix) |
| Multikey | Array fields, nested documents | At most one array field per compound index |
| Text | Full-text search across strings | One text index per collection |
| 2dsphere | Geospatial queries | Requires GeoJSON format |
| TTL | Automatic document expiration | Only works on date fields |
| Sparse | Optional fields | Skips documents without the field |
| Partial | Conditional indexing | Requires a filter expression |

Decision Matrix

Use this decision matrix to choose the right index:

-- PostgreSQL: Decision flow
-- 1. Is it a simple equality/range query? → B-Tree
CREATE INDEX idx_users_email ON users(email);
-- 2. Is it full-text search? → GIN
CREATE INDEX idx_posts_content_gin ON posts USING GIN(to_tsvector('english', content));
-- 3. Is it JSONB queries? → GIN
CREATE INDEX idx_users_metadata_gin ON users USING GIN(metadata);
-- 4. Is it a very large table with sorted data? → BRIN
CREATE INDEX idx_logs_timestamp_brin ON logs USING BRIN(timestamp);
// MongoDB: Decision flow
// 1. Simple field query? → Single Field Index
db.users.createIndex({ email: 1 });
// 2. Multiple fields? → Compound Index
db.orders.createIndex({ user_id: 1, status: 1 });
// 3. Array field? → Multikey Index (automatic)
db.products.createIndex({ tags: 1 });
// 4. Full-text search? → Text Index
db.articles.createIndex({ title: "text", content: "text" });
// 5. Geospatial? → 2dsphere Index
db.places.createIndex({ location: "2dsphere" });

Performance Optimization Techniques

Creating indexes is just the beginning. Optimizing them for your specific workload requires understanding query patterns, analyzing execution plans, and fine-tuning based on real-world performance data.

Analyzing Query Performance

Both PostgreSQL and MongoDB provide tools to analyze query performance and understand how indexes are being used.

PostgreSQL: EXPLAIN and EXPLAIN ANALYZE

-- EXPLAIN shows the query plan without executing
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';
-- EXPLAIN ANALYZE executes the query and shows actual performance
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';
-- Understanding the output:
-- Seq Scan = Full table scan (bad for large tables)
-- Index Scan = Using an index (good)
-- Index Only Scan = Answer from index alone (best)
-- Bitmap Index Scan = Multiple index scans combined
-- Enable detailed query statistics
SET enable_seqscan = off; -- Force index usage (for testing only)
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * FROM orders
WHERE user_id = 123 AND status = 'pending';

MongoDB: explain() and Execution Stats

// Get query execution plan
db.orders.find({ user_id: 123, status: "pending" }).explain("executionStats");
// Key metrics to watch:
// - executionStats.executionTimeMillis: Total execution time
// - executionStats.totalDocsExamined: Documents scanned
// - executionStats.totalKeysExamined: Index keys examined
// - executionStats.executionStages.stage: "IXSCAN" = using index (good)
// "COLLSCAN" = collection scan (bad)
// Analyze index usage in an aggregation pipeline
db.orders.explain("executionStats").aggregate([
{ $match: { user_id: 123, status: "pending" } },
]);

Index Selectivity and Cardinality

Selectivity measures how unique the values in an indexed column are. Higher selectivity means better index performance.

-- PostgreSQL: Check index selectivity
SELECT
schemaname,
tablename,
indexname,
idx_scan as index_scans,
idx_tup_read as tuples_read,
idx_tup_fetch as tuples_fetched
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan DESC;
-- Calculate selectivity
SELECT
COUNT(DISTINCT status)::float / COUNT(*) as selectivity,
COUNT(*) as total_rows
FROM orders;
-- Selectivity close to 1.0 = high selectivity (good)
-- Selectivity close to 0.0 = low selectivity (bad)
// MongoDB: Check index usage statistics
db.orders.getIndexes();
// Analyze index effectiveness
db.orders.aggregate([
{ $group: { _id: "$status", count: { $sum: 1 } } },
{ $sort: { count: -1 } },
]);
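The selectivity ratio itself is trivial to compute; this small Python helper (illustrative only) expresses the same distinct-over-total calculation as the SQL query:

```python
# Selectivity = distinct values / total rows. Values near 1.0 mean an
# index narrows the search well; values near 0.0 mean it barely helps.

def selectivity(values):
    """Fraction of distinct values in a column sample."""
    return len(set(values)) / len(values) if values else 0.0

emails = [f"user{i}@example.com" for i in range(1000)]  # all unique
statuses = ["pending", "shipped"] * 500                 # only two values

assert selectivity(emails) == 1.0      # excellent index candidate
assert selectivity(statuses) == 0.002  # poor index candidate
```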

Index Maintenance and Rebuilding

Over time, indexes can become fragmented or bloated, especially in write-heavy workloads. Regular maintenance is essential.

PostgreSQL: REINDEX and VACUUM

-- Rebuild a specific index
REINDEX INDEX CONCURRENTLY idx_users_email;
-- Rebuild all indexes on a table
REINDEX TABLE CONCURRENTLY users;
-- Analyze table statistics (helps query planner)
ANALYZE users;
-- Vacuum to reclaim space and update statistics
VACUUM ANALYZE users;
-- Check index bloat
SELECT
schemaname,
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
idx_scan as index_scans
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;

MongoDB: Rebuild Indexes

// Rebuild all indexes on a collection
// (reIndex() is deprecated in recent MongoDB versions and unavailable
// on sharded clusters; prefer dropping and recreating indexes)
db.orders.reIndex();
// Drop and recreate an index
db.orders.dropIndex("user_id_1_status_1");
db.orders.createIndex({ user_id: 1, status: 1 });
// Check index sizes
db.orders.stats().indexSizes;

Query Optimization Strategies

Optimizing queries goes beyond just creating indexes. Understanding how to write queries that leverage indexes effectively is crucial.

PostgreSQL: Query Optimization

-- ✅ Good: Uses index on email
SELECT * FROM users WHERE email = 'user@example.com';
-- ❌ Bad: Function on indexed column prevents index usage
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Solution: Create expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- ✅ Good: Uses composite index efficiently
SELECT * FROM orders
WHERE user_id = 123 AND status = 'pending'
ORDER BY created_at DESC;
-- ❌ Bad: OR conditions can prevent index usage
SELECT * FROM orders
WHERE user_id = 123 OR status = 'pending';
-- Solution: Use UNION or separate queries
-- ✅ Good: LIMIT with ORDER BY uses index
SELECT * FROM users ORDER BY created_at DESC LIMIT 10;
-- ❌ Bad: Large OFFSET still scans and discards all skipped rows
SELECT * FROM users ORDER BY created_at DESC LIMIT 10 OFFSET 10000;
-- Solution: Use cursor-based pagination
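Cursor-based (keyset) pagination can be sketched independently of any database: instead of skipping N rows, the client remembers the last sort key it saw and asks for rows strictly past it, which an index can serve as a cheap range scan. A toy Python model:

```python
# Keyset pagination over rows sorted by created_at descending.
# In SQL this maps to: WHERE created_at < :cursor ORDER BY created_at DESC LIMIT :n

rows = [{"id": i, "created_at": 1000 - i} for i in range(1, 101)]
rows.sort(key=lambda r: r["created_at"], reverse=True)

def page_after(rows, cursor, limit):
    """Return up to `limit` rows with created_at strictly below the cursor."""
    matching = [r for r in rows if cursor is None or r["created_at"] < cursor]
    return matching[:limit]

page1 = page_after(rows, None, 10)                       # first page
page2 = page_after(rows, page1[-1]["created_at"], 10)    # resume from cursor
```

Because each page is a bounded range scan on the sort key, page 1000 costs the same as page 1, unlike OFFSET, which must walk past every skipped row.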

MongoDB: Query Optimization

// ✅ Good: Uses index efficiently
db.orders
.find({ user_id: 123, status: "pending" })
.sort({ created_at: -1 })
.limit(10);
// ❌ Bad: $or can prevent index usage
db.orders.find({
$or: [{ user_id: 123 }, { status: "pending" }],
});
// Solution: Use $in or separate queries
// ✅ Good: Compound index supports query and sort
db.orders.find({ user_id: 123 }).sort({ created_at: -1 });
// ❌ Bad: Sort on non-indexed field
db.orders.find({ user_id: 123 }).sort({ total_amount: -1 });
// Solution: Add total_amount to compound index
// ✅ Good: Covered query (all fields in index, _id excluded)
db.orders.find(
{ user_id: 123 },
{ user_id: 1, status: 1, created_at: 1, _id: 0 },
);
// ❌ Bad: Query must fetch the documents
db.orders.find(
{ user_id: 123 },
{ user_id: 1, status: 1, total_amount: 1, _id: 0 }, // total_amount not in index
);

Index-Only Scans and Covered Queries

The fastest queries are those that can be answered entirely from the index without accessing the table data.

PostgreSQL: Index-Only Scans

-- Create covering index
CREATE INDEX idx_orders_covering ON orders(user_id, status)
INCLUDE (total_amount, created_at);
-- Query that uses index-only scan
EXPLAIN ANALYZE
SELECT user_id, status, total_amount, created_at
FROM orders
WHERE user_id = 123 AND status = 'pending';
-- Result: Index Only Scan using idx_orders_covering
-- Heap Fetches: 0 (no table access needed!)

MongoDB: Covered Queries

// Create compound index
db.orders.createIndex({ user_id: 1, status: 1, created_at: -1 });
// Covered query (all fields in index)
db.orders.find(
{ user_id: 123, status: "pending" },
{ user_id: 1, status: 1, created_at: 1, _id: 0 },
);
// Check if query is covered
db.orders
.find(
{ user_id: 123, status: "pending" },
{ user_id: 1, status: 1, created_at: 1, _id: 0 },
)
.explain("executionStats");
// executionStats.executionStages should show an IXSCAN with no FETCH stage
// executionStats.totalDocsExamined should be 0

Common Indexing Mistakes

Even experienced developers make indexing mistakes. Understanding common pitfalls helps you avoid them and optimize your database performance from the start.

Over-Indexing: Too Much of a Good Thing

Creating too many indexes can slow down write operations and consume excessive storage.

-- ❌ Bad: Indexing every column
CREATE INDEX idx_users_id ON users(id);
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_name ON users(name);
CREATE INDEX idx_users_created_at ON users(created_at);
CREATE INDEX idx_users_updated_at ON users(updated_at);
CREATE INDEX idx_users_status ON users(status);
CREATE INDEX idx_users_role ON users(role);
-- ... and so on
-- ✅ Good: Index only frequently queried columns
CREATE INDEX idx_users_email ON users(email); -- Used in WHERE clauses
CREATE INDEX idx_users_created_at ON users(created_at); -- Used for sorting
-- Skip indexes on rarely queried columns

💡 Tip: Monitor index usage and remove unused indexes. A good rule of thumb is to have no more than 5-7 indexes per table, unless you have a specific read-heavy workload.

Under-Indexing: Missing Critical Indexes

Failing to index frequently queried columns results in full table scans and poor performance.

-- ❌ Bad: No index on frequently queried column
SELECT * FROM orders WHERE user_id = 123; -- Full table scan!
-- ✅ Good: Create index on frequently queried column
CREATE INDEX idx_orders_user_id ON orders(user_id);
SELECT * FROM orders WHERE user_id = 123; -- Index scan!

Wrong Column Order in Composite Indexes

The order of columns in composite indexes matters due to the leftmost prefix rule.

-- ❌ Bad: Wrong column order
CREATE INDEX idx_orders_status_user ON orders(status, user_id);
-- Query: WHERE user_id = 123 AND status = 'pending'
-- Cannot use index efficiently (user_id is not leftmost)
-- ✅ Good: Correct column order
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Query: WHERE user_id = 123 AND status = 'pending'
-- Uses index efficiently

Indexing Low-Selectivity Columns

Indexing columns with few unique values (like boolean flags) rarely improves performance.

-- ❌ Bad: Indexing low-selectivity column
CREATE INDEX idx_users_is_active ON users(is_active);
-- Only 2 unique values (true/false), index provides little benefit
-- ✅ Good: Use partial index instead
CREATE INDEX idx_users_active_email ON users(email)
WHERE is_active = true;
-- Only indexes active users, much more selective

Ignoring Query Patterns

Creating indexes without understanding actual query patterns leads to ineffective indexes.

-- ❌ Bad: Index doesn't match query pattern
CREATE INDEX idx_orders_user ON orders(user_id);
-- But queries always filter by user_id AND status
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
-- Index helps but not optimal
-- ✅ Good: Index matches query pattern
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Matches the actual query pattern

Not Considering Sort Operations

Indexes can support both filtering and sorting, but the column order must match the query.

-- ❌ Bad: Index doesn't support sort
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Query: WHERE user_id = 123 ORDER BY created_at DESC
-- Cannot use index for sorting
-- ✅ Good: Include sort column in index
CREATE INDEX idx_orders_user_status_created ON orders(user_id, status, created_at DESC);
-- Supports both filtering and sorting

Function-Based Queries Without Expression Indexes

Using functions on indexed columns prevents index usage unless you create expression indexes.

-- ❌ Bad: Function prevents index usage
CREATE INDEX idx_users_email ON users(email);
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Cannot use index (function on indexed column)
-- ✅ Good: Create expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Uses expression index

MongoDB-Specific Mistakes

// ❌ Bad: Index direction doesn't match sort
db.orders.createIndex({ created_at: 1 }); // Ascending
db.orders.find({}).sort({ created_at: -1 }); // Descending sort
// Index can't be used efficiently for sorting
// ✅ Good: Match index direction to sort
db.orders.createIndex({ created_at: -1 }); // Descending
db.orders.find({}).sort({ created_at: -1 }); // Uses index efficiently
// ❌ Bad: Two array fields in one compound index
db.products.createIndex({ tags: 1, categories: 1 });
// Fails if both fields hold arrays: a compound multikey index
// may include at most one array field
// ✅ Good: Compound index with one array field
db.products.createIndex({ tags: 1, price: 1 });

Monitoring and Maintenance

Regular monitoring and maintenance ensure your indexes continue to perform well as your data grows and query patterns evolve.

PostgreSQL Monitoring

-- Check index usage statistics
SELECT
schemaname,
tablename,
indexname,
idx_scan as index_scans,
idx_tup_read as tuples_read,
idx_tup_fetch as tuples_fetched,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY idx_scan ASC; -- Find unused indexes (low scan count)
-- Find unused indexes
SELECT
schemaname,
tablename,
indexname,
idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND schemaname = 'public'
AND indexname NOT LIKE 'pg_toast%';
-- Check index bloat
SELECT
schemaname || '.' || tablename AS table_name,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
idx_scan AS index_scans,
CASE
WHEN idx_scan = 0 THEN 'UNUSED'
WHEN idx_scan < 100 THEN 'LOW USAGE'
ELSE 'ACTIVE'
END AS usage_status
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Monitor slow queries
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
max_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 1000 -- Queries taking more than 1 second
ORDER BY mean_exec_time DESC
LIMIT 10;

MongoDB Monitoring

// Check index usage
db.orders.aggregate([{ $indexStats: {} }]);
// Find unused indexes
db.orders.aggregate([
{ $indexStats: {} },
{ $match: { "accesses.ops": { $lt: 100 } } }, // Less than 100 operations
]);
// Monitor slow queries
db.setProfilingLevel(1, { slowms: 100 }); // Log queries slower than 100ms
db.system.profile.find().sort({ ts: -1 }).limit(10);
// Check index sizes
db.orders.stats();
// Analyze collection statistics
db.orders.stats({ indexDetails: true });

Automated Maintenance Scripts

Create maintenance scripts to keep your indexes optimized:

-- PostgreSQL: Automated index maintenance script
DO $$
DECLARE
r RECORD;
BEGIN
FOR r IN
SELECT indexrelid::regclass AS index_name
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
AND idx_scan = 0
AND indexrelid::regclass::text NOT LIKE 'pg_toast%'
LOOP
RAISE NOTICE 'Dropping unused index: %', r.index_name;
-- Note: DROP INDEX CONCURRENTLY cannot run inside a DO block
-- (it runs in a transaction); collect the reported names and
-- run the DROP INDEX CONCURRENTLY statements separately
END LOOP;
END $$;
-- Rebuild fragmented indexes
DO $$
DECLARE
r RECORD;
BEGIN
FOR r IN
SELECT indexrelid::regclass AS index_name
FROM pg_stat_user_indexes
WHERE schemaname = 'public'
AND idx_scan > 1000 -- Only rebuild frequently used indexes
LOOP
RAISE NOTICE 'Rebuilding index: %', r.index_name;
-- Note: REINDEX INDEX CONCURRENTLY cannot run inside a DO block;
-- collect the reported names and run the statements separately
END LOOP;
END $$;
// MongoDB: Automated index maintenance script
function maintainIndexes() {
const collections = db.getCollectionNames();
collections.forEach((collectionName) => {
const collection = db.getCollection(collectionName);
const indexes = collection.getIndexes();
indexes.forEach((index) => {
const stats = collection
.aggregate([{ $indexStats: {} }, { $match: { name: index.name } }])
.toArray();
if (stats.length > 0 && stats[0].accesses.ops < 10) {
print(`Unused index found: ${collectionName}.${index.name}`);
// Uncomment to drop:
// collection.dropIndex(index.name);
}
});
});
}
maintainIndexes();

Performance Baselines and Alerts

Establish performance baselines and set up alerts for degradation:

-- PostgreSQL: Snapshot a performance baseline
-- (use a table, not a view: a view over pg_stat_user_indexes would
-- re-read live statistics and never differ from the current values)
CREATE TABLE index_performance_baseline AS
SELECT
schemaname,
tablename,
indexname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
NOW() as baseline_date
FROM pg_stat_user_indexes
WHERE schemaname = 'public';
-- Compare current performance to baseline
SELECT
current.indexname,
baseline.idx_scan as baseline_scans,
current.idx_scan as current_scans,
current.idx_scan - baseline.idx_scan as scan_difference
FROM pg_stat_user_indexes current
JOIN index_performance_baseline baseline
ON current.indexname = baseline.indexname
WHERE current.idx_scan < baseline.idx_scan * 0.5; -- 50% decrease in usage

Best Practices

Following these best practices will help you create and maintain efficient indexes that scale with your application.

Index Creation Best Practices

  • ✅ Start with query analysis: Analyze your most common queries before creating indexes
  • ✅ Index foreign keys: Foreign keys are frequently used in JOINs and WHERE clauses
  • ✅ Index columns in WHERE clauses: Prioritize columns used in WHERE conditions
  • ✅ Consider sort operations: Include ORDER BY columns in indexes when possible
  • ✅ Use composite indexes wisely: Create composite indexes for multi-column queries
  • ✅ Monitor index usage: Regularly review which indexes are actually being used
  • ✅ Test with production-like data: Index performance varies with data volume and distribution

Index Maintenance Best Practices

  • ✅ Regular VACUUM/ANALYZE: Keep PostgreSQL statistics up to date
  • ✅ Rebuild fragmented indexes: Periodically rebuild indexes in write-heavy workloads
  • ✅ Remove unused indexes: Drop indexes that aren’t being used
  • ✅ Monitor index bloat: Watch for indexes that consume excessive space
  • ✅ Update statistics: Keep query planner statistics current for optimal plans

Query Writing Best Practices

  • ✅ Write index-friendly queries: Avoid functions on indexed columns when possible
  • ✅ Use EXPLAIN/explain(): Always analyze query plans before deploying
  • ✅ Leverage covering indexes: Design queries to use index-only scans when possible
  • ✅ Avoid SELECT *: Only select columns you need (helps with covering indexes)
  • ✅ Use LIMIT: Limit result sets to reduce data transfer and improve performance

PostgreSQL-Specific Best Practices

-- ✅ Use CONCURRENTLY for production indexes
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
-- Prevents table locking during index creation
-- ✅ Use partial indexes for filtered data
CREATE INDEX idx_users_active_email ON users(email)
WHERE is_active = true;
-- ✅ Use expression indexes for function-based queries
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- ✅ Include columns in covering indexes
CREATE INDEX idx_orders_covering ON orders(user_id, status)
INCLUDE (total_amount, created_at);

MongoDB-Specific Best Practices

// ✅ Use background index creation for large collections
// (the background option is ignored since MongoDB 4.2, where all index
// builds use an optimized process that locks the collection only briefly)
db.users.createIndex({ email: 1 }, { background: true });
// ✅ Create compound indexes that support multiple query patterns
db.orders.createIndex({ user_id: 1, status: 1, created_at: -1 });
// Supports: user_id, user_id + status, user_id + status + sort
// ✅ Use sparse indexes for optional fields
db.users.createIndex({ phone_number: 1 }, { sparse: true });
// ✅ Consider index intersection (MongoDB can use multiple indexes)
// But prefer compound indexes when possible

Performance Testing Best Practices

  • ✅ Test with realistic data volumes: Index performance changes with data size
  • ✅ Test write performance: Indexes impact INSERT/UPDATE/DELETE operations
  • ✅ Monitor query execution times: Track performance metrics over time
  • ✅ Load test: Test indexes under production-like load conditions
  • ✅ A/B test index changes: Compare performance before and after index changes

Documentation Best Practices

  • ✅ Document index purpose: Note why each index exists and what queries it supports
  • ✅ Document maintenance schedule: Record when indexes were last rebuilt or analyzed
  • ✅ Track index changes: Version control index creation scripts
  • ✅ Monitor index metrics: Keep records of index usage and performance over time


Conclusion

Database indexing is both an art and a science. While the fundamental concepts are straightforward, mastering indexing strategies requires understanding your specific data patterns, query workloads, and performance requirements. Whether you’re working with PostgreSQL’s relational model or MongoDB’s document-based approach, the principles remain the same: create indexes that match your query patterns, monitor their effectiveness, and maintain them regularly.

The key takeaways from this guide:

  • Start with query analysis: Understand your query patterns before creating indexes
  • Choose the right index type: B-Tree for most cases, specialized indexes for specific needs
  • Optimize composite indexes: Column order matters due to the leftmost prefix rule
  • Monitor and maintain: Regularly review index usage and rebuild when necessary
  • Avoid common mistakes: Don’t over-index, don’t ignore query patterns, and don’t forget about sort operations

Remember that indexing is an iterative process. As your application evolves and query patterns change, your indexing strategy should evolve too. Regular monitoring and maintenance ensure your indexes continue to provide optimal performance as your data grows.

For more in-depth information, refer to the PostgreSQL documentation on indexes and the MongoDB indexing documentation. If you’re still deciding between databases, check out our comparison guides on PostgreSQL vs MySQL vs SQLite and MongoDB vs PostgreSQL.

By applying these indexing strategies and best practices, you’ll be well-equipped to build fast, scalable database-backed applications that perform well under load and grow with your user base.