Scaling PostgreSQL: From 1,000 to 1,000,000 Queries Per Second
A practical guide to scaling PostgreSQL — connection pooling, read replicas, partitioning, indexing strategies, and when to add caching. Real numbers, real solutions.
RaidFrame Team
October 11, 2025 · 5 min read
PostgreSQL can handle more than you think. Most "scaling problems" are actually configuration problems, missing indexes, or bad query patterns.
This guide walks through the real progression of scaling PostgreSQL from startup scale to serious production load.
Stage 1: 0-1,000 QPS (just configure it properly)
Most PostgreSQL instances are running with default configuration. The defaults are conservative because they target a machine with 128MB of RAM from 2005.
Essential configuration for a 4GB RAM instance:
shared_buffers = 1GB # 25% of RAM
effective_cache_size = 3GB # 75% of RAM
work_mem = 16MB # per-sort/hash operation
maintenance_work_mem = 256MB # for VACUUM, CREATE INDEX
max_connections = 100 # keep this low
wal_buffers = 16MB
checkpoint_completion_target = 0.9
random_page_cost = 1.1 # for SSD storage
effective_io_concurrency = 200 # for SSD storage

This alone can 5-10x your throughput.
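The 25%/75% ratios above scale to other instance sizes. As a rough sketch (pgSizing is a hypothetical helper, not an official formula), a starting point can be derived from total RAM:

```javascript
// Rule-of-thumb sizing from total RAM, mirroring the percentages above.
// Hypothetical helper: always validate against your actual workload.
function pgSizing(ramGB) {
  return {
    shared_buffers: `${Math.round(ramGB * 0.25 * 1024)}MB`,       // ~25% of RAM
    effective_cache_size: `${Math.round(ramGB * 0.75 * 1024)}MB`, // ~75% of RAM
    maintenance_work_mem: `${Math.min(Math.round(ramGB * 64), 2048)}MB`, // capped
  };
}

console.log(pgSizing(4));
// shared_buffers: "1024MB", effective_cache_size: "3072MB", maintenance_work_mem: "256MB"
```

Treat the output as a starting point and tune against real metrics, not a replacement for load testing.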
Stage 2: 1,000-10,000 QPS (indexing and query optimization)
At this scale, the database is doing too much work per query. The fix is almost always better indexes and query optimization.
Find slow queries
Enable pg_stat_statements to track query performance:
CREATE EXTENSION pg_stat_statements;
-- Top 10 slowest queries by total time
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Index strategy
B-tree indexes (default) — great for equality and range queries:
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created ON orders(created_at);

Partial indexes — smaller, faster, index only what you query:
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

Composite indexes — match your WHERE clause column order:
-- For: WHERE tenant_id = ? AND created_at > ?
CREATE INDEX idx_orders_tenant_date ON orders(tenant_id, created_at);

Covering indexes (INCLUDE) — avoid table lookups:
CREATE INDEX idx_orders_lookup ON orders(id) INCLUDE (status, total);

Kill N+1 queries
The most common performance killer in ORM-heavy applications. Loading 100 orders and then querying the customer for each one = 101 queries.
-- Bad: N+1
SELECT * FROM orders WHERE status = 'pending';
-- Then for each order: SELECT * FROM customers WHERE id = ?;
-- Good: JOIN
SELECT o.*, c.name, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending';
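The same fix applies at the application layer: collect the customer IDs and fetch them all in one round trip with = ANY($1). A sketch with a stubbed db object that just counts round trips (the stub and helper names are hypothetical; a real app would use node-postgres or an ORM):

```javascript
// Stub that counts round trips instead of hitting a real database.
const db = {
  count: 0,
  async query(sql, params) {
    this.count += 1;
    return { rows: [] };
  },
};

// Bad: one query per order, on top of the query that loaded the orders
async function naive(orders) {
  for (const o of orders) {
    await db.query("SELECT * FROM customers WHERE id = $1", [o.customer_id]);
  }
}

// Good: a single batched query for every customer at once
async function batched(orders) {
  const ids = [...new Set(orders.map((o) => o.customer_id))];
  await db.query("SELECT * FROM customers WHERE id = ANY($1)", [ids]);
}

async function demo() {
  const orders = [{ customer_id: 1 }, { customer_id: 2 }, { customer_id: 3 }];
  db.count = 0;
  await naive(orders);
  const naiveCount = db.count;   // one round trip per order
  db.count = 0;
  await batched(orders);
  const batchedCount = db.count; // a single round trip
  return { naiveCount, batchedCount };
}
// demo() resolves to { naiveCount: 3, batchedCount: 1 }
```

Most ORMs expose this as eager loading (e.g. include/prefetch options); the point is that round trips, not row counts, dominate at this stage.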
Stage 3: 10,000-100,000 QPS (connection pooling and read replicas)
Connection pooling
PostgreSQL creates a new process for each connection. 500 connections = 500 processes = significant overhead.
PgBouncer sits between your app and PostgreSQL, multiplexing hundreds of application connections onto a handful of database connections.
[App: 500 connections] → [PgBouncer: 50 connections] → [PostgreSQL]

Configuration:
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50

Transaction pooling releases the connection back to the pool after each transaction. This is what you want for most applications. The trade-off: session-level features such as prepared statements, SET, LISTEN/NOTIFY, and advisory locks don't survive across transactions in this mode.
On RaidFrame, PgBouncer is built into managed PostgreSQL. No separate service to manage.
Read replicas
If your workload is 80%+ reads (most web apps are), offload reads to replicas:
Writes → Primary
Reads → Replica 1, Replica 2, Replica 3

Replication lag is typically <100ms for streaming replication. For most applications, this is invisible.
// In your app
const { Pool } = require("pg"); // node-postgres
const writePool = new Pool({ connectionString: PRIMARY_URL });
const readPool = new Pool({ connectionString: REPLICA_URL });
// Use write pool for mutations
await writePool.query("INSERT INTO orders ...");
// Use read pool for queries
const result = await readPool.query("SELECT * FROM orders ...");

Stage 4: 100,000-1,000,000 QPS (partitioning and caching)
Table partitioning
When tables grow to hundreds of millions of rows, even indexed queries slow down. Partitioning splits a table into smaller physical chunks.
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY,
tenant_id int NOT NULL,
created_at timestamptz NOT NULL,
data jsonb
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_01 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

Queries with a WHERE created_at clause only scan relevant partitions. Dropping old data is instant — just detach the partition.
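In practice, monthly partitions are created ahead of time by a scheduled job, and expired ones are detached by the same job. A sketch (monthlyPartitionDDL is a hypothetical helper) that generates the CREATE and DETACH statements for a given month:

```javascript
// Generate DDL for one monthly partition of a range-partitioned table.
// Hypothetical helper; feed the output to your migration tool or cron job.
function monthlyPartitionDDL(table, year, month) {
  const pad = (m) => String(m).padStart(2, "0");
  const from = `${year}-${pad(month)}-01`;
  // Roll over into January of the next year after December
  const [ny, nm] = month === 12 ? [year + 1, 1] : [year, month + 1];
  const to = `${ny}-${pad(nm)}-01`;
  const name = `${table}_${year}_${pad(month)}`;
  return {
    create: `CREATE TABLE ${name} PARTITION OF ${table} FOR VALUES FROM ('${from}') TO ('${to}');`,
    detach: `ALTER TABLE ${table} DETACH PARTITION ${name};`,
  };
}

console.log(monthlyPartitionDDL("events", 2026, 1).create);
// CREATE TABLE events_2026_01 PARTITION OF events FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
```

After DETACH PARTITION, the partition is an ordinary table you can archive or DROP without touching the parent.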
Caching layer
At this scale, add Redis or Memcached in front of PostgreSQL for hot data:
const Redis = require("ioredis"); // assumed client, matching the set(..., "EX", ttl) call style
const redis = new Redis();

async function getUser(id) {
  // Check cache first
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);
  // Cache miss — query database
  const { rows } = await db.query("SELECT * FROM users WHERE id = $1", [id]);
  const user = rows[0];
  // Cache for 5 minutes
  await redis.set(`user:${id}`, JSON.stringify(user), "EX", 300);
  return user;
}

Cache invalidation strategy matters:
- TTL-based — simple, good enough for most cases
- Write-through — update cache on every write, always fresh
- Event-driven — invalidate on database triggers or application events
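Write-through, the second strategy, can be sketched with in-memory Maps standing in for both systems (stand-ins only; a real version writes to Redis and PostgreSQL):

```javascript
// Write-through cache sketch: every write updates the store and the
// cache together, so reads never observe stale data.
const cache = new Map(); // stand-in for Redis
const store = new Map(); // stand-in for PostgreSQL

async function updateUser(id, user) {
  store.set(id, user);           // write to the database first
  cache.set(`user:${id}`, user); // then refresh the cache in the same path
}

async function getUser(id) {
  const key = `user:${id}`;
  if (cache.has(key)) return cache.get(key); // always fresh under write-through
  const user = store.get(id);                // cold start only
  cache.set(key, user);
  return user;
}
```

The cost is an extra cache write on every mutation; the benefit is that TTLs become a safety net rather than the freshness mechanism.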
When PostgreSQL isn't enough
If you've done all of the above and still need more, consider:
- Citus — distributed PostgreSQL for multi-node horizontal scaling
- Read replicas with sharding — shard by tenant/customer across multiple primaries
- Specialized databases — time-series data in TimescaleDB, search in Elasticsearch, graph in Neo4j
But honestly, a properly configured PostgreSQL with connection pooling, read replicas, and caching handles more load than 99% of applications will ever see.
Managed PostgreSQL on RaidFrame
RaidFrame managed PostgreSQL includes:
- Optimized configuration based on instance size
- Built-in PgBouncer connection pooling
- One-click read replicas
- Automatic daily backups with point-in-time recovery
- Monitoring dashboard with slow query analysis
- Auto-scaling storage (never run out of disk)
Ship faster with RaidFrame
Auto-scaling compute, managed databases, global CDN, and zero-config CI/CD. Free tier included.