Scaling PostgreSQL: From 1,000 to 1,000,000 Queries Per Second
A practical guide to scaling PostgreSQL — connection pooling, read replicas, partitioning, indexing strategies, and when to add caching. Real numbers, real solutions.
RaidFrame Team
October 11, 2025 · 5 min read
PostgreSQL can handle more than you think. Most "scaling problems" are actually configuration problems, missing indexes, or bad query patterns.
This guide walks through the real progression of scaling PostgreSQL from startup scale to serious production load.
Stage 1: 0-1,000 QPS (just configure it properly)
Most PostgreSQL instances are running with default configuration. The defaults are conservative because they target a machine with 128MB of RAM from 2005.
Essential configuration for a 4GB RAM instance:
shared_buffers = 1GB # 25% of RAM
effective_cache_size = 3GB # 75% of RAM
work_mem = 16MB # per-sort/hash operation
maintenance_work_mem = 256MB # for VACUUM, CREATE INDEX
max_connections = 100 # keep this low
wal_buffers = 16MB
checkpoint_completion_target = 0.9
random_page_cost = 1.1 # for SSD storage
effective_io_concurrency = 200 # for SSD storage

This alone can 5-10x your throughput.
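The 25%/75% ratios above scale to other instance sizes. As a rough sketch (pgSizing is a hypothetical helper, not an official formula), a starting point can be derived from total RAM:

```javascript
// Rule-of-thumb sizing from total RAM, mirroring the percentages above.
// Hypothetical helper: always validate against your actual workload.
function pgSizing(ramGB) {
  return {
    shared_buffers: `${Math.round(ramGB * 0.25 * 1024)}MB`,       // ~25% of RAM
    effective_cache_size: `${Math.round(ramGB * 0.75 * 1024)}MB`, // ~75% of RAM
    maintenance_work_mem: `${Math.min(Math.round(ramGB * 64), 2048)}MB`, // capped
  };
}

console.log(pgSizing(4));
// shared_buffers: "1024MB", effective_cache_size: "3072MB", maintenance_work_mem: "256MB"
```

Treat the output as a starting point and tune against real metrics, not a replacement for load testing.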
Stage 2: 1,000-10,000 QPS (indexing and query optimization)
At this scale, the database is doing too much work per query. The fix is almost always better indexes and query optimization.
Find slow queries
Enable pg_stat_statements to track query performance:
CREATE EXTENSION pg_stat_statements;
-- Top 10 slowest queries by total time
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Index strategy
B-tree indexes (default) — great for equality and range queries:
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created ON orders(created_at);

Partial indexes — smaller, faster, index only what you query:
CREATE INDEX idx_active_users ON users(email) WHERE active = true;

Composite indexes — match your WHERE clause column order:
-- For: WHERE tenant_id = ? AND created_at > ?
CREATE INDEX idx_orders_tenant_date ON orders(tenant_id, created_at);

Covering indexes (INCLUDE) — avoid table lookups:
CREATE INDEX idx_orders_lookup ON orders(id) INCLUDE (status, total);

Kill N+1 queries
The most common performance killer in ORM-heavy applications. Loading 100 orders and then querying the customer for each one = 101 queries.
-- Bad: N+1
SELECT * FROM orders WHERE status = 'pending';
-- Then for each order: SELECT * FROM customers WHERE id = ?;
-- Good: JOIN
SELECT o.*, c.name, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending';
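The same fix applies at the application layer: collect the customer IDs and fetch them all in one round trip with = ANY($1). A sketch with a stubbed db object that just counts round trips (the stub and helper names are hypothetical; a real app would use node-postgres or an ORM):

```javascript
// Stub that counts round trips instead of hitting a real database.
const db = {
  count: 0,
  async query(sql, params) {
    this.count += 1;
    return { rows: [] };
  },
};

// Bad: one query per order, on top of the query that loaded the orders
async function naive(orders) {
  for (const o of orders) {
    await db.query("SELECT * FROM customers WHERE id = $1", [o.customer_id]);
  }
}

// Good: a single batched query for every customer at once
async function batched(orders) {
  const ids = [...new Set(orders.map((o) => o.customer_id))];
  await db.query("SELECT * FROM customers WHERE id = ANY($1)", [ids]);
}

async function demo() {
  const orders = [{ customer_id: 1 }, { customer_id: 2 }, { customer_id: 3 }];
  db.count = 0;
  await naive(orders);
  const naiveCount = db.count;   // one round trip per order
  db.count = 0;
  await batched(orders);
  const batchedCount = db.count; // a single round trip
  return { naiveCount, batchedCount };
}
// demo() resolves to { naiveCount: 3, batchedCount: 1 }
```

Most ORMs expose this as eager loading (e.g. include/prefetch options); the point is that round trips, not row counts, dominate at this stage.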
Stage 3: 10,000-100,000 QPS (connection pooling and read replicas)
Connection pooling
PostgreSQL creates a new process for each connection. 500 connections = 500 processes = significant overhead.
PgBouncer sits between your app and PostgreSQL, multiplexing hundreds of application connections onto a handful of database connections.
[App: 500 connections] → [PgBouncer: 50 connections] → [PostgreSQL]

Configuration:
[databases]
mydb = host=localhost port=5432 dbname=mydb
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 50

Transaction pooling releases the connection back to the pool after each transaction. This is what you want for most applications. The trade-off: session-level features such as prepared statements, SET, LISTEN/NOTIFY, and advisory locks don't survive across transactions in this mode.
On RaidFrame, PgBouncer is built into managed PostgreSQL. No separate service to manage.
Read replicas
If your workload is 80%+ reads (most web apps are), offload reads to replicas:
Writes → Primary
Reads → Replica 1, Replica 2, Replica 3

Replication lag is typically <100ms for streaming replication. For most applications, this is invisible.
// In your app
const { Pool } = require("pg"); // node-postgres
const writePool = new Pool({ connectionString: PRIMARY_URL });
const readPool = new Pool({ connectionString: REPLICA_URL });
// Use write pool for mutations
await writePool.query("INSERT INTO orders ...");
// Use read pool for queries
const result = await readPool.query("SELECT * FROM orders ...");

Stage 4: 100,000-1,000,000 QPS (partitioning and caching)
Table partitioning
When tables grow to hundreds of millions of rows, even indexed queries slow down. Partitioning splits a table into smaller physical chunks.
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY,
tenant_id int NOT NULL,
created_at timestamptz NOT NULL,
data jsonb
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_01 PARTITION OF events
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

Queries with a WHERE created_at clause only scan relevant partitions. Dropping old data is instant — just detach the partition.
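In practice, monthly partitions are created ahead of time by a scheduled job, and expired ones are detached by the same job. A sketch (monthlyPartitionDDL is a hypothetical helper) that generates the CREATE and DETACH statements for a given month:

```javascript
// Generate DDL for one monthly partition of a range-partitioned table.
// Hypothetical helper; feed the output to your migration tool or cron job.
function monthlyPartitionDDL(table, year, month) {
  const pad = (m) => String(m).padStart(2, "0");
  const from = `${year}-${pad(month)}-01`;
  // Roll over into January of the next year after December
  const [ny, nm] = month === 12 ? [year + 1, 1] : [year, month + 1];
  const to = `${ny}-${pad(nm)}-01`;
  const name = `${table}_${year}_${pad(month)}`;
  return {
    create: `CREATE TABLE ${name} PARTITION OF ${table} FOR VALUES FROM ('${from}') TO ('${to}');`,
    detach: `ALTER TABLE ${table} DETACH PARTITION ${name};`,
  };
}

console.log(monthlyPartitionDDL("events", 2026, 1).create);
// CREATE TABLE events_2026_01 PARTITION OF events FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
```

After DETACH PARTITION, the partition is an ordinary table you can archive or DROP without touching the parent.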
Caching layer
At this scale, add Redis or Memcached in front of PostgreSQL for hot data:
const Redis = require("ioredis"); // assumed client, matching the set(..., "EX", ttl) call style
const redis = new Redis();

async function getUser(id) {
  // Check cache first
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);
  // Cache miss — query database
  const { rows } = await db.query("SELECT * FROM users WHERE id = $1", [id]);
  const user = rows[0];
  // Cache for 5 minutes
  await redis.set(`user:${id}`, JSON.stringify(user), "EX", 300);
  return user;
}

Cache invalidation strategy matters:
- TTL-based — simple, good enough for most cases
- Write-through — update cache on every write, always fresh
- Event-driven — invalidate on database triggers or application events
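Write-through, the second strategy, can be sketched with in-memory Maps standing in for both systems (stand-ins only; a real version writes to Redis and PostgreSQL):

```javascript
// Write-through cache sketch: every write updates the store and the
// cache together, so reads never observe stale data.
const cache = new Map(); // stand-in for Redis
const store = new Map(); // stand-in for PostgreSQL

async function updateUser(id, user) {
  store.set(id, user);           // write to the database first
  cache.set(`user:${id}`, user); // then refresh the cache in the same path
}

async function getUser(id) {
  const key = `user:${id}`;
  if (cache.has(key)) return cache.get(key); // always fresh under write-through
  const user = store.get(id);                // cold start only
  cache.set(key, user);
  return user;
}
```

The cost is an extra cache write on every mutation; the benefit is that TTLs become a safety net rather than the freshness mechanism.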
When PostgreSQL isn't enough
If you've done all of the above and still need more, consider:
- Citus — distributed PostgreSQL for multi-node horizontal scaling
- Read replicas with sharding — shard by tenant/customer across multiple primaries
- Specialized databases — time-series data in TimescaleDB, search in Elasticsearch, graph in Neo4j
But honestly, a properly configured PostgreSQL with connection pooling, read replicas, and caching handles more load than 99% of applications will ever see.
Managed PostgreSQL on RaidFrame
RaidFrame managed PostgreSQL includes:
- Optimized configuration based on instance size
- Built-in PgBouncer connection pooling
- One-click read replicas
- Automatic daily backups with point-in-time recovery
- Monitoring dashboard with slow query analysis
- Auto-scaling storage (never run out of disk)
Ship faster with RaidFrame
Auto-scaling compute, managed databases, global CDN, and zero-config CI/CD. Free tier included.