How to Build and Scale a SaaS on Cloud Infrastructure
End-to-end guide to building a SaaS from first user to 10K customers. Stack decisions, infrastructure milestones, and scaling patterns that actually work.
RaidFrame Team
March 5, 2026 · 5 min read
TL;DR — Building a SaaS has three infrastructure phases: build (just ship it), grow (add caching, queues, and monitoring), and scale (multi-region, auto-scaling, compliance). Most teams over-engineer phase 1 and under-invest in phase 2. Here's what to do at each stage.
Phase 1: Build (0-100 users)
Goal: ship and get feedback
Your infrastructure should take less than an hour to set up. If you're spending more time on DevOps than product, you're doing it wrong.
The stack:
```yaml
# raidframe.yaml
services:
  web:
    type: web
    build:
      dockerfile: Dockerfile
    port: 3000
    scaling:
      min: 1
      max: 1
databases:
  main:
    engine: postgres
```

One service. One database. Deploy and move on.
What to skip:
- Redis (your database handles 100 users fine)
- Background workers (process everything synchronously)
- CDN (your app isn't bandwidth-constrained)
- Monitoring beyond uptime checks
- Multi-region (pick one region close to your target market)
- Microservices (absolutely not)
What NOT to skip:
- Automated backups (RaidFrame does this automatically)
- SSL (automatic on RaidFrame)
- Environment variables for secrets (never hardcode)
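Keeping secrets in environment variables is cheap to do right from day one. A minimal fail-fast loader in Node — the variable names here are illustrative, not part of any RaidFrame API, and you'd set the actual values through your platform's secret management rather than committing them:

```javascript
// Minimal fail-fast loader for secrets kept in environment variables.
// Variable names are examples; never commit the values themselves.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Crash at boot, not mid-request, if a secret is missing:
// const databaseUrl = requireEnv("DATABASE_URL");
// const stripeKey = requireEnv("STRIPE_SECRET_KEY");
```

Failing at startup beats discovering a missing secret on the first request that needs it.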
Monthly cost: $0-7
Try RaidFrame free
Deploy your first app in 60 seconds. No credit card required.
Phase 2: Grow (100-5,000 users)
Goal: performance and reliability
You have paying customers. Downtime costs money. Slow pages lose conversions.
Add caching:
```bash
rf add redis
```

```javascript
const cached = await redis.get(`user:${id}`);
if (cached) return JSON.parse(cached);
const user = await db.query("SELECT * FROM users WHERE id = $1", [id]);
await redis.set(`user:${id}`, JSON.stringify(user), "EX", 300);
return user;
```

Add background jobs:
```bash
rf cron add "0 9 * * *" "node scripts/daily-digest.js" --name digest
```

Move slow operations out of the request path:
- Email sending → queue
- Image processing → queue
- Report generation → cron
- Data sync → cron
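All four items share one pattern: the request handler records that work needs to happen, and a separate process does it later. A toy in-memory sketch of that handoff — a real deployment would use a durable managed queue, since an in-process array loses jobs on restart:

```javascript
// Toy illustration of moving work out of the request path.
// In production, replace the array with a durable queue.
const jobs = [];

// Called from the request handler: record the job and return immediately.
function enqueueEmail(to, subject) {
  jobs.push({ to, subject, enqueuedAt: Date.now() });
}

// Runs in worker.js: drain pending jobs outside any request.
function drainJobs(send) {
  let processed = 0;
  while (jobs.length > 0) {
    send(jobs.shift()); // e.g. call your email provider here
    processed++;
  }
  return processed;
}
```

The request stays fast because it only does the cheap part (recording the job); the slow part happens in the worker on its own schedule.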
Add monitoring:
```bash
rf alerts create --name "Errors" --metric error_rate --service web --threshold "> 2%" --window 5m --notify slack
rf alerts create --name "Slow" --metric response_time_p99 --service web --threshold "> 1000ms" --window 5m --notify slack
```

Scale the database:

```bash
rf db upgrade main --plan pro
```

Add a staging environment:

```bash
rf env create staging --from production
```

Test deployments on staging before pushing them to production. Preview environments for PRs give you per-feature testing.
Updated stack:
```yaml
services:
  web:
    type: web
    scaling:
      min: 2
      max: 5
  worker:
    type: worker
    command: node worker.js
databases:
  main:
    engine: postgres
    plan: pro
  cache:
    engine: redis
```

Monthly cost: $25-80
Phase 3: Scale (5,000-50,000+ users)
Goal: handle load, meet compliance, go global
Auto-scaling:
```yaml
services:
  web:
    scaling:
      min: 4
      max: 40
      target_cpu: 70
```

Multi-region:

```bash
rf regions add eu-west-1
rf db replicas add main --region eu-west-1
```

Users in Europe get sub-50ms responses instead of 200ms.
Database read replicas:
Route read queries to replicas to reduce load on the primary:
```javascript
const { Pool } = require("pg");

const readDb = new Pool({ connectionString: process.env.DATABASE_REPLICA_URL });
const writeDb = new Pool({ connectionString: process.env.DATABASE_URL });

// Reads go to the replica
app.get("/api/products", async (req, res) => {
  const products = await readDb.query("SELECT * FROM products");
  res.json(products.rows);
});

// Writes go to the primary
app.post("/api/orders", async (req, res) => {
  const order = await writeDb.query("INSERT INTO orders...");
  res.json(order.rows[0]);
});
```

Compliance:

```bash
rf compliance enable soc2
rf compliance enable hipaa  # if healthcare
```

Full-text search:

```bash
rf add search
```

Don't use ILIKE queries on Postgres at scale. Move to managed search for product search, user search, and content search.
Updated stack:
```yaml
services:
  web:
    type: web
    scaling:
      min: 4
      max: 40
  api:
    type: web
    scaling:
      min: 2
      max: 20
  worker:
    type: worker
    scaling:
      min: 2
      max: 10
databases:
  main:
    engine: postgres
    plan: performance
    read_replicas:
      - region: eu-west-1
  cache:
    engine: redis
    plan: pro
search:
  products:
    type: search
queues:
  tasks:
    type: queue
```

Monthly cost: $200-800
Common mistakes at each phase
| Phase | Mistake | Fix |
|---|---|---|
| Build | Over-engineering (microservices, k8s) | Monolith + single DB |
| Build | No backups | Use managed DB (automatic) |
| Grow | No caching | Add Redis for hot paths |
| Grow | No staging environment | Clone production config |
| Grow | Ignoring slow queries | Run `rf db insights` weekly |
| Scale | No auto-scaling | Configure min/max with CPU target |
| Scale | Single region | Add regions where users are |
| Scale | No read replicas | Route reads to replicas |
FAQ
When should I add a second service?
When you have background work that slows down API responses. Extract it as a worker. This is usually the first split.
When do I need multi-region?
When a significant share of your users are on other continents and they complain about latency. Check your analytics — if 30%+ of traffic comes from another continent, add that region.
How much should infrastructure cost relative to revenue?
5-15% of revenue is healthy. Under 5% means you're probably under-investing. Over 20% means you're over-provisioned or on the wrong platform.
Should I use Kubernetes?
Probably not. Kubernetes makes sense at 20+ services with dedicated DevOps staff. Below that, a managed platform like RaidFrame handles everything Kubernetes does with zero operational overhead.
When should I worry about compliance?
As soon as you handle health data (HIPAA), payment cards (PCI), or EU user data (GDPR). Don't wait for a customer to ask — by then it's a scramble.
Ship faster with RaidFrame
Auto-scaling compute, managed databases, global CDN, and zero-config CI/CD. Free tier included.