
API Rate Limiting: Protect Your API Without Killing UX

How to implement rate limiting that stops abuse without blocking real users. Algorithms, code examples, and production patterns.

RaidFrame Team

February 25, 2026 · 4 min read

TL;DR — Rate limiting protects your API from abuse, runaway clients, and DDoS. The best approach: sliding window with Redis, different limits per auth level, and clear response headers so clients can self-throttle. On RaidFrame, you can also configure rate limiting at the load balancer level — no application code needed.

Why rate limit?

Without rate limiting, a single misbehaving client can:

  • Exhaust your database connections
  • Spike your hosting bill (on usage-based platforms)
  • Degrade performance for all other users
  • Scrape your entire dataset
  • Brute-force authentication endpoints

Rate limiting algorithms

Fixed window

Count requests in fixed time windows (e.g., per minute). Simple but has a burst problem at window boundaries — a client can send 100 requests at 11:59:59 and 100 more at 12:00:00.
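
To make the boundary burst concrete, here is a minimal in-memory sketch (the function name `fixedWindowAllow` and the `Map` store are illustrative; a real deployment would use Redis `INCR` + `EXPIRE`):

```typescript
// In-memory fixed-window counter — illustrative only, not production code.
const counters = new Map<string, { windowId: number; count: number }>();

function fixedWindowAllow(
  key: string,
  limit: number,
  windowMs: number,
  now = Date.now()
): boolean {
  // Every request in the same window bucket shares one counter
  const windowId = Math.floor(now / windowMs);
  const entry = counters.get(key);
  if (!entry || entry.windowId !== windowId) {
    // New window: counter resets to zero — this reset is the burst problem
    counters.set(key, { windowId, count: 1 });
    return true;
  }
  entry.count += 1;
  return entry.count <= limit;
}
```

With a limit of 100/min, a client can exhaust the full quota at the last millisecond of one window and again at the first millisecond of the next — 200 requests back to back.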

Sliding window

Fixes the boundary burst by tracking each request's timestamp in a Redis sorted set, so the window slides continuously with the clock instead of resetting at fixed boundaries:

import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);

async function slidingWindowRateLimit(
  key: string,
  limit: number,
  windowMs: number
): Promise<{ allowed: boolean; remaining: number; resetMs: number }> {
  const now = Date.now();
  const windowStart = now - windowMs;

  // One pipelined round trip: prune, record, count, refresh expiry
  const pipe = redis.pipeline();
  pipe.zremrangebyscore(key, 0, windowStart);                // drop requests older than the window
  pipe.zadd(key, now.toString(), `${now}-${Math.random()}`); // record this request (unique member)
  pipe.zcard(key);                                           // count requests still inside the window
  pipe.expire(key, Math.ceil(windowMs / 1000));              // let idle keys expire on their own

  const results = await pipe.exec();
  const count = results![2][1] as number;

  // Note: rejected requests are still recorded, so a client that keeps
  // hammering stays limited until it actually backs off.
  return {
    allowed: count <= limit,
    remaining: Math.max(0, limit - count),
    resetMs: windowMs,
  };
}

Token bucket

Tokens refill at a steady rate. Allows short bursts while maintaining an average rate. Good for APIs where occasional bursts are acceptable.

async function tokenBucket(
  key: string,
  maxTokens: number,
  refillRate: number, // tokens per second
): Promise<{ allowed: boolean; remaining: number }> {
  const now = Date.now();
  const bucketKey = `bucket:${key}`;

  const data = await redis.hgetall(bucketKey);
  let tokens = data.tokens ? parseFloat(data.tokens) : maxTokens;
  const lastRefill = data.lastRefill ? parseInt(data.lastRefill, 10) : now;

  // Refill tokens based on time elapsed since the last request
  const elapsed = (now - lastRefill) / 1000;
  tokens = Math.min(maxTokens, tokens + elapsed * refillRate);

  if (tokens < 1) {
    return { allowed: false, remaining: 0 };
  }

  tokens -= 1;
  // Caveat: this read-modify-write is not atomic. Under concurrent requests for
  // the same key, move the whole check into a Lua script (EVAL) so Redis runs it
  // in one step.
  await redis.hset(bucketKey, { tokens: tokens.toString(), lastRefill: now.toString() });
  // Expire idle buckets once they would have fully refilled anyway
  await redis.expire(bucketKey, Math.ceil(maxTokens / refillRate) + 1);

  return { allowed: true, remaining: Math.floor(tokens) };
}

Implementation patterns

Per-user limits

Different limits for different tiers:

function getLimitForUser(user: User): { limit: number; window: number } {
  switch (user.plan) {
    case "free":       return { limit: 60,    window: 60_000 };  // 60/min
    case "pro":        return { limit: 600,   window: 60_000 };  // 600/min
    case "enterprise": return { limit: 6000,  window: 60_000 };  // 6000/min
    default:           return { limit: 20,    window: 60_000 };  // anonymous
  }
}
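
One way to wire these tiers into the sliding-window limiter above is a small resolver that produces the key and limits to check (`resolveUserLimit` and the `User` shape here are hypothetical glue, not part of any framework):

```typescript
interface User {
  id: string;
  plan: string;
}

// getLimitForUser from above, repeated so this sketch runs standalone
function getLimitForUser(user: User): { limit: number; window: number } {
  switch (user.plan) {
    case "free":       return { limit: 60,   window: 60_000 };
    case "pro":        return { limit: 600,  window: 60_000 };
    case "enterprise": return { limit: 6000, window: 60_000 };
    default:           return { limit: 20,   window: 60_000 };
  }
}

// Hypothetical glue: resolve the tier limit plus a stable per-user Redis key,
// ready to hand to slidingWindowRateLimit (or any limiter).
function resolveUserLimit(user: User): { key: string; limit: number; windowMs: number } {
  const { limit, window } = getLimitForUser(user);
  return { key: `user:${user.id}`, limit, windowMs: window };
}
```

Middleware then becomes one line: resolve, call the limiter, set headers.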

Per-endpoint limits

Tighter limits on expensive operations:

const endpointLimits: Record<string, { limit: number; window: number }> = {
  "POST /api/auth/login":  { limit: 5,   window: 60_000 },   // 5/min (brute force protection)
  "POST /api/uploads":     { limit: 10,  window: 60_000 },   // 10/min (expensive)
  "GET /api/search":       { limit: 30,  window: 60_000 },   // 30/min (heavy query)
  "default":               { limit: 100, window: 60_000 },   // 100/min (everything else)
};
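
Resolving a request against this table can be as simple as an exact `"METHOD path"` match with a fallback (the helper name `getEndpointLimit` is illustrative):

```typescript
// Subset of the endpointLimits table above, repeated so this sketch runs standalone
const endpointLimits: Record<string, { limit: number; window: number }> = {
  "POST /api/auth/login": { limit: 5,   window: 60_000 },
  "default":              { limit: 100, window: 60_000 },
};

// Hypothetical lookup helper: exact "METHOD path" match, else the default bucket
function getEndpointLimit(method: string, path: string): { limit: number; window: number } {
  return endpointLimits[`${method} ${path}`] ?? endpointLimits["default"];
}
```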

Response headers

Always tell clients their rate limit status:

function setRateLimitHeaders(res: Response, result: RateLimitResult) {
  res.setHeader("X-RateLimit-Limit", result.limit);
  res.setHeader("X-RateLimit-Remaining", result.remaining);
  res.setHeader("X-RateLimit-Reset", Math.ceil(Date.now() / 1000) + Math.ceil(result.resetMs / 1000));
 
  if (!result.allowed) {
    res.setHeader("Retry-After", Math.ceil(result.resetMs / 1000));
    res.status(429).json({
      error: "rate_limit_exceeded",
      message: `Rate limit exceeded. Retry after ${Math.ceil(result.resetMs / 1000)} seconds.`,
    });
  }
}
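
On the client side, self-throttling means honoring that Retry-After header. A sketch of the parsing (the helper `retryDelayMs` and the 1-second fallback are assumptions; per RFC 9110, Retry-After may be delay-seconds or an HTTP-date):

```typescript
// Hypothetical client helper: turn a 429 response's Retry-After header
// into a wait in milliseconds before the next attempt.
function retryDelayMs(retryAfter: string | null, now = Date.now()): number {
  if (!retryAfter) return 1_000; // arbitrary default backoff when the header is absent
  const seconds = Number(retryAfter);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000); // delay-seconds form
  const date = Date.parse(retryAfter);                              // HTTP-date form
  return Number.isNaN(date) ? 1_000 : Math.max(0, date - now);
}
```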

Load balancer rate limiting

On RaidFrame, configure rate limiting at the infrastructure level — before your application code runs:

services:
  api:
    load_balancer:
      rate_limit:
        requests_per_second: 100
        burst: 200
      rules:
        - path: "/api/auth/login"
          requests_per_second: 5
          burst: 10
        - path: "/api/webhooks/*"
          requests_per_second: 500
          burst: 1000

This is faster than application-level rate limiting and doesn't consume your compute resources.

FAQ

Should I rate limit by IP or API key?

Both. Use IP-based limiting for unauthenticated endpoints (login, signup). Use API key-based limiting for authenticated endpoints. Combine them for defense in depth.
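
Combining them can be as simple as a key scheme that keeps the two pools separate (the `rateLimitKey` helper and request shape are illustrative):

```typescript
// Hypothetical key scheme: authenticated requests are keyed by API key,
// anonymous ones by client IP, so the two pools don't share counters.
function rateLimitKey(req: { apiKey?: string; ip: string }): string {
  return req.apiKey ? `key:${req.apiKey}` : `ip:${req.ip}`;
}
```

For defense in depth, check both keys and reject if either limiter says no.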

What status code for rate-limited requests?

429 Too Many Requests. Always include a Retry-After header.

How do I handle rate limits in microservices?

Use a shared Redis instance for rate limit state. All services check the same counters. On RaidFrame, all services share the private network and can access the same Redis.

Won't rate limiting block legitimate traffic spikes?

Set your limits higher than normal traffic and use token bucket for burst tolerance. Monitor your 429 rate — if real users are hitting limits, increase them.

What about distributed rate limiting across regions?

Use a global Redis instance or accept eventually-consistent limits per region. For most apps, per-region limiting is fine — exact global counts aren't necessary.

