Defending API Gateways with a Multi-Tiered IP-Aware Rate Limiter

One morning in early 2025, my phone woke me up with a series of critical alerts from our communications gateway: our Twilio SMS billing had spiked by $1,400 in under three hours.

A distributed botnet was targeting our mobile OTP login endpoint. Rather than attacking from a single server, the attacker cycled scripted requests through hundreds of rotating proxy IPs. Naive per-IP limits were useless because each proxy IP hit the server only once or twice.

To protect our infrastructure and communications budget, we built a three-tier, user-agent-aware rate limiter and IP validation gate backed by MongoDB, which blocked 99.9% of the automated script requests.

Here is the post-mortem of the attack and our defense architecture.

Deconstructing the Proxy Tunnel

When a request arrives at your API gateway, looking at req.socket.remoteAddress is usually not enough. If you route traffic through load balancers (like Google Cloud Run, Cloudflare, or AWS ELB), that remote address points to the load balancer itself, not the user.

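Instead, proxies pass the original client IP along in the X-Forwarded-For header. But this header is easy to forge: a client can send its own X-Forwarded-For value, and most proxies append to the existing chain rather than replace it. The leftmost entries are therefore only trustworthy if your edge proxy strips or overwrites any inbound copy of the header. Our gateway sanitizer scans the comma-separated chain, extracts the first public IP address, and screens out private network addresses.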

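export function extractTrueClientIP(headers: Record<string, string | string[] | undefined>): string {
  const forwardedFor = headers['x-forwarded-for'];
  // A repeated header can arrive as an array; flatten it into one chain
  const chain = Array.isArray(forwardedFor) ? forwardedFor.join(',') : forwardedFor ?? '';
  const ips = chain.split(',').map(ip => ip.trim()).filter(ip => ip.length > 0);

  // RFC 1918 private ranges plus loopback; 172.16.0.0/12 spans 172.16.x.x to 172.31.x.x
  const isPrivate = (ip: string) =>
    ip.startsWith('10.') ||
    ip.startsWith('192.168.') ||
    ip.startsWith('127.') ||
    /^172\.(1[6-9]|2\d|3[01])\./.test(ip);

  // Scanning left to right, the first public address is treated as the client origin.
  // This assumes the edge proxy discards any X-Forwarded-For value sent by the client.
  const clientIp = ips.find(ip => !isPrivate(ip));
  return clientIp ?? 'unknown-client-ip';
}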
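
If the edge proxy appends to an inbound X-Forwarded-For instead of replacing it, a stricter option is to count entries from the right, since only the entries added by proxies you operate can be trusted. The following is a minimal sketch of that idea, not our production code: the helper names and the trustedHops parameter are hypothetical, and it handles IPv4 only.

```typescript
// Hypothetical helpers for illustration; IPv4 only.
const NON_PUBLIC_V4: RegExp[] = [
  /^10\./,                       // RFC 1918
  /^192\.168\./,                 // RFC 1918
  /^172\.(1[6-9]|2\d|3[01])\./,  // RFC 1918 (172.16.0.0/12)
  /^127\./,                      // loopback
  /^169\.254\./,                 // link-local
];

// True when the string is a well-formed dotted quad outside the ranges listed above.
function isPublicIPv4(ip: string): boolean {
  const parts = ip.split('.');
  const wellFormed =
    parts.length === 4 &&
    parts.every(p => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  return wellFormed && !NON_PUBLIC_V4.some(re => re.test(ip));
}

// Each trusted proxy appends the address of the peer it received the request from,
// so with N trusted proxies the client address is the Nth entry from the right.
// Anything further left was supplied by the client and is ignored.
function clientIpFromForwardedFor(
  header: string | string[] | undefined,
  trustedHops: number
): string | null {
  if (header === undefined || trustedHops < 1) return null;
  const raw = Array.isArray(header) ? header.join(',') : header;
  const chain = raw.split(',').map(s => s.trim()).filter(s => s.length > 0);
  const candidate = chain[chain.length - trustedHops];
  if (candidate === undefined) return null;
  return isPublicIPv4(candidate) ? candidate : null;
}
```

Here trustedHops is the number of proxies you operate between the internet and the app; with a single load balancer it is 1. Express exposes the same idea through its trust proxy setting.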

[Figure: API Gateway Security and Rate Limiting Flow]

The Three-Tier Gate Architecture

To stop rotating proxy attacks without blocking genuine users who share IP networks (like corporate offices or school campuses), we designed a three-tier sliding check:

[Figure: Multi-Tier Rate Limiting Architecture]

Request User-Agent Check: Immediately drops requests whose User-Agent header names an automated HTTP library (like python-requests, axios, or curl). The header is client-controlled, so this tier only filters out unsophisticated scripts.
IP Sanitization: Resolves the client origin from the forwarded header chain and discards private network addresses.
Multi-Window Sliding Log: Compares request history against three time windows: 5 requests per minute, 15 requests per hour, and 50 requests per day.

Coding the Sliding Log Limiter

Here is our Node.js implementation of the multi-window limiter using atomic MongoDB findOneAndUpdate calls. Instead of standing up an external Redis cluster on a tight infrastructure budget, we store each client's request timestamps as an array inside a single document and use it as the sliding-window log.

import { Db } from 'mongodb';

interface RateLimitCheck {
  ip: string;
  identifier: string; // e.g. target Phone Number or Email
  userAgent: string;
}

export async function checkRateLimit(
  db: Db,
  params: RateLimitCheck
): Promise<{ allowed: boolean; reason?: string }> {
  const now = Date.now();
  const limitsCollection = db.collection('rate_limits');

  // 1. Instantly drop common HTTP script libraries
  const ua = params.userAgent.toLowerCase();
  if (ua.includes('python-requests') || ua.includes('axios') || ua.includes('curl')) {
    return { allowed: false, reason: 'Script access forbidden' };
  }

  const minWindow = now - 60 * 1000;          // 1 Minute
  const hourWindow = now - 60 * 60 * 1000;    // 1 Hour
  const dayWindow = now - 24 * 60 * 60 * 1000; // 24 Hours

  // 2. Append request epoch and fetch log history atomically.
  // $slice keeps only the newest MAX_LOG_ENTRIES timestamps, so the array stays
  // bounded even while rejected requests keep arriving.
  const MAX_LOG_ENTRIES = 100; // must stay above the largest limit (50 per day)
  const result = await limitsCollection.findOneAndUpdate(
    { 
      ip: params.ip, 
      identifier: params.identifier 
    },
    {
      $push: { requests: { $each: [now], $slice: -MAX_LOG_ENTRIES } },
      $setOnInsert: { createdAt: new Date() }
    },
    {
      upsert: true,
      returnDocument: 'after'
    }
  );

  // Driver v6+ returns the document itself; v4 and v5 wrap it as result.value
  if (!result || !result.requests) {
    return { allowed: true };
  }

  const allRequests: number[] = result.requests;

  // 3. Filter timestamps inside window bounds
  const minRequests = allRequests.filter(t => t > minWindow);
  const hourRequests = allRequests.filter(t => t > hourWindow);
  const dayRequests = allRequests.filter(t => t > dayWindow);

  // 4. Verify request counts (the current request is already in the log)
  if (minRequests.length > 5) {
    return { allowed: false, reason: 'Limit exceeded: 5 requests per minute allowed.' };
  }
  if (hourRequests.length > 15) {
    return { allowed: false, reason: 'Limit exceeded: 15 requests per hour allowed.' };
  }
  if (dayRequests.length > 50) {
    return { allowed: false, reason: 'Limit exceeded: 50 requests per day allowed.' };
  }

  // 5. No separate cleanup pass is needed: the $slice in step 2 trims the log on
  // every write, including writes for requests that end up rejected.

  return { allowed: true };
}
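
The window arithmetic can be exercised without a database. The sketch below is an in-memory illustration of the same three rules, extended to estimate how long a rejected client has to wait. The names WindowRule, Verdict, and evaluateSlidingLog are hypothetical, and the wait estimate assumes no further requests are logged in the meantime.

```typescript
// Hypothetical types and names for illustration only.
interface WindowRule { label: string; windowMs: number; maxRequests: number; }
interface Verdict { allowed: boolean; violated?: string; retryAfterSeconds?: number; }

const RULES: WindowRule[] = [
  { label: 'minute', windowMs: 60 * 1000, maxRequests: 5 },
  { label: 'hour', windowMs: 60 * 60 * 1000, maxRequests: 15 },
  { label: 'day', windowMs: 24 * 60 * 60 * 1000, maxRequests: 50 },
];

// `log` holds request timestamps in ms and already includes the current request,
// mirroring the push-then-count order used in checkRateLimit.
function evaluateSlidingLog(log: number[], now: number, rules: WindowRule[] = RULES): Verdict {
  let verdict: Verdict = { allowed: true };
  for (const rule of rules) {
    const inWindow = log
      .filter(t => t > now - rule.windowMs)
      .sort((a, b) => a - b);
    if (inWindow.length <= rule.maxRequests) continue;

    // A new request fits once at most (maxRequests - 1) logged entries remain in
    // the window, i.e. once this entry has aged out of it.
    const blocker = inWindow[inWindow.length - rule.maxRequests];
    if (blocker === undefined) continue;
    const wait = Math.ceil((blocker + rule.windowMs - now) / 1000);

    // Report the rule that forces the longest wait.
    if (verdict.allowed || wait > (verdict.retryAfterSeconds ?? 0)) {
      verdict = { allowed: false, violated: rule.label, retryAfterSeconds: wait };
    }
  }
  return verdict;
}
```

A value like retryAfterSeconds could be returned to clients in a Retry-After header alongside the 429 status, so well-behaved callers back off instead of retrying immediately.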

[Figure: Sliding Window Requests Timeline]

Packaging the Gate as Express Middleware

To integrate this validation cleanly across our backend endpoints, we wrapped the verification logic in a reusable Express middleware function. It intercepts incoming requests before the route handler runs or any third-party SMS API is invoked. Because it reads req.body, it has to be registered after a body parser such as express.json().

import { Request, Response, NextFunction } from 'express';
import { MongoClient } from 'mongodb';

export function gatekeeperMiddleware(client: MongoClient) {
  return async (req: Request, res: Response, next: NextFunction) => {
    const db = client.db('security');
    
    // Extract true client IP safely from proxy headers
    const clientIp = extractTrueClientIP(req.headers);
    
    // Identify client by phone/email input in request body
    // (req.body is undefined when no body parser ran, hence the optional chaining)
    const identifier = req.body?.phone || req.body?.email || 'anonymous';
    const userAgent = req.headers['user-agent'] || 'unknown';

    try {
      const check = await checkRateLimit(db, {
        ip: clientIp,
        identifier,
        userAgent
      });

      if (!check.allowed) {
        return res.status(429).json({
          success: false,
          error: 'Too Many Requests',
          message: check.reason
        });
      }

      next(); // Validation passed, forward to main handler
    } catch (err) {
      console.error('[GATEKEEPER ERROR] Rate check failed:', err);
      // Failing open keeps logins working during a database outage but also
      // disables the limiter; respond with a 503 here instead to fail closed.
      next(); 
    }
  };
}
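
One weakness worth noting: the middleware keys the limiter on the raw phone or email string, so "+1 (555) 010-2000" and "15550102000" would be tracked as different clients. A possible hardening step is to normalize the identifier before it reaches checkRateLimit. This is a rough sketch with a hypothetical helper name; stripping non-digits does not reconcile numbers written with and without a country code, which would need proper E.164 parsing.

```typescript
// Hypothetical helper: collapses formatting variants of the same phone or email
// into one rate-limit key. Not part of the original middleware.
function normalizeIdentifier(
  body: { phone?: unknown; email?: unknown } | undefined
): string {
  const phone = typeof body?.phone === 'string' ? body.phone : '';
  const email = typeof body?.email === 'string' ? body.email : '';

  // Keep digits only, so spaces, dashes, parentheses, and "+" do not create new keys.
  const digits = phone.replace(/\D/g, '');
  if (digits.length > 0) return `phone:${digits}`;

  // Trim surrounding whitespace and lowercase the address.
  const normalizedEmail = email.trim().toLowerCase();
  if (normalizedEmail.length > 0) return `email:${normalizedEmail}`;

  // Same fallback value the middleware uses when no identifier is present.
  return 'anonymous';
}
```

The type prefixes ("phone:", "email:") are an arbitrary choice to keep the two kinds of identifier from colliding in the same collection.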

Database Tuning for High-Frequency Access

Because the rate limiter runs on every single API hit, its database queries must be fast. If the rate check takes more than 10 milliseconds, the limiter itself will slow down API responses under load.

We optimized our MongoDB instances using the following operations:

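Compound Indexing: We created a compound index on the query fields { ip: 1, identifier: 1 } so that findOneAndUpdate resolves with an index lookup instead of a collection scan. Making the index unique also keeps concurrent upserts from inserting duplicate documents for the same key:
```
db.rate_limits.createIndex({ "ip": 1, "identifier": 1 }, { unique: true });
```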
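Write Concerns: To avoid the latency of waiting for a majority of replica set members to acknowledge each log update, we configured a write concern of w: 1 (acknowledgment from the primary only) on our security database. The trade-off is that a few recent log entries can be rolled back if the primary fails over, which is acceptable for rate-limit counters.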
Automatic TTL Cleanup: While we trim old timestamps inside individual documents inline, we also set a TTL (Time-To-Live) index on the createdAt field so that MongoDB automatically deletes stale documents. Because createdAt is written only on insert, each document expires 24 hours after it was created, not 24 hours after its last request, so a client's history resets once a day:
```
db.rate_limits.createIndex(
  { "createdAt": 1 }, 
  { expireAfterSeconds: 86400 } // Auto-delete document 24 hours after creation
);
```
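
A TTL index counts from the date stored in the indexed field, so expiring documents on inactivity (rather than on age) needs a field that is refreshed on every request. A minimal sketch of that variant follows, assuming a hypothetical lastSeenAt field and a hypothetical helper named buildRateLimitUpdate, neither of which is part of the implementation above:

```typescript
// Hypothetical shape of the update document; lastSeenAt is an assumed field name.
interface RateLimitUpdate {
  $push: { requests: number };
  $set: { lastSeenAt: Date };
  $setOnInsert: { createdAt: Date };
}

// Builds the update that would be passed to findOneAndUpdate for one request.
function buildRateLimitUpdate(nowMs: number): RateLimitUpdate {
  const now = new Date(nowMs);
  return {
    $push: { requests: nowMs },       // append this request to the sliding log
    $set: { lastSeenAt: now },        // refreshed on every request
    $setOnInsert: { createdAt: now }, // written once, when the document is created
  };
}

// Index specification for inactivity-based expiry; the keys and options would be
// passed to createIndex in place of the createdAt version.
const inactivityTtlIndex = {
  keys: { lastSeenAt: 1 },
  options: { expireAfterSeconds: 24 * 60 * 60 },
};
```

With this variant, an active client keeps its document and its day-long history, while idle keys are removed about 24 hours after their last request. MongoDB's TTL monitor runs periodically in the background, so deletion is not instantaneous in either setup.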

By deploying this multi-tiered architecture and sanitizing proxy chains, we blocked the botnet and brought our monthly communications bill back to its normal baseline. If you run a consumer platform with SMS-based auth, set up multi-tiered rate limiting before your bill takes the hit.