Caching & Redis

Upstash Redis for API rate limiting and caching — category-based limits, AI quotas, and fail-safe design

Kit ships with Upstash Redis as the caching and rate limiting backbone. The system provides category-based API rate limiting (upload, email, contact, payments, webhooks), tier-based AI quotas (free through enterprise), and a fail-safe design that degrades gracefully when Redis is unavailable.
This page covers setup, the rate limiting architecture, AI quotas, and debugging tools. For the security overview, see Security. For AI-specific cost management, see Cost Management.

How It Works

Rate limiting uses a two-layer architecture — an ephemeral in-memory cache reduces Redis calls by 50-80%, and Redis provides the persistent sliding window counters:
```
API Request
    |
    v
Rate Limit Middleware (withRateLimit)
    |--- Extracts userId (Clerk) and IP (headers)
    |
    v
Ephemeral Cache (in-memory Map)
    |--- Hit? → Return cached result (no Redis call)
    |--- Miss? → Continue to Redis
    |
    v
Upstash Redis (sliding window algorithm)
    |--- Checks user-based limit (if userId available)
    |--- Checks IP-based limit (if IP available)
    |--- Returns: { success, limit, remaining, reset }
    |
    v
Response
    |--- 200 OK + X-RateLimit-* headers (if allowed)
    |--- 429 Too Many Requests (if rate limited)
```
The ephemeral cache is a module-scope Map that persists across requests within the same serverless instance. This is safe because rate limiting is inherently approximate — a few extra requests during cache revalidation are acceptable.
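A minimal sketch of this pattern (names and the TTL value are illustrative, not Kit's actual internals): a module-scope Map caches recent rate limit results for a short TTL so repeated checks for the same identifier skip the Redis round trip.

```typescript
// Illustrative sketch of an ephemeral rate-limit cache (hypothetical names).
interface RateLimitResult {
  success: boolean
  limit: number
  remaining: number
  reset: number
}

// Module-scope: survives across requests within one serverless instance.
const ephemeralCache = new Map<string, { result: RateLimitResult; expires: number }>()
const EPHEMERAL_TTL_MS = 10_000 // short TTL keeps results approximately fresh

function getCachedResult(key: string, now = Date.now()): RateLimitResult | null {
  const entry = ephemeralCache.get(key)
  if (!entry) return null
  if (now > entry.expires) {
    ephemeralCache.delete(key) // expired: fall through to Redis
    return null
  }
  return entry.result
}

function setCachedResult(key: string, result: RateLimitResult, now = Date.now()): void {
  ephemeralCache.set(key, { result, expires: now + EPHEMERAL_TTL_MS })
}
```

Because the cached result can be slightly stale, a few extra requests may slip through near the window boundary, which is exactly the approximation the text above describes as acceptable.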

Setup

1. Create an Upstash Redis database

Sign up at upstash.com and create a new Redis database. Select the region closest to your deployment. The free tier includes 500K commands/month — sufficient for most SaaS applications.
2. Set environment variables

Copy the REST URL and token from your Upstash dashboard and add them to apps/boilerplate/.env.local:
```bash
UPSTASH_REDIS_REST_URL=https://your-database.upstash.io
UPSTASH_REDIS_REST_TOKEN=AXxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

API Rate Limiting

The API rate limiter uses a category-based system where each API category has independent limits for user-based and IP-based tracking:
src/lib/security/api-rate-limiter.ts — API_LIMITS Configuration
```typescript
const API_LIMITS: Record<
  APICategory,
  { user?: RateLimitConfig; ip?: RateLimitConfig }
> = {
  upload: {
    user: { requests: 10, window: '1 h', identifier: 'user' },
    ip: { requests: 20, window: '1 h', identifier: 'ip' },
  },
  email: {
    user: { requests: 5, window: '1 h', identifier: 'user' },
    ip: { requests: 10, window: '1 h', identifier: 'ip' },
  },
  contact: {
    ip: { requests: 3, window: '1 h', identifier: 'ip' },
  },
  payments: {
    user: { requests: 20, window: '1 h', identifier: 'user' },
  },
  webhooks: {
    ip: { requests: 100, window: '1 h', identifier: 'ip' },
  },
  api: {
    user: { requests: 100, window: '1 h', identifier: 'user' },
    ip: { requests: 200, window: '1 h', identifier: 'ip' },
  },
}
```

Rate Limit Categories

| Category | User Limit | IP Limit | Use Case |
| --- | --- | --- | --- |
| `upload` | 10 req/hour | 20 req/hour | File upload endpoint |
| `email` | 5 req/hour | 10 req/hour | Email sending endpoints |
| `contact` | | 3 req/hour | Contact form (IP-only, no auth required) |
| `payments` | 20 req/hour | | Payment and checkout endpoints |
| `webhooks` | | 100 req/hour | Webhook processing (high throughput) |
| `api` | 100 req/hour | 200 req/hour | General API endpoints (catch-all) |
Each category can have a user-based limit, IP-based limit, or both. When both are configured, the request must pass both checks.
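The "must pass both checks" rule can be sketched with an illustrative combiner (a hypothetical helper, not the actual api-rate-limiter code): the request succeeds only if every configured check succeeds, and the response reports the most restrictive counts.

```typescript
// Hypothetical combiner for user- and IP-based check results.
interface RateLimitResult {
  success: boolean
  limit: number
  remaining: number
  reset: number
}

function combineResults(results: RateLimitResult[]): RateLimitResult {
  if (results.length === 0) {
    throw new Error('at least one configured check is required')
  }
  return results.reduce((acc, r) => ({
    success: acc.success && r.success,           // all checks must pass
    limit: Math.min(acc.limit, r.limit),         // report the tightest limit
    remaining: Math.min(acc.remaining, r.remaining),
    reset: Math.max(acc.reset, r.reset),         // latest window reset wins
  }))
}
```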

Using the Rate Limit Middleware

The withRateLimit middleware factory wraps API route handlers with automatic rate limiting:
src/lib/security/rate-limit-middleware.ts — withRateLimit Signature
```typescript
export function withRateLimit(
  category: APICategory,
  handler: (request: NextRequest) => Promise<NextResponse>
): (request: NextRequest) => Promise<NextResponse> {
  // ...
}
```

Usage in API Routes

```typescript
import { NextResponse } from 'next/server'
import { withRateLimit } from '@/lib/security/rate-limit-middleware'

// Wrap your handler with a rate limit category
export const POST = withRateLimit('upload', async (request) => {
  // Your handler logic — only runs if rate limit passes
  return NextResponse.json({ success: true })
})
```
The middleware automatically:
  1. Extracts the user ID from Clerk authentication
  2. Extracts the client IP from x-forwarded-for or x-real-ip headers
  3. Checks both user and IP rate limits for the category
  4. Returns 429 Too Many Requests with retry information if exceeded
  5. Adds X-RateLimit-* headers to all responses
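Step 2's IP extraction can be sketched as follows (a simplified illustration; the real middleware may handle more edge cases): x-forwarded-for can carry a comma-separated proxy chain, and the first entry is the original client.

```typescript
// Simplified sketch of client-IP extraction from proxy headers.
function getClientIp(headers: Headers): string | null {
  const forwarded = headers.get('x-forwarded-for')
  if (forwarded) {
    // May contain "client, proxy1, proxy2" — the first entry is the client.
    return forwarded.split(',')[0].trim()
  }
  return headers.get('x-real-ip')
}
```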

AI Rate Limiting

The AI system has its own dedicated rate limiter with tier-based monthly quotas and global burst protection:

Tier-Based Monthly Quotas

| Tier | Monthly Requests | Window | Override Variable |
| --- | --- | --- | --- |
| Free | 20 | 30 days | `AI_FREE_TIER_REQUESTS` |
| Basic | 100 | 30 days | `AI_BASIC_TIER_REQUESTS` |
| Pro | 1,000 | 30 days | `AI_PRO_TIER_REQUESTS` |
| Enterprise | 10,000 | 30 days | `AI_ENTERPRISE_TIER_REQUESTS` |

Three-Layer Check

Every AI request goes through three sequential checks:
  1. Global burst protection — 10 requests per 10 seconds (configurable via AI_RATE_LIMIT_MAX_REQUESTS and AI_RATE_LIMIT_WINDOW). This prevents any single user from flooding the AI endpoints.
  2. Credit balance check — If the credit system is enabled (NEXT_PUBLIC_PRICING_MODE=credits), the system checks whether the user has sufficient credits for the operation.
  3. Auto-reset — If 30+ days have elapsed since the last credit reset, credits are automatically reset as a backup for missed webhook events.
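The three checks above can be sketched as a sequential pipeline (dependencies are stubbed and all names hypothetical; the real logic lives in src/lib/ai/rate-limiter.ts):

```typescript
// Hypothetical sketch of the three sequential AI checks.
interface AiCheckDeps {
  burstLimit: (userId: string) => Promise<boolean>   // global burst limiter
  creditsEnabled: boolean                            // NEXT_PUBLIC_PRICING_MODE=credits
  hasCredits: (userId: string, cost: number) => Promise<boolean>
  daysSinceReset: (userId: string) => Promise<number>
  resetCredits: (userId: string) => Promise<void>
}

async function checkAiRequest(
  userId: string,
  cost: number,
  deps: AiCheckDeps
): Promise<{ allowed: boolean; reason?: string }> {
  // 1. Global burst protection (e.g. 10 requests per 10 seconds).
  if (!(await deps.burstLimit(userId))) {
    return { allowed: false, reason: 'burst_limit' }
  }
  // 2. Credit balance check, only when the credit system is enabled.
  if (deps.creditsEnabled && !(await deps.hasCredits(userId, cost))) {
    return { allowed: false, reason: 'insufficient_credits' }
  }
  // 3. Auto-reset backup for missed webhook events.
  if ((await deps.daysSinceReset(userId)) >= 30) {
    await deps.resetCredits(userId)
  }
  return { allowed: true }
}
```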

Response Headers

Every rate-limited response includes standard headers so clients can track their usage:
| Header | Value | Description |
| --- | --- | --- |
| `X-RateLimit-Limit` | 100 | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | 87 | Requests remaining in current window |
| `X-RateLimit-Reset` | 1708012800000 | Unix timestamp (ms) when the window resets |
When the limit is exceeded, the 429 response body includes:
```json
{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again in 45 minutes.",
  "retryAfter": "45 minutes",
  "limit": 10,
  "remaining": 0,
  "reset": 1708012800000
}
```
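The human-readable retryAfter string can be derived from the reset timestamp. A minimal sketch (the field names match the response body above; the helper itself is hypothetical):

```typescript
// Hypothetical helper producing the "45 minutes" style retryAfter value.
function formatRetryAfter(resetMs: number, nowMs: number = Date.now()): string {
  const minutes = Math.max(1, Math.ceil((resetMs - nowMs) / 60_000))
  if (minutes < 60) return `${minutes} minutes`
  const hours = Math.ceil(minutes / 60)
  return `${hours} hour${hours > 1 ? 's' : ''}`
}
```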

Server-Side Caching Patterns

Beyond rate limiting, Kit uses module-scope caching for configuration data that rarely changes but is read frequently:
```typescript
// Example: Caching pricing configuration
// Module-scope variable persists across requests in the same serverless instance
let cachedPricingConfig: PricingConfig | null = null
let lastFetched = 0
const CACHE_TTL = 5 * 60 * 1000 // 5 minutes

export async function getPricingConfig(): Promise<PricingConfig> {
  const now = Date.now()
  if (cachedPricingConfig && now - lastFetched < CACHE_TTL) {
    return cachedPricingConfig
  }

  cachedPricingConfig = await fetchPricingFromDB()
  lastFetched = now
  return cachedPricingConfig
}
```
This pattern is used for payment plan configuration, credit cost tables, and other data that changes infrequently. It avoids database queries on every request without requiring Redis.

Fail-Safe Design

The rate limiting system is designed to fail open — if Redis is unavailable, requests are allowed through with a warning. This prevents Redis outages from blocking legitimate user traffic:
```
Redis Available?
    |
    ├── Yes → Normal rate limiting
    |         (check limits, return success/failure)
    |
    └── No  → Fail-open mode
              |--- Log warning: "Rate limiting disabled - Redis not available"
              |--- Return success with max limits
              |--- Request proceeds normally
```
This pattern is applied consistently across both the API rate limiter and the AI rate limiter. Individual Redis call failures are also caught and handled with fail-open behavior.
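A sketch of this fail-open behavior (illustrative only; the real code wraps its Redis-backed checks in a similar way): any error from the rate limit check is caught, logged, and converted into a permissive result.

```typescript
interface RateLimitResult {
  success: boolean
  limit: number
  remaining: number
  reset: number
}

// Fail-open wrapper: a Redis outage must not block legitimate traffic.
async function limitFailOpen(
  check: (id: string) => Promise<RateLimitResult>,
  id: string,
  maxLimit: number
): Promise<RateLimitResult> {
  try {
    return await check(id)
  } catch (err) {
    console.warn('Rate limiting disabled - Redis not available', err)
    // Permissive fallback: report full limits and let the request through.
    return { success: true, limit: maxLimit, remaining: maxLimit, reset: Date.now() }
  }
}
```

The trade-off is deliberate: during an outage some abusive traffic may get through, but no legitimate user is blocked by infrastructure failure.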

Debugging Tools

Kit includes two scripts for inspecting and resetting Redis rate limit data:
| Script | Command | Purpose |
| --- | --- | --- |
| `scripts/inspect-redis-keys.ts` | `npx tsx scripts/inspect-redis-keys.ts` | List all rate limit keys in Redis with their values and TTLs |
| `scripts/reset-rate-limits.ts` | `npx tsx scripts/reset-rate-limits.ts` | Reset all rate limit counters (useful during development) |
Both scripts connect to your Upstash Redis instance using the same environment variables.

Redis Key Patterns

Rate limit keys follow a structured prefix pattern for easy identification:
| Prefix | System | Example Key |
| --- | --- | --- |
| `@upstash/ratelimit/api/{category}/{type}` | API rate limiter | `@upstash/ratelimit/api/upload/user:user:abc123` |
| `@upstash/ratelimit/ai/global` | AI global burst | `@upstash/ratelimit/ai/global:user:abc123` |
| `@upstash/ratelimit/ai/{tier}` | AI tier limit | `@upstash/ratelimit/ai/pro:user:abc123` |
All keys are managed by the @upstash/ratelimit library and include automatic expiration based on the sliding window duration.

Environment Variables

| Variable | Required | Default | Purpose |
| --- | --- | --- | --- |
| `UPSTASH_REDIS_REST_URL` | No | | Upstash Redis REST API URL |
| `UPSTASH_REDIS_REST_TOKEN` | No | | Upstash Redis REST API token |
| `AI_RATE_LIMIT_WINDOW` | No | 10 | Global burst window in seconds |
| `AI_RATE_LIMIT_MAX_REQUESTS` | No | 10 | Global burst max requests per window |
| `AI_FREE_TIER_REQUESTS` | No | 20 | Free tier monthly AI request limit |
| `AI_BASIC_TIER_REQUESTS` | No | 100 | Basic tier monthly AI request limit |
| `AI_PRO_TIER_REQUESTS` | No | 1000 | Pro tier monthly AI request limit |
| `AI_ENTERPRISE_TIER_REQUESTS` | No | 10000 | Enterprise tier monthly AI request limit |

Key Files

| File | Purpose |
| --- | --- |
| `apps/boilerplate/src/lib/security/api-rate-limiter.ts` | Category-based API rate limiting (upload, email, contact, etc.) |
| `apps/boilerplate/src/lib/security/rate-limit-middleware.ts` | `withRateLimit()` middleware factory for API routes |
| `apps/boilerplate/src/lib/ai/rate-limiter.ts` | AI-specific rate limiting (tier-based, global burst) |
| `apps/boilerplate/scripts/inspect-redis-keys.ts` | Debug script to inspect Redis keys and TTLs |
| `apps/boilerplate/scripts/reset-rate-limits.ts` | Debug script to reset all rate limit counters |