Kit ships with Upstash Redis as the caching and rate limiting backbone. The system provides category-based API rate limiting (upload, email, contact, payments, webhooks), tier-based AI quotas (free through enterprise), and a fail-safe design that degrades gracefully when Redis is unavailable.
This page covers setup, the rate limiting architecture, AI quotas, and debugging tools. For the security overview, see Security. For AI-specific cost management, see Cost Management.
How It Works
Rate limiting uses a two-layer architecture — an ephemeral in-memory cache reduces Redis calls by 50-80%, and Redis provides the persistent sliding window counters:
```
API Request
    |
    v
Rate Limit Middleware (withRateLimit)
    |--- Extracts userId (Clerk) and IP (headers)
    |
    v
Ephemeral Cache (in-memory Map)
    |--- Hit?  → Return cached result (no Redis call)
    |--- Miss? → Continue to Redis
    |
    v
Upstash Redis (sliding window algorithm)
    |--- Checks user-based limit (if userId available)
    |--- Checks IP-based limit (if IP available)
    |--- Returns: { success, limit, remaining, reset }
    |
    v
Response
    |--- 200 OK + X-RateLimit-* headers (if allowed)
    |--- 429 Too Many Requests (if rate limited)
```
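The two-layer lookup above can be sketched in a few lines. This is a minimal illustration, not Kit's actual implementation: `checkRedisLimit` is a hypothetical stand-in for the Upstash-backed check, and the TTL value is arbitrary.

```typescript
// Minimal sketch of the two-layer lookup. The Map lives at module scope,
// so it survives across requests handled by the same serverless instance.
type LimitResult = { success: boolean; limit: number; remaining: number; reset: number }

const ephemeralCache = new Map<string, { result: LimitResult; cachedAt: number }>()
const CACHE_TTL_MS = 5_000 // short TTL keeps the approximation acceptable

async function checkLimit(
  key: string,
  checkRedisLimit: (key: string) => Promise<LimitResult> // hypothetical Redis-backed check
): Promise<LimitResult> {
  const cached = ephemeralCache.get(key)
  if (cached && Date.now() - cached.cachedAt < CACHE_TTL_MS) {
    return cached.result // cache hit: no Redis round trip
  }
  const result = await checkRedisLimit(key) // cache miss: consult Redis
  ephemeralCache.set(key, { result, cachedAt: Date.now() })
  return result
}
```

Within the TTL, repeated checks for the same key never touch Redis, which is where the 50-80% reduction in Redis calls comes from.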
The ephemeral cache is a module-scope `Map` that persists across requests within the same serverless instance. This is safe because rate limiting is inherently approximate: a few extra requests during cache revalidation are acceptable.

Setup
1
Create an Upstash Redis database
Sign up at upstash.com and create a new Redis database. Select the region closest to your deployment. The free tier includes 500K commands/month — sufficient for most SaaS applications.
2
Set environment variables
Copy the REST URL and token from your Upstash dashboard and add them to apps/boilerplate/.env.local:

```bash
UPSTASH_REDIS_REST_URL=https://your-database.upstash.io
UPSTASH_REDIS_REST_TOKEN=AXxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
Redis is optional. When credentials are not configured, the system logs a warning and allows all requests (fail-open). This means you can develop locally without Redis — rate limiting simply won't be enforced.
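The optional wiring can be sketched roughly like this. The function name and return shape are illustrative, not Kit's actual code; the point is that a missing configuration yields a null client rather than an error.

```typescript
// Sketch of optional Redis wiring: when the env vars are absent, the
// client config is null and callers fall back to allowing every request.
function createRedisConfig(
  env: Record<string, string | undefined>
): { url: string; token: string } | null {
  const url = env.UPSTASH_REDIS_REST_URL
  const token = env.UPSTASH_REDIS_REST_TOKEN
  if (!url || !token) {
    console.warn('Rate limiting disabled - Redis not available')
    return null // fail-open: callers treat null as "allow everything"
  }
  return { url, token }
}
```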
API Rate Limiting
The API rate limiter uses a category-based system where each API category has independent limits for user-based and IP-based tracking:
src/lib/security/api-rate-limiter.ts — API_LIMITS Configuration
```typescript
const API_LIMITS: Record<
  APICategory,
  { user?: RateLimitConfig; ip?: RateLimitConfig }
> = {
  upload: {
    user: { requests: 10, window: '1 h', identifier: 'user' },
    ip: { requests: 20, window: '1 h', identifier: 'ip' },
  },
  email: {
    user: { requests: 5, window: '1 h', identifier: 'user' },
    ip: { requests: 10, window: '1 h', identifier: 'ip' },
  },
  contact: {
    ip: { requests: 3, window: '1 h', identifier: 'ip' },
  },
  payments: {
    user: { requests: 20, window: '1 h', identifier: 'user' },
  },
  webhooks: {
    ip: { requests: 100, window: '1 h', identifier: 'ip' },
  },
  api: {
    user: { requests: 100, window: '1 h', identifier: 'user' },
    ip: { requests: 200, window: '1 h', identifier: 'ip' },
  },
}
```
Rate Limit Categories
| Category | User Limit | IP Limit | Use Case |
|---|---|---|---|
| `upload` | 10 req/hour | 20 req/hour | File upload endpoint |
| `email` | 5 req/hour | 10 req/hour | Email sending endpoints |
| `contact` | — | 3 req/hour | Contact form (IP-only, no auth required) |
| `payments` | 20 req/hour | — | Payment and checkout endpoints |
| `webhooks` | — | 100 req/hour | Webhook processing (high throughput) |
| `api` | 100 req/hour | 200 req/hour | General API endpoints (catch-all) |
Each category can have a user-based limit, IP-based limit, or both. When both are configured, the request must pass both checks.
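The "must pass both checks" rule amounts to combining the configured results. A minimal sketch with hypothetical types; a category with no configured limits allows the request by default:

```typescript
// Sketch of combining user and IP checks: a request is allowed only
// when every configured limit for the category succeeds.
type Check = { success: boolean; remaining: number }

function combineChecks(checks: Check[]): Check {
  // .every() on an empty array is true, so an unconfigured
  // category allows the request.
  const success = checks.every((c) => c.success)
  const remaining = checks.length
    ? Math.min(...checks.map((c) => c.remaining)) // tightest limit wins
    : Infinity
  return { success, remaining }
}
```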
Using the Rate Limit Middleware
The `withRateLimit` middleware factory wraps API route handlers with automatic rate limiting:

src/lib/security/rate-limit-middleware.ts — withRateLimit Signature

```typescript
export function withRateLimit(
  category: APICategory,
  handler: (request: NextRequest) => Promise<NextResponse>
): (request: NextRequest) => Promise<NextResponse>
```
Usage in API Routes

```typescript
import { NextResponse } from 'next/server'
import { withRateLimit } from '@/lib/security/rate-limit-middleware'

// Wrap your handler with a rate limit category
export const POST = withRateLimit('upload', async (request) => {
  // Your handler logic — only runs if rate limit passes
  return NextResponse.json({ success: true })
})
```
The middleware automatically:

- Extracts the user ID from Clerk authentication
- Extracts the client IP from `x-forwarded-for` or `x-real-ip` headers
- Checks both user and IP rate limits for the category
- Returns `429 Too Many Requests` with retry information if either limit is exceeded
- Adds `X-RateLimit-*` headers to all responses
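On the client side, the reset header can drive retry timing. A small illustrative helper (a plain `Map` stands in for response headers, and the header semantics follow the table in the Response Headers section below):

```typescript
// Sketch: compute how long a client should wait before retrying,
// based on the X-RateLimit-Reset header (a Unix timestamp in ms).
function retryDelayMs(
  status: number,
  headers: Map<string, string>, // stand-in for a real Headers object
  nowMs: number
): number {
  if (status !== 429) return 0 // not rate limited: no wait needed
  const reset = Number(headers.get('x-ratelimit-reset') ?? '0')
  return Math.max(0, reset - nowMs)
}
```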
AI Rate Limiting
The AI system has its own dedicated rate limiter with tier-based monthly quotas and global burst protection:
Tier-Based Monthly Quotas
| Tier | Monthly Requests | Window | Override Variable |
|---|---|---|---|
| Free | 20 | 30 days | AI_FREE_TIER_REQUESTS |
| Basic | 100 | 30 days | AI_BASIC_TIER_REQUESTS |
| Pro | 1,000 | 30 days | AI_PRO_TIER_REQUESTS |
| Enterprise | 10,000 | 30 days | AI_ENTERPRISE_TIER_REQUESTS |
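Resolving a tier's effective quota from the override variables could look like this. The defaults mirror the table above; the function name and lookup shape are hypothetical:

```typescript
// Sketch: env overrides (AI_<TIER>_TIER_REQUESTS) take precedence
// over the built-in tier defaults.
const TIER_DEFAULTS: Record<string, number> = {
  free: 20,
  basic: 100,
  pro: 1_000,
  enterprise: 10_000,
}

function monthlyQuota(
  tier: keyof typeof TIER_DEFAULTS,
  env: Record<string, string | undefined>
): number {
  const override = env[`AI_${tier.toUpperCase()}_TIER_REQUESTS`]
  const parsed = override ? Number.parseInt(override, 10) : NaN
  // Fall back to the default when the override is missing or malformed.
  return Number.isFinite(parsed) ? parsed : TIER_DEFAULTS[tier]
}
```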
Three-Layer Check
Every AI request goes through three sequential checks:
- Global burst protection — 10 requests per 10 seconds (configurable via `AI_RATE_LIMIT_MAX_REQUESTS` and `AI_RATE_LIMIT_WINDOW`). This prevents any single user from flooding the AI endpoints.
- Credit balance check — If the credit system is enabled (`NEXT_PUBLIC_PRICING_MODE=credits`), the system checks whether the user has sufficient credits for the operation.
- Auto-reset — If 30+ days have elapsed since the last credit reset, credits are automatically reset as a backup for missed webhook events.
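The sequence above can be sketched as one function with the three gates as injected callbacks. All names here are hypothetical stand-ins for the real checks:

```typescript
// Sketch of the three sequential gates an AI request passes through.
async function checkAIRequest(opts: {
  burstOk: () => Promise<boolean>       // 1. global burst limiter
  creditsEnabled: boolean
  hasCredits: () => Promise<boolean>    // 2. credit balance (credits mode only)
  daysSinceReset: number
  resetCredits: () => Promise<void>     // 3. backup for missed webhooks
}): Promise<{ allowed: boolean; reason?: string }> {
  if (!(await opts.burstOk())) return { allowed: false, reason: 'burst' }
  if (opts.creditsEnabled && !(await opts.hasCredits())) {
    return { allowed: false, reason: 'credits' }
  }
  if (opts.daysSinceReset >= 30) await opts.resetCredits() // auto-reset
  return { allowed: true }
}
```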
Response Headers
Every rate-limited response includes standard headers so clients can track their usage:
| Header | Value | Description |
|---|---|---|
| `X-RateLimit-Limit` | `100` | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | `87` | Requests remaining in the current window |
| `X-RateLimit-Reset` | `1708012800000` | Unix timestamp (ms) when the window resets |
When the limit is exceeded, the 429 response body includes:

```json
{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please try again in 45 minutes.",
  "retryAfter": "45 minutes",
  "limit": 10,
  "remaining": 0,
  "reset": 1708012800000
}
```
Server-Side Caching Patterns
Beyond rate limiting, Kit uses module-scope caching for configuration data that rarely changes but is read frequently:
```typescript
// Example: Caching pricing configuration
// Module-scope variable persists across requests in the same serverless instance
let cachedPricingConfig: PricingConfig | null = null
let lastFetched = 0
const CACHE_TTL = 5 * 60 * 1000 // 5 minutes

export async function getPricingConfig(): Promise<PricingConfig> {
  const now = Date.now()
  if (cachedPricingConfig && now - lastFetched < CACHE_TTL) {
    return cachedPricingConfig
  }
  cachedPricingConfig = await fetchPricingFromDB()
  lastFetched = now
  return cachedPricingConfig
}
```
This pattern is used for payment plan configuration, credit cost tables, and other data that changes infrequently. It avoids database queries on every request without requiring Redis.
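For configuration that must be complete before it is served, a validate-then-cache guard keeps partial data out of the cache. A dependency-free sketch (Kit's real code validates with a Zod schema; the config fields here are hypothetical):

```typescript
// Sketch of "validate, then cache": the config is only stored once every
// required field is present, so partially-loaded data is never served.
type PricingConfig = { tierIds: string[]; variantIds: string[] }

let cached: PricingConfig | null = null

function cacheIfValid(raw: Partial<PricingConfig>): PricingConfig | null {
  const valid =
    Array.isArray(raw.tierIds) && raw.tierIds.length > 0 &&
    Array.isArray(raw.variantIds) && raw.variantIds.length > 0
  if (!valid) return cached // keep serving the last known-good config
  cached = { tierIds: raw.tierIds!, variantIds: raw.variantIds! }
  return cached
}
```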
For critical configuration like pricing tiers and billing variant IDs, Kit uses a 4-layer validation architecture: Environment Loading → Zod Schema Validation → Cache Storage → TTL-Based Invalidation. The Zod schema validates ALL required fields before caching — this prevents partially-loaded pricing data from being served, which would break payment flows. See apps/boilerplate/src/lib/payments/config.ts and apps/boilerplate/src/lib/credits/config.ts for the implementation.

Choosing a caching layer:

- Module-scope `Map`: for data that rarely changes and can tolerate staleness (pricing config, feature flags).
- Redis: for data that must be shared across serverless instances and requires atomic operations (rate limits, session data).
- TanStack Query: for client-side server state caching with automatic revalidation.

Fail-Safe Design
The rate limiting system is designed to fail open — if Redis is unavailable, requests are allowed through with a warning. This prevents Redis outages from blocking legitimate user traffic:
```
Redis Available?
    |
    ├── Yes → Normal rate limiting
    |         (check limits, return success/failure)
    |
    └── No  → Fail-open mode
              |--- Log warning: "Rate limiting disabled - Redis not available"
              |--- Return success with max limits
              |--- Request proceeds normally
```
When Redis is unavailable, all rate limits are disabled. This is intentional — blocking legitimate requests is worse than temporarily allowing extra requests. Monitor your Redis connection status in production to detect outages early.
This pattern is applied consistently across both the API rate limiter and the AI rate limiter. Individual Redis call failures are also caught and handled with fail-open behavior.
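Per-call fail-open can be sketched as a small wrapper that converts any Redis error into an "allow" result. Names and shapes are illustrative, not Kit's exact implementation:

```typescript
// Sketch: any error from the Redis-backed check is caught and turned
// into a permissive result instead of blocking the request.
type LimitResult = { success: boolean; limit: number; remaining: number; reset: number }

async function failOpen(
  check: () => Promise<LimitResult>, // the real Redis-backed check
  limit: number
): Promise<LimitResult> {
  try {
    return await check()
  } catch (err) {
    console.warn('Rate limit check failed, failing open:', err)
    return { success: true, limit, remaining: limit, reset: Date.now() }
  }
}
```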
Debugging Tools
Kit includes two scripts for inspecting and resetting Redis rate limit data:
| Script | Command | Purpose |
|---|---|---|
| `scripts/inspect-redis-keys.ts` | `npx tsx scripts/inspect-redis-keys.ts` | List all rate limit keys in Redis with their values and TTLs |
| `scripts/reset-rate-limits.ts` | `npx tsx scripts/reset-rate-limits.ts` | Reset all rate limit counters (useful during development) |
Both scripts connect to your Upstash Redis instance using the same environment variables.
If you hit a rate limit during development, run `npx tsx scripts/reset-rate-limits.ts` to clear all counters. You can also use the `resetAPIRateLimit()` function programmatically for specific categories.

Redis Key Patterns
Rate limit keys follow a structured prefix pattern for easy identification:
| Prefix | System | Example Key |
|---|---|---|
| `@upstash/ratelimit/api/{category}/{type}` | API rate limiter | `@upstash/ratelimit/api/upload/user:user:abc123` |
| `@upstash/ratelimit/ai/global` | AI global burst | `@upstash/ratelimit/ai/global:user:abc123` |
| `@upstash/ratelimit/ai/{tier}` | AI tier limit | `@upstash/ratelimit/ai/pro:user:abc123` |
All keys are managed by the `@upstash/ratelimit` library and include automatic expiration based on the sliding window duration.

Environment Variables
| Variable | Required | Default | Purpose |
|---|---|---|---|
| `UPSTASH_REDIS_REST_URL` | No | — | Upstash Redis REST API URL |
| `UPSTASH_REDIS_REST_TOKEN` | No | — | Upstash Redis REST API token |
| `AI_RATE_LIMIT_WINDOW` | No | 10 | Global burst window in seconds |
| `AI_RATE_LIMIT_MAX_REQUESTS` | No | 10 | Global burst max requests per window |
| `AI_FREE_TIER_REQUESTS` | No | 20 | Free tier monthly AI request limit |
| `AI_BASIC_TIER_REQUESTS` | No | 100 | Basic tier monthly AI request limit |
| `AI_PRO_TIER_REQUESTS` | No | 1000 | Pro tier monthly AI request limit |
| `AI_ENTERPRISE_TIER_REQUESTS` | No | 10000 | Enterprise tier monthly AI request limit |
Key Files
| File | Purpose |
|---|---|
| `apps/boilerplate/src/lib/security/api-rate-limiter.ts` | Category-based API rate limiting (upload, email, contact, etc.) |
| `apps/boilerplate/src/lib/security/rate-limit-middleware.ts` | `withRateLimit()` middleware factory for API routes |
| `apps/boilerplate/src/lib/ai/rate-limiter.ts` | AI-specific rate limiting (tier-based, global burst) |
| `apps/boilerplate/scripts/inspect-redis-keys.ts` | Debug script to inspect Redis keys and TTLs |
| `apps/boilerplate/scripts/reset-rate-limits.ts` | Debug script to reset all rate limit counters |