Kit ships with a production-ready AI system that supports four LLM providers (Anthropic, OpenAI, Google, xAI) through a unified Strategy Pattern interface. The system includes two chat modes, streaming SSE responses, pgvector-powered RAG, and a three-layer cost management system.
This page covers the architecture and core concepts. For provider setup, see AI Providers. For the chat system, see Chat System. For knowledge base search, see RAG System. For rate limiting and credits, see Cost Management.
How It Works
Every AI request in Kit flows through the same pipeline — from the React hook to the provider and back:
User types message
|
v
React Hook (useAIChat / useAICompletion)
|--- Manages message history
|--- Handles streaming state
|--- Triggers credit animation
|
v
API Route (/api/ai/stream or /api/ai/chat)
|--- 1. Feature guard (is chat mode enabled?)
|--- 2. Authentication (Clerk → DB user)
|--- 3. Rate limit check (global burst + credit balance)
|--- 4. Credit deduction (BEFORE processing)
|--- 5. Zod request validation
|
v
AI Service (ai-service.ts)
|--- Resolves model aliases
|--- Creates provider instance (cached)
|--- Routes to correct provider
|
v
Provider (OpenAI / Anthropic / Google / xAI)
|--- Sends request to provider API
|--- Handles retries with exponential backoff
|--- Streams response chunks via SSE
|
v
Response flows back
|--- Usage tracked to database (non-blocking)
|--- Credit balance invalidated in TanStack Query cache
|--- Message displayed in chat UI
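The guard sequence in the API route step can be sketched as a short-circuiting chain. This is an illustrative sketch, not Kit's actual implementation; the names RequestContext, Guard, and runGuards are assumptions:

```typescript
// Hypothetical sketch of the checks an AI API route runs before any
// tokens are generated. Each guard either passes or fails with a status.
type GuardResult = { ok: true } | { ok: false; status: number; error: string }

interface RequestContext {
  featureEnabled: boolean // step 1: feature flag
  userId: string | null   // step 2: authenticated user
  credits: number         // step 3/4: current balance
  cost: number            // credits this operation will deduct
}

type Guard = (ctx: RequestContext) => GuardResult

const guards: Guard[] = [
  (ctx) =>
    ctx.featureEnabled ? { ok: true } : { ok: false, status: 404, error: 'Feature disabled' },
  (ctx) =>
    ctx.userId ? { ok: true } : { ok: false, status: 401, error: 'Not authenticated' },
  (ctx) =>
    ctx.credits >= ctx.cost
      ? { ok: true }
      : { ok: false, status: 429, error: 'Insufficient credits' },
]

// Run each guard in order; the first failure short-circuits the pipeline,
// mirroring the route steps above (credits are deducted before the LLM call).
export async function runGuards(ctx: RequestContext): Promise<GuardResult> {
  for (const guard of guards) {
    const result = guard(ctx)
    if (!result.ok) return result
  }
  return { ok: true }
}
```

The ordering matters: cheap checks (flags, auth) run before anything that touches billing state.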
Provider Architecture
Kit uses the Strategy Pattern for AI providers. Each provider implements the same
BaseProvider abstract class, so switching providers requires zero code changes — only an environment variable update. The system auto-detects which provider to use based on available API keys:
src/lib/ai/config.ts — Auto-Detection
export function getActiveProvider(): AIProvider | null {
  // If provider is explicitly set, use it (AI_API_KEY can serve as its key)
  if (aiConfig.AI_PROVIDER) {
    return aiConfig.AI_PROVIDER
  }

  // Auto-detect based on available API keys (Anthropic preferred)
  if (aiConfig.ANTHROPIC_API_KEY) return 'anthropic'
  if (aiConfig.OPENAI_API_KEY) return 'openai'
  if (aiConfig.GOOGLE_AI_API_KEY) return 'google'
  if (aiConfig.XAI_API_KEY) return 'xai'

  return null
}
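The Strategy Pattern itself can be sketched as a minimal, hypothetical BaseProvider with shared retry logic plus a concrete strategy that only implements the provider-specific call. Method names and the EchoProvider class are illustrative, not Kit's actual API:

```typescript
// Sketch of the Strategy Pattern described above. The real BaseProvider in
// src/lib/ai/providers/base-provider.ts also handles streaming; shapes here
// are assumptions for illustration.
interface ChatMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
}

abstract class BaseProvider {
  constructor(protected apiKey: string, protected model: string) {}

  // Each concrete provider implements only the provider-specific API call.
  abstract complete(messages: ChatMessage[]): Promise<string>

  // Shared retry helper with exponential backoff (100ms, 200ms, 400ms, ...)
  protected async withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
    let lastError: unknown
    for (let i = 0; i < attempts; i++) {
      try {
        return await fn()
      } catch (err) {
        lastError = err
        if (i < attempts - 1) await new Promise((r) => setTimeout(r, 2 ** i * 100))
      }
    }
    throw lastError
  }
}

// A stand-in strategy: real subclasses (openai.ts, anthropic.ts, ...) would
// call their provider's SDK inside withRetry instead of echoing.
class EchoProvider extends BaseProvider {
  async complete(messages: ChatMessage[]): Promise<string> {
    return this.withRetry(async () => `echo: ${messages[messages.length - 1]?.content}`)
  }
}
```

Because callers only see the BaseProvider interface, swapping EchoProvider for a real provider changes no calling code, which is the point of the pattern.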
Each provider has a preconfigured default model optimized for cost-efficiency:
src/lib/ai/config.ts — Default Models
export const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}
| Provider | Default Model | Context Window | Best For |
|---|---|---|---|
| Anthropic | claude-haiku-4-5 | 200K tokens (1M beta) | Primary — nuanced reasoning, long context |
| OpenAI | gpt-5-nano | 400K tokens | General purpose, RAG embeddings |
| Google | gemini-2.5-flash | 1M tokens | Large documents, cost efficiency |
| xAI | grok-4-1-fast-reasoning | 2M tokens | Real-time data, conversational |
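As a sketch of how a model override might resolve against these defaults (resolveModel is a hypothetical name; in Kit, the AI_MODEL environment variable overrides the active provider's default):

```typescript
// Illustrative resolution of the active model: an explicit override wins,
// otherwise the provider's cost-optimized default from DEFAULT_MODELS is used.
type AIProvider = 'openai' | 'anthropic' | 'google' | 'xai'

const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}

export function resolveModel(provider: AIProvider, override?: string): string {
  return override ?? DEFAULT_MODELS[provider]
}
```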
Kit uses the Vercel AI SDK (ai v4.3.x) for embeddings and the useCompletion hook. The streaming chat (useAIChat) uses a custom SSE parser that supports five response formats across all providers — this provides more robust multi-provider support than the SDK alone.
Two Chat Modes
Kit provides two distinct chat experiences, each with its own route, API, and UI:
| Aspect | LLM Chat | RAG Chat |
|---|---|---|
| Route | /dashboard/chat-llm | /dashboard/chat-rag |
| API | /api/ai/stream, /api/ai/chat | /api/ai/rag/ask |
| Hook | useAIChat() | Custom RAG hook |
| Context | Direct LLM conversation | Knowledge base + LLM |
| Feature Flag | NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | NEXT_PUBLIC_AI_RAG_CHAT_ENABLED |
| Token Usage | Full conversation history | ~3-5K tokens (RAG context) |
| Best For | Open-ended conversation, coding help | Product support, FAQ |
Both chat modes are enabled by default. Set the corresponding environment variable to false to disable either mode. The navigation automatically hides disabled chat modes.
Feature Flags
Seven environment variables control which AI features are available. All default to true (enabled):
src/lib/ai/feature-flags.ts — Feature Configuration
export const AI_CHAT_FEATURES = {
  /**
   * RAG Chat (Modern UI)
   * Routes: /dashboard/chat-rag, /api/ai/rag/*
   * Features: Modern chat UI, Knowledge Base integration, Source Attribution
   */
  ragChat: process.env.NEXT_PUBLIC_AI_RAG_CHAT_ENABLED !== 'false',

  /**
   * LLM Chat (Direct Chat)
   * Routes: /dashboard/chat-llm, /api/ai/chat, /api/ai/stream
   * Features: Modern chat UI, Direct LLM conversation, Streaming
   */
  llmChat: process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false',

  /**
   * Vision Chat (Image Analysis in LLM Chat)
   * Extends LLM Chat with image upload and analysis capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, Paste, File picker, Base64 image transport
   */
  visionChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_VISION_ENABLED !== 'false',

  /**
   * PDF Chat (Document Analysis in LLM Chat)
   * Extends LLM Chat with PDF upload and text extraction capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, File picker, server-side text extraction, all providers
   */
  pdfChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_PDF_CHAT_ENABLED !== 'false',

  /**
   * Audio Input (Speech-to-Text in LLM Chat)
   * Extends LLM Chat with microphone recording and Whisper transcription.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: MediaRecorder, Whisper STT, editable transcript in input field
   */
  audioInput:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED !== 'false',

  /**
   * Image Generation (Text-to-Image)
   * Routes: /dashboard/image-gen, /api/ai/image-gen
   * Features: GPT Image models, multiple sizes/qualities/formats, transparent backgrounds
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  imageGen: process.env.NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED !== 'false',

  /**
   * Content Generator (Template-based Text Generation)
   * Routes: /dashboard/content, /api/ai/generate-content
   * Features: 5 templates (Email, Product, Blog, Social, Marketing), tone/language/length controls, streaming output
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  contentGen: process.env.NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED !== 'false',
} as const
| Variable | Default | Controls |
|---|---|---|
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | true | RAG Chat on /dashboard/chat-rag |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | true | LLM Chat on /dashboard/chat-llm |
| NEXT_PUBLIC_AI_VISION_ENABLED | true | Image analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | true | Voice input via speech-to-text in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | true | PDF analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | true | Image Generation on /dashboard/image-gen |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | true | Content Generator on /dashboard/content |
When Vision Chat is enabled, users can attach images to LLM Chat messages via drag & drop, clipboard paste, or file picker. Images are sent as ContentPart[] (Base64 data URIs) to /api/ai/stream, which auto-selects the image_analysis credit operation (30 credits). See Chat System for details.
When Audio Input is enabled, a microphone button appears in the LLM Chat input area. Users can record voice messages (up to 120 seconds), which are transcribed via the Whisper API at /api/ai/speech-to-text (20 credits per transcription). The transcribed text is inserted into the chat input field. See Chat System for details.
When Image Generation is enabled, the /dashboard/image-gen route provides a text-to-image interface using OpenAI's GPT Image models (gpt-image-1, gpt-image-1.5, gpt-image-1-mini). Users can configure size, quality, format, and background transparency. Generated images are stored in session history (up to 10 entries). Unlike the chat features, Image Generation is standalone — it does NOT require LLM Chat to be enabled.
When Content Generator is enabled, the /dashboard/content route provides a template-based text generation interface with five templates (email, product description, blog outline, social media, marketing copy). Users can configure tone, language, and length. The generator uses SSE streaming to deliver results progressively. Like Image Generation, the Content Generator is standalone — it does NOT require LLM Chat to be enabled.
Feature flags are checked at two levels:
- Page level — shouldShowRAGChat() / shouldShowLLMChat() / shouldShowImageGen() / shouldShowContentGen() guard functions call notFound() if disabled
- API level — guardRAGChat() / guardLLMChat() / guardAudioInput() / guardImageGen() / guardContentGen() return 404 responses for disabled features
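A minimal sketch of the API-level guard, assuming it reads the same flag expression shown in the feature configuration above (the exact signature of guardLLMChat() in route-guards.ts may differ):

```typescript
// Illustrative API-level guard: return a 404 Response when the feature is
// disabled, or null to signal the route handler should continue.
const llmChatEnabled = process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false'

export function guardLLMChat(): Response | null {
  if (!llmChatEnabled) {
    return new Response(JSON.stringify({ error: 'Not found' }), {
      status: 404,
      headers: { 'Content-Type': 'application/json' },
    })
  }
  return null // feature enabled; the route handler proceeds
}
```

Returning 404 (rather than 403) for disabled features means a disabled route is indistinguishable from a nonexistent one.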
Directory Structure
All AI-related code lives in apps/boilerplate/src/lib/ai/ with API routes in apps/boilerplate/src/app/api/ai/:
apps/boilerplate/src/
├── lib/
│ └── ai/
│ ├── config.ts # Central config, models, rate limits, env validation
│ ├── types.ts # Shared TypeScript types (Message, Provider, etc.)
│ ├── feature-flags.ts # AI_CHAT_FEATURES, guard functions
│ ├── route-guards.ts # API + page guards for feature flags
│ ├── ai-service.ts # High-level service (wraps provider factory)
│ ├── provider-factory.ts # Creates/caches provider instances
│ ├── providers/
│ │ ├── base-provider.ts # Abstract class with retry logic
│ │ ├── openai.ts # OpenAI implementation
│ │ ├── anthropic.ts # Anthropic implementation
│ │ ├── google.ts # Google AI implementation
│ │ └── xai.ts # xAI implementation
│ ├── rag-service.ts # RAG pipeline (search → context → answer)
│ ├── rag-search.ts # pgvector similarity search
│ ├── rate-limiter.ts # Global burst + tier-based limiting
│ ├── usage-tracker.ts # Token/cost tracking to database
│ ├── image-gen/
│ │ ├── config.ts # Model configs, sizes, quality options
│ │ ├── service.ts # OpenAI image generation service
│ │ └── types.ts # Image generation TypeScript types
│ ├── content-gen/
│ │ ├── config.ts # Template definitions, prompt builder, UI labels
│ │ ├── service.ts # Content generation AI service wrapper
│ │ └── types.ts # Content generator TypeScript types
│ ├── sse-parser.ts # Shared SSE stream parser with error handling
│ ├── quick-prompts.ts # Configurable suggestion buttons
│ └── errors.ts # Error class hierarchy
├── hooks/
│ ├── use-ai.ts # React hooks (useAIChat, useAICompletion, etc.)
│ ├── use-image-gen.ts # Image generation hook with history
│ ├── use-content-generator.ts # Content generator hook with SSE streaming
│ └── use-audio-recorder.ts # Audio recording hook (MediaRecorder API)
├── app/
│ └── api/
│ └── ai/
│ ├── stream/route.ts # POST — SSE streaming endpoint
│ ├── chat/route.ts # POST — Synchronous chat endpoint
│ ├── speech-to-text/route.ts # POST — Audio transcription (Whisper)
│ ├── image-gen/route.ts # POST — Image generation endpoint
│ ├── generate-content/route.ts # POST — Content generation endpoint
│ ├── usage/route.ts # GET — Usage statistics endpoint
│ └── rag/
│ ├── ask/route.ts # POST — RAG question answering
│ └── conversations/ # CRUD for conversation history
└── components/
└── ai/
├── chat/ # Chat UI components (12 components)
├── image-gen/ # Image generation UI (4 components)
└── content-gen/ # Content generator UI (5 components)
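The shared SSE parser (sse-parser.ts) listed above can be sketched roughly as a function that extracts text deltas from data: lines. The payload shapes handled here (an OpenAI-style delta and a plain text field) are assumptions; the real parser supports five provider formats:

```typescript
// Rough sketch of SSE parsing: split a chunk into `data:` payloads,
// skip the [DONE] sentinel, and pull out the text delta if present.
export function parseSSEChunk(chunk: string): string[] {
  const deltas: string[] = []
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data:')) continue
    const payload = line.slice(5).trim()
    if (payload === '[DONE]') continue
    try {
      const json = JSON.parse(payload)
      // Hypothetical shapes: OpenAI-style delta, or a flat text field
      const text = json.choices?.[0]?.delta?.content ?? json.text
      if (typeof text === 'string') deltas.push(text)
    } catch {
      // ignore malformed or partial lines (a real parser buffers these)
    }
  }
  return deltas
}
```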
Environment Variables
| Variable | Required | Purpose |
|---|---|---|
| AI_PROVIDER | No | Force a specific provider (openai, anthropic, google, xai) |
| AI_MODEL | No | Override default model for the active provider |
| AI_API_KEY | No | Provider-neutral API key — works with any AI_PROVIDER (provider-specific keys take priority) |
| OPENAI_API_KEY | Yes* | OpenAI API key (also required for RAG embeddings) |
| ANTHROPIC_API_KEY | Yes* | Anthropic API key |
| GOOGLE_AI_API_KEY | Yes* | Google AI API key |
| XAI_API_KEY | Yes* | xAI API key |
| AI_EMBEDDING_MODEL | No | Embedding model for RAG (default: text-embedding-3-small) |
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | No | Enable RAG Chat (default: true) |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | No | Enable LLM Chat (default: true) |
| NEXT_PUBLIC_AI_VISION_ENABLED | No | Enable image analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | No | Enable voice input in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | No | Enable PDF analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | No | Enable Image Generation (default: true) |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | No | Enable Content Generator (default: true) |
| UPSTASH_REDIS_REST_URL | No | Redis URL for rate limiting |
| UPSTASH_REDIS_REST_TOKEN | No | Redis token for rate limiting |
*At least one provider API key is required. The system auto-detects the provider from available keys.
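As an example, the minimal single-key setup described below might look like this in .env (key values are placeholders):

```shell
# One provider, one key
AI_PROVIDER=anthropic
AI_API_KEY=sk-ant-xxxxxxxx

# Only needed for RAG Chat (embeddings use OpenAI)
OPENAI_API_KEY=sk-xxxxxxxx
```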
To get AI working, you only need one API key. The simplest setup: set AI_PROVIDER and AI_API_KEY — one key for everything. Alternatively, use provider-specific keys like ANTHROPIC_API_KEY for Claude. For RAG Chat, also add OPENAI_API_KEY — OpenAI is required for embeddings (text-embedding-3-small). Add more provider keys later to enable fallback and provider selection.
Key Files
| File | Purpose |
|---|---|
| apps/boilerplate/src/lib/ai/config.ts | Central configuration — models, rate limits, provider detection, env validation |
| apps/boilerplate/src/lib/ai/feature-flags.ts | Feature flag definitions and guard functions |
| apps/boilerplate/src/lib/ai/ai-service.ts | High-level AI service (wraps provider factory, calculates costs) |
| apps/boilerplate/src/lib/ai/provider-factory.ts | Creates, caches, and selects provider instances |
| apps/boilerplate/src/lib/ai/rag-service.ts | RAG pipeline — query rewriting, search, context assembly, answer generation |
| apps/boilerplate/src/lib/ai/rag-search.ts | pgvector similarity search with OpenAI embeddings |
| apps/boilerplate/src/lib/ai/rate-limiter.ts | Two-layer rate limiting (global burst, tier-based) |
| apps/boilerplate/src/lib/credits/credit-costs.ts | Per-operation credit costs (21 operation types) |
| apps/boilerplate/src/hooks/use-ai.ts | React hooks — useAIChat, useAICompletion, useAIQuery, useAIStream |
| apps/boilerplate/src/app/api/ai/stream/route.ts | SSE streaming endpoint with full cost management pipeline |