AI Integration Overview

Multi-provider AI system with OpenAI, Anthropic, Google, and xAI — architecture, feature flags, and directory structure

Kit ships with a production-ready AI system that supports four LLM providers (Anthropic, OpenAI, Google, xAI) through a unified Strategy Pattern interface. The system includes two chat modes, streaming SSE responses, pgvector-powered RAG, and a three-layer cost management system.
This page covers the architecture and core concepts. For provider setup, see AI Providers. For the chat system, see Chat System. For knowledge base search, see RAG System. For rate limiting and credits, see Cost Management.

How It Works

Every AI request in Kit flows through the same pipeline — from the React hook to the provider and back:
User types message
    |
    v
React Hook (useAIChat / useAICompletion)
    |--- Manages message history
    |--- Handles streaming state
    |--- Triggers credit animation
    |
    v
API Route (/api/ai/stream or /api/ai/chat)
    |--- 1. Feature guard (is chat mode enabled?)
    |--- 2. Authentication (Clerk → DB user)
    |--- 3. Rate limit check (global burst + credit balance)
    |--- 4. Credit deduction (BEFORE processing)
    |--- 5. Zod request validation
    |
    v
AI Service (ai-service.ts)
    |--- Resolves model aliases
    |--- Creates provider instance (cached)
    |--- Routes to correct provider
    |
    v
Provider (OpenAI / Anthropic / Google / xAI)
    |--- Sends request to provider API
    |--- Handles retries with exponential backoff
    |--- Streams response chunks via SSE
    |
    v
Response flows back
    |--- Usage tracked to database (non-blocking)
    |--- Credit balance invalidated in TanStack Query cache
    |--- Message displayed in chat UI

Provider Architecture

Kit uses the Strategy Pattern for AI providers. Each provider extends the same BaseProvider abstract class, so switching providers requires zero code changes — only an environment variable update.
The system auto-detects which provider to use based on available API keys:
src/lib/ai/config.ts — Auto-Detection
export function getActiveProvider(): AIProvider | null {
  // If provider is explicitly set, use it (AI_API_KEY can serve as its key)
  if (aiConfig.AI_PROVIDER) {
    return aiConfig.AI_PROVIDER
  }

  // Auto-detect based on available API keys (Anthropic preferred)
  if (aiConfig.ANTHROPIC_API_KEY) return 'anthropic'
  if (aiConfig.OPENAI_API_KEY) return 'openai'
  if (aiConfig.GOOGLE_AI_API_KEY) return 'google'
  if (aiConfig.XAI_API_KEY) return 'xai'

  return null
}
Each provider has a preconfigured default model optimized for cost-efficiency:
src/lib/ai/config.ts — Default Models
export const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}
| Provider | Default Model | Context Window | Best For |
| --- | --- | --- | --- |
| Anthropic | claude-haiku-4-5 | 200K tokens (1M beta) | Primary — nuanced reasoning, long context |
| OpenAI | gpt-5-nano | 400K tokens | General purpose, RAG embeddings |
| Google | gemini-2.5-flash | 1M tokens | Large documents, cost efficiency |
| xAI | grok-4-1-fast-reasoning | 2M tokens | Real-time data, conversational |
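
The Strategy Pattern described above can be made concrete with a small sketch: a base class owning the shared retry-with-backoff loop, and concrete providers supplying only the transport. The names below (SketchBaseProvider, FlakyEchoProvider, send) are hypothetical; Kit's real abstract class lives in src/lib/ai/providers/base-provider.ts.

```typescript
// Hypothetical Strategy Pattern sketch. Kit's real abstract class is
// src/lib/ai/providers/base-provider.ts; names and signatures here are
// illustrative only.

abstract class SketchBaseProvider {
  // Each concrete strategy supplies only the provider-specific transport.
  protected abstract send(prompt: string): Promise<string>

  // Retry with exponential backoff lives once, in the base class.
  async chat(prompt: string, maxRetries = 3): Promise<string> {
    let delayMs = 100
    for (let attempt = 1; ; attempt++) {
      try {
        return await this.send(prompt)
      } catch (err) {
        if (attempt >= maxRetries) throw err
        await new Promise((resolve) => setTimeout(resolve, delayMs))
        delayMs *= 2 // 100ms, 200ms, 400ms, ...
      }
    }
  }
}

// A fake provider that fails a fixed number of times, then succeeds,
// to exercise the retry path without any network calls.
class FlakyEchoProvider extends SketchBaseProvider {
  constructor(private failuresBeforeSuccess: number) { super() }
  protected async send(prompt: string): Promise<string> {
    if (this.failuresBeforeSuccess-- > 0) throw new Error('transient error')
    return `echo: ${prompt}`
  }
}
```

Because callers only ever see chat(), swapping one provider class for another is invisible to the rest of the pipeline, which is why changing an environment variable is the only switch required.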

Two Chat Modes

Kit provides two distinct chat experiences, each with its own route, API, and UI:
| Aspect | LLM Chat | RAG Chat |
| --- | --- | --- |
| Route | /dashboard/chat-llm | /dashboard/chat-rag |
| API | /api/ai/stream, /api/ai/chat | /api/ai/rag/ask |
| Hook | useAIChat() | Custom RAG hook |
| Context | Direct LLM conversation | Knowledge base + LLM |
| Feature Flag | NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | NEXT_PUBLIC_AI_RAG_CHAT_ENABLED |
| Token Usage | Full conversation history | ~3-5K tokens (RAG context) |
| Best For | Open-ended conversation, coding help | Product support, FAQ |
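
Both chat modes stream responses over SSE. As a rough illustration of the wire format, a minimal data:-line parser might look like the sketch below; Kit's production parser (src/lib/ai/sse-parser.ts) also handles errors and partial chunks, and the '[DONE]' sentinel here is a common convention assumed for the example.

```typescript
// Minimal sketch of extracting "data:" payloads from an SSE stream chunk.
// Kit's real parser (src/lib/ai/sse-parser.ts) also handles errors and
// multi-line events; this version is illustrative only.
function parseSSEChunk(chunk: string): string[] {
  const events: string[] = []
  for (const line of chunk.split('\n')) {
    if (line.startsWith('data: ')) {
      const payload = line.slice('data: '.length)
      if (payload !== '[DONE]') events.push(payload) // '[DONE]' end sentinel is an assumption
    }
  }
  return events
}
```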

Feature Flags

Seven environment variables control which AI features are available. All default to true (enabled):
src/lib/ai/feature-flags.ts — Feature Configuration
export const AI_CHAT_FEATURES = {
  /**
   * RAG Chat (Modern UI)
   * Routes: /dashboard/chat-rag, /api/ai/rag/*
   * Features: Modern chat UI, Knowledge Base integration, Source Attribution
   */
  ragChat: process.env.NEXT_PUBLIC_AI_RAG_CHAT_ENABLED !== 'false',

  /**
   * LLM Chat (Direct Chat)
   * Routes: /dashboard/chat-llm, /api/ai/chat, /api/ai/stream
   * Features: Modern chat UI, Direct LLM conversation, Streaming
   */
  llmChat: process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false',

  /**
   * Vision Chat (Image Analysis in LLM Chat)
   * Extends LLM Chat with image upload and analysis capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, Paste, File picker, Base64 image transport
   */
  visionChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_VISION_ENABLED !== 'false',

  /**
   * PDF Chat (Document Analysis in LLM Chat)
   * Extends LLM Chat with PDF upload and text extraction capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, File picker, server-side text extraction, all providers
   */
  pdfChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_PDF_CHAT_ENABLED !== 'false',

  /**
   * Audio Input (Speech-to-Text in LLM Chat)
   * Extends LLM Chat with microphone recording and Whisper transcription.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: MediaRecorder, Whisper STT, editable transcript in input field
   */
  audioInput:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED !== 'false',

  /**
   * Image Generation (Text-to-Image)
   * Routes: /dashboard/image-gen, /api/ai/image-gen
   * Features: GPT Image models, multiple sizes/qualities/formats, transparent backgrounds
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  imageGen: process.env.NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED !== 'false',

  /**
   * Content Generator (Template-based Text Generation)
   * Routes: /dashboard/content, /api/ai/generate-content
   * Features: 5 templates (Email, Product, Blog, Social, Marketing), tone/language/length controls, streaming output
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  contentGen: process.env.NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED !== 'false',
} as const
| Variable | Default | Controls |
| --- | --- | --- |
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | true | RAG Chat on /dashboard/chat-rag |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | true | LLM Chat on /dashboard/chat-llm |
| NEXT_PUBLIC_AI_VISION_ENABLED | true | Image analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | true | Voice input via speech-to-text in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | true | PDF analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | true | Image Generation on /dashboard/image-gen |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | true | Content Generator on /dashboard/content |
When Vision Chat is enabled, users can attach images to LLM Chat messages via drag & drop, clipboard paste, or file picker. Images are sent as ContentPart[] (Base64 data URIs) to /api/ai/stream, which auto-selects the image_analysis credit operation (30 credits). See Chat System for details.
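
As a sketch of the Base64 transport just described, a mixed text + image message might be assembled like this; the real ContentPart type in src/lib/ai/types.ts may use different field names, so the shape below is an assumption.

```typescript
// Sketch of assembling a mixed text + image message. The real ContentPart
// type in src/lib/ai/types.ts may use different field names; this shape is
// an assumption for illustration.

type SketchContentPart =
  | { type: 'text'; text: string }
  | { type: 'image'; dataUri: string } // Base64 data URI, as described above

function imagePart(mimeType: string, base64: string): SketchContentPart {
  return { type: 'image', dataUri: `data:${mimeType};base64,${base64}` }
}

const parts: SketchContentPart[] = [
  { type: 'text', text: 'What is in this screenshot?' },
  imagePart('image/png', 'iVBORw0KGgo='), // truncated sample payload
]
```
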
When Audio Input is enabled, a microphone button appears in the LLM Chat input area. Users can record voice messages (up to 120 seconds) which are transcribed via the Whisper API at /api/ai/speech-to-text (20 credits per transcription). The transcribed text is inserted into the chat input field. See Chat System for details.
When Image Generation is enabled, the /dashboard/image-gen route provides a text-to-image interface using OpenAI's GPT Image models (gpt-image-1, gpt-image-1.5, gpt-image-1-mini). Users can configure size, quality, format, and background transparency. Generated images are stored in session history (up to 10 entries). Unlike chat features, Image Generation is a standalone feature — it does NOT require LLM Chat to be enabled.
When Content Generator is enabled, the /dashboard/content route provides a template-based text generation interface with five templates (email, product description, blog outline, social media, marketing copy). Users can configure tone, language, and length. The generator uses SSE streaming to deliver results progressively. Like Image Generation, the Content Generator is a standalone feature — it does NOT require LLM Chat to be enabled.
Feature flags are checked at two levels:
  1. Page level: shouldShowRAGChat() / shouldShowLLMChat() / shouldShowImageGen() / shouldShowContentGen() guard functions call notFound() if disabled
  2. API level: guardRAGChat() / guardLLMChat() / guardAudioInput() / guardImageGen() / guardContentGen() return 404 responses for disabled features
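
The guard names above come from Kit; the bodies below are simplified stand-ins showing the two-level pattern, assuming a Node runtime where process.env is available.

```typescript
// Simplified stand-ins for Kit's two-level guards. The function names come
// from the docs above; the bodies here are illustrative, not the real code.

const llmChatEnabled = () => process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false'

// Page level: the page calls notFound() when this returns false
function shouldShowLLMChat(): boolean {
  return llmChatEnabled()
}

// API level: a 404-shaped result for disabled features, null to proceed
type GuardResponse = { status: number; body: string }
function guardLLMChat(): GuardResponse | null {
  if (!llmChatEnabled()) return { status: 404, body: 'Not found' }
  return null
}
```

Returning 404 rather than 403 means disabled features are indistinguishable from routes that never existed.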

Directory Structure

All AI-related code lives in apps/boilerplate/src/lib/ai/ with API routes in apps/boilerplate/src/app/api/ai/:
apps/boilerplate/src/
├── lib/
│   └── ai/
│       ├── config.ts            # Central config, models, rate limits, env validation
│       ├── types.ts             # Shared TypeScript types (Message, Provider, etc.)
│       ├── feature-flags.ts     # AI_CHAT_FEATURES, guard functions
│       ├── route-guards.ts      # API + page guards for feature flags
│       ├── ai-service.ts        # High-level service (wraps provider factory)
│       ├── provider-factory.ts  # Creates/caches provider instances
│       ├── providers/
│       │   ├── base-provider.ts # Abstract class with retry logic
│       │   ├── openai.ts        # OpenAI implementation
│       │   ├── anthropic.ts     # Anthropic implementation
│       │   ├── google.ts        # Google AI implementation
│       │   └── xai.ts           # xAI implementation
│       ├── rag-service.ts       # RAG pipeline (search → context → answer)
│       ├── rag-search.ts        # pgvector similarity search
│       ├── rate-limiter.ts      # Global burst + tier-based limiting
│       ├── usage-tracker.ts     # Token/cost tracking to database
│       ├── image-gen/
│       │   ├── config.ts        # Model configs, sizes, quality options
│       │   ├── service.ts       # OpenAI image generation service
│       │   └── types.ts         # Image generation TypeScript types
│       ├── content-gen/
│       │   ├── config.ts        # Template definitions, prompt builder, UI labels
│       │   ├── service.ts       # Content generation AI service wrapper
│       │   └── types.ts         # Content generator TypeScript types
│       ├── sse-parser.ts        # Shared SSE stream parser with error handling
│       ├── quick-prompts.ts     # Configurable suggestion buttons
│       └── errors.ts            # Error class hierarchy
├── hooks/
│   ├── use-ai.ts               # React hooks (useAIChat, useAICompletion, etc.)
│   ├── use-image-gen.ts        # Image generation hook with history
│   ├── use-content-generator.ts # Content generator hook with SSE streaming
│   └── use-audio-recorder.ts   # Audio recording hook (MediaRecorder API)
├── app/
│   └── api/
│       └── ai/
│           ├── stream/route.ts          # POST — SSE streaming endpoint
│           ├── chat/route.ts            # POST — Synchronous chat endpoint
│           ├── speech-to-text/route.ts  # POST — Audio transcription (Whisper)
│           ├── image-gen/route.ts       # POST — Image generation endpoint
│           ├── generate-content/route.ts # POST — Content generation endpoint
│           ├── usage/route.ts           # GET — Usage statistics endpoint
│           └── rag/
│               ├── ask/route.ts         # POST — RAG question answering
│               └── conversations/       # CRUD for conversation history
└── components/
    └── ai/
        ├── chat/                # Chat UI components (12 components)
        ├── image-gen/           # Image generation UI (4 components)
        └── content-gen/         # Content generator UI (5 components)

Environment Variables

| Variable | Required | Purpose |
| --- | --- | --- |
| AI_PROVIDER | No | Force a specific provider (openai, anthropic, google, xai) |
| AI_MODEL | No | Override default model for the active provider |
| AI_API_KEY | No | Provider-neutral API key — works with any AI_PROVIDER (provider-specific keys take priority) |
| OPENAI_API_KEY | Yes* | OpenAI API key (also required for RAG embeddings) |
| ANTHROPIC_API_KEY | Yes* | Anthropic API key |
| GOOGLE_AI_API_KEY | Yes* | Google AI API key |
| XAI_API_KEY | Yes* | xAI API key |
| AI_EMBEDDING_MODEL | No | Embedding model for RAG (default: text-embedding-3-small) |
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | No | Enable RAG Chat (default: true) |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | No | Enable LLM Chat (default: true) |
| NEXT_PUBLIC_AI_VISION_ENABLED | No | Enable image analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | No | Enable voice input in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | No | Enable PDF analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | No | Enable Image Generation (default: true) |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | No | Enable Content Generator (default: true) |
| UPSTASH_REDIS_REST_URL | No | Redis URL for rate limiting |
| UPSTASH_REDIS_REST_TOKEN | No | Redis token for rate limiting |
*At least one provider API key is required. The system auto-detects the provider from available keys.
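
One nuance from the table worth spelling out is key precedence: a provider-specific key beats the provider-neutral AI_API_KEY. A hypothetical sketch of that rule (the real resolution lives in src/lib/ai/config.ts):

```typescript
// Hypothetical sketch of the key-resolution rule from the table above:
// a provider-specific key takes priority over the provider-neutral
// AI_API_KEY. The real logic lives in src/lib/ai/config.ts.

type Provider = 'openai' | 'anthropic' | 'google' | 'xai'

interface KeyEnv {
  AI_API_KEY?: string
  OPENAI_API_KEY?: string
  ANTHROPIC_API_KEY?: string
  GOOGLE_AI_API_KEY?: string
  XAI_API_KEY?: string
}

function resolveApiKey(provider: Provider, env: KeyEnv): string | undefined {
  const specific: Record<Provider, string | undefined> = {
    openai: env.OPENAI_API_KEY,
    anthropic: env.ANTHROPIC_API_KEY,
    google: env.GOOGLE_AI_API_KEY,
    xai: env.XAI_API_KEY,
  }
  // Provider-specific keys win; AI_API_KEY is the fallback for any provider
  return specific[provider] ?? env.AI_API_KEY
}
```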

Key Files

| File | Purpose |
| --- | --- |
| apps/boilerplate/src/lib/ai/config.ts | Central configuration — models, rate limits, provider detection, env validation |
| apps/boilerplate/src/lib/ai/feature-flags.ts | Feature flag definitions and guard functions |
| apps/boilerplate/src/lib/ai/ai-service.ts | High-level AI service (wraps provider factory, calculates costs) |
| apps/boilerplate/src/lib/ai/provider-factory.ts | Creates, caches, and selects provider instances |
| apps/boilerplate/src/lib/ai/rag-service.ts | RAG pipeline — query rewriting, search, context assembly, answer generation |
| apps/boilerplate/src/lib/ai/rag-search.ts | pgvector similarity search with OpenAI embeddings |
| apps/boilerplate/src/lib/ai/rate-limiter.ts | Two-layer rate limiting (global burst, tier-based) |
| apps/boilerplate/src/lib/credits/credit-costs.ts | Per-operation credit costs (21 operation types) |
| apps/boilerplate/src/hooks/use-ai.ts | React hooks — useAIChat, useAICompletion, useAIQuery, useAIStream |
| apps/boilerplate/src/app/api/ai/stream/route.ts | SSE streaming endpoint with full cost management pipeline |