Kit ships with a production-ready AI system that supports four LLM providers (Anthropic, OpenAI, Google, xAI) through a unified Strategy Pattern interface. The system includes two chat modes, streaming SSE responses, pgvector-powered RAG, and a three-layer cost management system.
This page covers the architecture and core concepts. For provider setup, see AI Providers. For the chat system, see Chat System. For knowledge base search, see RAG System. For rate limiting and credits, see Cost Management.
How It Works
Every AI request in Kit flows through the same pipeline — from the React hook to the provider and back:
User types message
|
v
React Hook (useAIChat / useAICompletion)
|--- Manages message history
|--- Handles streaming state
|--- Triggers credit animation
|
v
API Route (/api/ai/stream or /api/ai/chat)
|--- 1. Feature guard (is chat mode enabled?)
|--- 2. Authentication (Clerk → DB user)
|--- 3. Rate limit check (global burst + credit balance)
|--- 4. Credit deduction (BEFORE processing)
|--- 5. Zod request validation
|
v
AI Service (ai-service.ts)
|--- Resolves model aliases
|--- Creates provider instance (cached)
|--- Routes to correct provider
|
v
Provider (OpenAI / Anthropic / Google / xAI)
|--- Sends request to provider API
|--- Handles retries with exponential backoff
|--- Streams response chunks via SSE
|
v
Response flows back
|--- Usage tracked to database (non-blocking)
|--- Credit balance invalidated in TanStack Query cache
|--- Message displayed in chat UI
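The guard sequence in the API route step can be sketched as a short-circuiting chain. This is an illustrative sketch, not Kit's actual implementation; the names RequestContext, Guard, and runGuards are assumptions:

```typescript
// Hypothetical sketch of the checks an AI API route runs before any
// tokens are generated. Each guard either passes or fails with a status.
type GuardResult = { ok: true } | { ok: false; status: number; error: string }

interface RequestContext {
  featureEnabled: boolean // step 1: feature flag
  userId: string | null   // step 2: authenticated user
  credits: number         // step 3/4: current balance
  cost: number            // credits this operation will deduct
}

type Guard = (ctx: RequestContext) => GuardResult

const guards: Guard[] = [
  (ctx) =>
    ctx.featureEnabled ? { ok: true } : { ok: false, status: 404, error: 'Feature disabled' },
  (ctx) =>
    ctx.userId ? { ok: true } : { ok: false, status: 401, error: 'Not authenticated' },
  (ctx) =>
    ctx.credits >= ctx.cost
      ? { ok: true }
      : { ok: false, status: 429, error: 'Insufficient credits' },
]

// Run each guard in order; the first failure short-circuits the pipeline,
// mirroring the route steps above (credits are deducted before the LLM call).
export async function runGuards(ctx: RequestContext): Promise<GuardResult> {
  for (const guard of guards) {
    const result = guard(ctx)
    if (!result.ok) return result
  }
  return { ok: true }
}
```

The ordering matters: cheap checks (flags, auth) run before anything that touches billing state.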
Provider Architecture
Kit uses the Strategy Pattern for AI providers. Each provider implements the same
BaseProvider abstract class, so switching providers requires zero code changes — only an environment variable update. The system auto-detects which provider to use based on available API keys:
src/lib/ai/config.ts — Auto-Detection
export function getActiveProvider(): AIProvider | null {
  // If provider is explicitly set, use it (AI_API_KEY can serve as its key)
  if (aiConfig.AI_PROVIDER) {
    return aiConfig.AI_PROVIDER
  }

  // Auto-detect based on available API keys (Anthropic preferred)
  if (aiConfig.ANTHROPIC_API_KEY) return 'anthropic'
  if (aiConfig.OPENAI_API_KEY) return 'openai'
  if (aiConfig.GOOGLE_AI_API_KEY) return 'google'
  if (aiConfig.XAI_API_KEY) return 'xai'

  return null
}
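The Strategy Pattern itself can be sketched as a minimal, hypothetical BaseProvider with shared retry logic plus a concrete strategy that only implements the provider-specific call. Method names and the EchoProvider class are illustrative, not Kit's actual API:

```typescript
// Sketch of the Strategy Pattern described above. The real BaseProvider in
// src/lib/ai/providers/base-provider.ts also handles streaming; shapes here
// are assumptions for illustration.
interface ChatMessage {
  role: 'user' | 'assistant' | 'system'
  content: string
}

abstract class BaseProvider {
  constructor(protected apiKey: string, protected model: string) {}

  // Each concrete provider implements only the provider-specific API call.
  abstract complete(messages: ChatMessage[]): Promise<string>

  // Shared retry helper with exponential backoff (100ms, 200ms, 400ms, ...)
  protected async withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
    let lastError: unknown
    for (let i = 0; i < attempts; i++) {
      try {
        return await fn()
      } catch (err) {
        lastError = err
        if (i < attempts - 1) await new Promise((r) => setTimeout(r, 2 ** i * 100))
      }
    }
    throw lastError
  }
}

// A stand-in strategy: real subclasses (openai.ts, anthropic.ts, ...) would
// call their provider's SDK inside withRetry instead of echoing.
class EchoProvider extends BaseProvider {
  async complete(messages: ChatMessage[]): Promise<string> {
    return this.withRetry(async () => `echo: ${messages[messages.length - 1]?.content}`)
  }
}
```

Because callers only see the BaseProvider interface, swapping EchoProvider for a real provider changes no calling code, which is the point of the pattern.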
Each provider has a preconfigured default model optimized for cost-efficiency:
src/lib/ai/config.ts — Default Models
export const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}
| Provider | Default Model | Context Window | Best For |
|---|---|---|---|
| Anthropic | claude-haiku-4-5 | 200K tokens (1M beta) | Primary — nuanced reasoning, long context |
| OpenAI | gpt-5-nano | 400K tokens | General purpose, RAG embeddings |
| Google | gemini-2.5-flash | 1M tokens | Large documents, cost efficiency |
| xAI | grok-4-1-fast-reasoning | 2M tokens | Real-time data, conversational |
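As a sketch of how a model override might resolve against these defaults (resolveModel is a hypothetical name; in Kit, the AI_MODEL environment variable overrides the active provider's default):

```typescript
// Illustrative resolution of the active model: an explicit override wins,
// otherwise the provider's cost-optimized default from DEFAULT_MODELS is used.
type AIProvider = 'openai' | 'anthropic' | 'google' | 'xai'

const DEFAULT_MODELS: Record<AIProvider, string> = {
  openai: 'gpt-5-nano',
  anthropic: 'claude-haiku-4-5-20251001',
  google: 'gemini-2.5-flash',
  xai: 'grok-4-1-fast-reasoning',
}

export function resolveModel(provider: AIProvider, override?: string): string {
  return override ?? DEFAULT_MODELS[provider]
}
```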
Kit uses the Vercel AI SDK (ai v4.3.x) for embeddings and the useCompletion hook. The streaming chat (useAIChat) uses a custom SSE parser that supports five response formats across all providers — this provides more robust multi-provider support than the SDK alone.
Two Chat Modes
Kit provides two distinct chat experiences, each with its own route, API, and UI:
| Aspect | LLM Chat | RAG Chat |
|---|---|---|
| Route | /dashboard/chat-llm | /dashboard/chat-rag |
| API | /api/ai/stream, /api/ai/chat | /api/ai/rag/ask |
| Hook | useAIChat() | Custom RAG hook |
| Context | Direct LLM conversation | Knowledge base + LLM |
| Feature Flag | NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | NEXT_PUBLIC_AI_RAG_CHAT_ENABLED |
| Token Usage | Full conversation history | ~3-5K tokens (RAG context) |
| Best For | Open-ended conversation, coding help | Product support, FAQ |
Both chat modes are enabled by default. Set the corresponding environment variable to false to disable either mode. The navigation automatically hides disabled chat modes.
Feature Flags
Seven environment variables control which AI features are available. All default to true (enabled):
src/lib/ai/feature-flags.ts — Feature Configuration
export const AI_CHAT_FEATURES = {
  /**
   * RAG Chat (Modern UI)
   * Routes: /dashboard/chat-rag, /api/ai/rag/*
   * Features: Modern chat UI, Knowledge Base integration, Source Attribution
   */
  ragChat: process.env.NEXT_PUBLIC_AI_RAG_CHAT_ENABLED !== 'false',

  /**
   * LLM Chat (Direct Chat)
   * Routes: /dashboard/chat-llm, /api/ai/chat, /api/ai/stream
   * Features: Modern chat UI, Direct LLM conversation, Streaming
   */
  llmChat: process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false',

  /**
   * Vision Chat (Image Analysis in LLM Chat)
   * Extends LLM Chat with image upload and analysis capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, Paste, File picker, Base64 image transport
   */
  visionChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_VISION_ENABLED !== 'false',

  /**
   * PDF Chat (Document Analysis in LLM Chat)
   * Extends LLM Chat with PDF upload and text extraction capabilities.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: Drag & Drop, File picker, server-side text extraction, all providers
   */
  pdfChat:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_PDF_CHAT_ENABLED !== 'false',

  /**
   * Audio Input (Speech-to-Text in LLM Chat)
   * Extends LLM Chat with microphone recording and Whisper transcription.
   * Requires LLM Chat to be enabled. Only active when BOTH flags are true.
   * Features: MediaRecorder, Whisper STT, editable transcript in input field
   */
  audioInput:
    process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false' &&
    process.env.NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED !== 'false',

  /**
   * Image Generation (Text-to-Image)
   * Routes: /dashboard/image-gen, /api/ai/image-gen
   * Features: GPT Image models, multiple sizes/qualities/formats, transparent backgrounds
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  imageGen: process.env.NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED !== 'false',

  /**
   * Content Generator (Template-based Text Generation)
   * Routes: /dashboard/content, /api/ai/generate-content
   * Features: 5 templates (Email, Product, Blog, Social, Marketing), tone/language/length controls, streaming output
   * Standalone feature — does NOT require LLM Chat to be enabled.
   */
  contentGen: process.env.NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED !== 'false',
} as const
| Variable | Default | Controls |
|---|---|---|
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | true | RAG Chat on /dashboard/chat-rag |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | true | LLM Chat on /dashboard/chat-llm |
| NEXT_PUBLIC_AI_VISION_ENABLED | true | Image analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | true | Voice input via speech-to-text in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | true | PDF analysis in LLM Chat (requires LLM Chat enabled) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | true | Image Generation on /dashboard/image-gen |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | true | Content Generator on /dashboard/content |
When Vision Chat is enabled, users can attach images to LLM Chat messages via drag & drop, clipboard paste, or file picker. Images are sent as ContentPart[] (Base64 data URIs) to /api/ai/stream, which auto-selects the image_analysis credit operation (30 credits). See Chat System for details.
When Audio Input is enabled, a microphone button appears in the LLM Chat input area. Users can record voice messages (up to 120 seconds), which are transcribed via the Whisper API at /api/ai/speech-to-text (20 credits per transcription). The transcribed text is inserted into the chat input field. See Chat System for details.
When Image Generation is enabled, the /dashboard/image-gen route provides a text-to-image interface using OpenAI's GPT Image models (gpt-image-1, gpt-image-1.5, gpt-image-1-mini). Users can configure size, quality, format, and background transparency. Generated images are stored in session history (up to 10 entries). Unlike the chat features, Image Generation is standalone — it does NOT require LLM Chat to be enabled.
When Content Generator is enabled, the /dashboard/content route provides a template-based text generation interface with five templates (email, product description, blog outline, social media, marketing copy). Users can configure tone, language, and length. The generator uses SSE streaming to deliver results progressively. Like Image Generation, the Content Generator is standalone — it does NOT require LLM Chat to be enabled.
Feature flags are checked at two levels:
- Page level — shouldShowRAGChat() / shouldShowLLMChat() / shouldShowImageGen() / shouldShowContentGen() guard functions call notFound() if disabled
- API level — guardRAGChat() / guardLLMChat() / guardAudioInput() / guardImageGen() / guardContentGen() return 404 responses for disabled features
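A minimal sketch of the API-level guard, assuming it reads the same flag expression shown in the feature configuration above (the exact signature of guardLLMChat() in route-guards.ts may differ):

```typescript
// Illustrative API-level guard: return a 404 Response when the feature is
// disabled, or null to signal the route handler should continue.
const llmChatEnabled = process.env.NEXT_PUBLIC_AI_LLM_CHAT_ENABLED !== 'false'

export function guardLLMChat(): Response | null {
  if (!llmChatEnabled) {
    return new Response(JSON.stringify({ error: 'Not found' }), {
      status: 404,
      headers: { 'Content-Type': 'application/json' },
    })
  }
  return null // feature enabled; the route handler proceeds
}
```

Returning 404 (rather than 403) for disabled features means a disabled route is indistinguishable from a nonexistent one.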
Directory Structure
All AI-related code lives in apps/boilerplate/src/lib/ai/ with API routes in apps/boilerplate/src/app/api/ai/:
apps/boilerplate/src/
├── lib/
│ └── ai/
│ ├── config.ts # Central config, models, rate limits, env validation
│ ├── types.ts # Shared TypeScript types (Message, Provider, etc.)
│ ├── feature-flags.ts # AI_CHAT_FEATURES, guard functions
│ ├── route-guards.ts # API + page guards for feature flags
│ ├── ai-service.ts # High-level service (wraps provider factory)
│ ├── provider-factory.ts # Creates/caches provider instances
│ ├── providers/
│ │ ├── base-provider.ts # Abstract class with retry logic
│ │ ├── openai.ts # OpenAI implementation
│ │ ├── anthropic.ts # Anthropic implementation
│ │ ├── google.ts # Google AI implementation
│ │ └── xai.ts # xAI implementation
│ ├── rag-service.ts # RAG pipeline (search → context → answer)
│ ├── rag-search.ts # pgvector similarity search
│ ├── rate-limiter.ts # Global burst + tier-based limiting
│ ├── usage-tracker.ts # Token/cost tracking to database
│ ├── image-gen/
│ │ ├── config.ts # Model configs, sizes, quality options
│ │ ├── service.ts # OpenAI image generation service
│ │ └── types.ts # Image generation TypeScript types
│ ├── content-gen/
│ │ ├── config.ts # Template definitions, prompt builder, UI labels
│ │ ├── service.ts # Content generation AI service wrapper
│ │ └── types.ts # Content generator TypeScript types
│ ├── sse-parser.ts # Shared SSE stream parser with error handling
│ ├── quick-prompts.ts # Configurable suggestion buttons
│ └── errors.ts # Error class hierarchy
├── hooks/
│ ├── use-ai.ts # React hooks (useAIChat, useAICompletion, etc.)
│ ├── use-image-gen.ts # Image generation hook with history
│ ├── use-content-generator.ts # Content generator hook with SSE streaming
│ └── use-audio-recorder.ts # Audio recording hook (MediaRecorder API)
├── app/
│ └── api/
│ └── ai/
│ ├── stream/route.ts # POST — SSE streaming endpoint
│ ├── chat/route.ts # POST — Synchronous chat endpoint
│ ├── speech-to-text/route.ts # POST — Audio transcription (Whisper)
│ ├── image-gen/route.ts # POST — Image generation endpoint
│ ├── generate-content/route.ts # POST — Content generation endpoint
│ ├── usage/route.ts # GET — Usage statistics endpoint
│ └── rag/
│ ├── ask/route.ts # POST — RAG question answering
│ └── conversations/ # CRUD for conversation history
└── components/
└── ai/
├── chat/ # Chat UI components (12 components)
├── image-gen/ # Image generation UI (4 components)
└── content-gen/ # Content generator UI (5 components)
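The shared SSE parser (sse-parser.ts) listed above can be sketched roughly as a function that extracts text deltas from data: lines. The payload shapes handled here (an OpenAI-style delta and a plain text field) are assumptions; the real parser supports five provider formats:

```typescript
// Rough sketch of SSE parsing: split a chunk into `data:` payloads,
// skip the [DONE] sentinel, and pull out the text delta if present.
export function parseSSEChunk(chunk: string): string[] {
  const deltas: string[] = []
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data:')) continue
    const payload = line.slice(5).trim()
    if (payload === '[DONE]') continue
    try {
      const json = JSON.parse(payload)
      // Hypothetical shapes: OpenAI-style delta, or a flat text field
      const text = json.choices?.[0]?.delta?.content ?? json.text
      if (typeof text === 'string') deltas.push(text)
    } catch {
      // ignore malformed or partial lines (a real parser buffers these)
    }
  }
  return deltas
}
```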
Environment Variables
| Variable | Required | Purpose |
|---|---|---|
| AI_PROVIDER | No | Force a specific provider (openai, anthropic, google, xai) |
| AI_MODEL | No | Override default model for the active provider |
| AI_API_KEY | No | Provider-neutral API key — works with any AI_PROVIDER (provider-specific keys take priority) |
| OPENAI_API_KEY | Yes* | OpenAI API key (also required for RAG embeddings) |
| ANTHROPIC_API_KEY | Yes* | Anthropic API key |
| GOOGLE_AI_API_KEY | Yes* | Google AI API key |
| XAI_API_KEY | Yes* | xAI API key |
| AI_EMBEDDING_MODEL | No | Embedding model for RAG (default: text-embedding-3-small) |
| NEXT_PUBLIC_AI_RAG_CHAT_ENABLED | No | Enable RAG Chat (default: true) |
| NEXT_PUBLIC_AI_LLM_CHAT_ENABLED | No | Enable LLM Chat (default: true) |
| NEXT_PUBLIC_AI_VISION_ENABLED | No | Enable image analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_AUDIO_INPUT_ENABLED | No | Enable voice input in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_PDF_CHAT_ENABLED | No | Enable PDF analysis in LLM Chat (default: true) |
| NEXT_PUBLIC_AI_IMAGE_GEN_ENABLED | No | Enable Image Generation (default: true) |
| NEXT_PUBLIC_AI_CONTENT_GEN_ENABLED | No | Enable Content Generator (default: true) |
| UPSTASH_REDIS_REST_URL | No | Redis URL for rate limiting |
| UPSTASH_REDIS_REST_TOKEN | No | Redis token for rate limiting |
*At least one provider API key is required. The system auto-detects the provider from available keys.
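As an example, the minimal single-key setup described below might look like this in .env (key values are placeholders):

```shell
# One provider, one key
AI_PROVIDER=anthropic
AI_API_KEY=sk-ant-xxxxxxxx

# Only needed for RAG Chat (embeddings use OpenAI)
OPENAI_API_KEY=sk-xxxxxxxx
```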
To get AI working, you only need one API key. The simplest setup: set AI_PROVIDER and AI_API_KEY — one key for everything. Alternatively, use provider-specific keys like ANTHROPIC_API_KEY for Claude. For RAG Chat, also add OPENAI_API_KEY — OpenAI is required for embeddings (text-embedding-3-small). Add more provider keys later to enable fallback and provider selection.
Key Files
| File | Purpose |
|---|---|
| apps/boilerplate/src/lib/ai/config.ts | Central configuration — models, rate limits, provider detection, env validation |
| apps/boilerplate/src/lib/ai/feature-flags.ts | Feature flag definitions and guard functions |
| apps/boilerplate/src/lib/ai/ai-service.ts | High-level AI service (wraps provider factory, calculates costs) |
| apps/boilerplate/src/lib/ai/provider-factory.ts | Creates, caches, and selects provider instances |
| apps/boilerplate/src/lib/ai/rag-service.ts | RAG pipeline — query rewriting, search, context assembly, answer generation |
| apps/boilerplate/src/lib/ai/rag-search.ts | pgvector similarity search with OpenAI embeddings |
| apps/boilerplate/src/lib/ai/rate-limiter.ts | Two-layer rate limiting (global burst, tier-based) |
| apps/boilerplate/src/lib/credits/credit-costs.ts | Per-operation credit costs (21 operation types) |
| apps/boilerplate/src/hooks/use-ai.ts | React hooks — useAIChat, useAICompletion, useAIQuery, useAIStream |
| apps/boilerplate/src/app/api/ai/stream/route.ts | SSE streaming endpoint with full cost management pipeline |