AI chatbots have become one of the most common features in modern web applications. Whether you are building a customer support assistant, a coding helper, or a conversational interface for your product, the core pattern is the same: accept user input, send it to an AI model, stream the response back, and display it in a chat interface. SvelteKit is an excellent framework for building chatbots because its server endpoints handle the API communication, its streaming support delivers responses token by token, and Svelte's reactive UI makes the chat interface smooth and responsive. This guide walks through building a complete AI chatbot from scratch with SvelteKit, covering API integration, streaming, the chat UI, and conversation persistence.
Overview
The chatbot we are building has these features:
- A chat interface with message history
- Streaming responses from an AI model (OpenAI or Anthropic)
- Server-side API route that proxies requests to the AI provider
- Conversation history stored in memory (with notes on persistent storage)
- Markdown rendering for AI responses
The architecture is straightforward: the SvelteKit frontend sends messages to a +server.ts API endpoint, which forwards them to the AI provider's API and streams the response back to the browser.
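Concretely, the payload the frontend posts to that endpoint is just a JSON object with a messages array. The sketch below models it with a small shared type; the Message shape mirrors what the rest of this guide uses, while ChatRequest and buildRequestBody are illustrative names, not part of SvelteKit or any SDK.

```typescript
// Minimal sketch of the request payload the client sends to /api/chat.
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  messages: Message[];
}

// Serialize the conversation history into the JSON body for a POST request.
function buildRequestBody(messages: Message[]): string {
  const payload: ChatRequest = { messages };
  return JSON.stringify(payload);
}

const body = buildRequestBody([
  { role: 'user', content: 'What is SvelteKit?' },
  { role: 'assistant', content: 'A full-stack framework for Svelte.' },
  { role: 'user', content: 'Does it support streaming?' }
]);
```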
Setting Up the Project
Start with a new SvelteKit project:
```bash
npx sv create chatbot
cd chatbot
npm install
```
Install the AI provider's SDK. We will use OpenAI's SDK, which works with both OpenAI and compatible APIs:
```bash
npm install openai
```
For markdown rendering in chat messages:
```bash
npm install marked
```
Add your API key to .env:
```
OPENAI_API_KEY=sk-...
```
Connecting to the AI API
Create a server endpoint that receives messages and returns a streaming response.
```typescript
// src/routes/api/chat/+server.ts
import { OPENAI_API_KEY } from '$env/static/private';
import OpenAI from 'openai';

const client = new OpenAI({ apiKey: OPENAI_API_KEY });

export async function POST({ request }) {
  const { messages } = await request.json();

  const stream = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'You are a helpful assistant. Be concise and clear in your responses.'
      },
      ...messages
    ],
    stream: true
  });

  // Convert the OpenAI stream to a ReadableStream
  const readable = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder();
      for await (const chunk of stream) {
        const content = chunk.choices[0]?.delta?.content;
        if (content) {
          controller.enqueue(encoder.encode(content));
        }
      }
      controller.close();
    }
  });

  return new Response(readable, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Transfer-Encoding': 'chunked'
    }
  });
}
```
This endpoint accepts an array of messages (the conversation history), sends them to OpenAI with streaming enabled, and pipes the token stream back to the client as a ReadableStream.
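To see the conversion in isolation, here is a self-contained sketch that swaps the OpenAI stream for a hand-built async iterable of chunk objects with the same `choices[0].delta.content` shape. The fake chunks and helper names (`fakeChunks`, `toReadableStream`, `drain`) are illustrative, not part of any SDK; the stream logic mirrors the endpoint above.

```typescript
// Fake chunks shaped like OpenAI's streaming delta objects (illustrative only).
async function* fakeChunks() {
  yield { choices: [{ delta: { content: 'Hello' } }] };
  yield { choices: [{ delta: {} }] }; // some chunks carry no text
  yield { choices: [{ delta: { content: ', world' } }] };
}

// Same conversion as the endpoint: enqueue each token as encoded bytes.
function toReadableStream(
  chunks: AsyncIterable<{ choices: { delta: { content?: string } }[] }>
) {
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      const encoder = new TextEncoder();
      for await (const chunk of chunks) {
        const content = chunk.choices[0]?.delta?.content;
        if (content) controller.enqueue(encoder.encode(content));
      }
      controller.close();
    }
  });
}

// Drain the stream back into a string, as a browser client would.
async function drain(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let out = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    out += decoder.decode(value, { stream: true });
  }
  return out;
}
```

Draining `toReadableStream(fakeChunks())` produces the concatenated text, confirming that chunks with empty deltas are skipped.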
Using Anthropic's Claude Instead
If you prefer Claude, the pattern is similar:
```typescript
// src/routes/api/chat/+server.ts
import { ANTHROPIC_API_KEY } from '$env/static/private';
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({ apiKey: ANTHROPIC_API_KEY });

export async function POST({ request }) {
  const { messages } = await request.json();

  // messages.stream() returns a MessageStream directly (no await needed)
  const stream = client.messages.stream({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 4096,
    system: 'You are a helpful assistant. Be concise and clear.',
    messages
  });

  const readable = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder();
      for await (const event of stream) {
        if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
          controller.enqueue(encoder.encode(event.delta.text));
        }
      }
      controller.close();
    }
  });

  return new Response(readable, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Transfer-Encoding': 'chunked'
    }
  });
}
```
Streaming Responses
The key to a good chatbot experience is streaming. Instead of waiting for the entire response (which can take 5-15 seconds for long answers), you show each token as it arrives. This makes the chatbot feel alive and responsive.
On the client side, read the streaming response using a ReadableStream reader:
```typescript
// src/lib/chat.ts
export async function sendMessage(
  messages: { role: string; content: string }[],
  onToken: (token: string) => void
) {
  const response = await fetch('/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages })
  });

  if (!response.ok) {
    throw new Error(`Chat error: ${response.status}`);
  }

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let fullResponse = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const text = decoder.decode(value, { stream: true });
    fullResponse += text;
    onToken(text);
  }

  return fullResponse;
}
```
The onToken callback fires for each chunk of text, allowing the UI to update progressively.
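You can exercise this read loop without a running server by constructing a Response from an in-memory stream. The snippet below is a standalone sketch: `fakeResponse` and `readTokens` are illustrative stand-ins for the `/api/chat` endpoint and the loop inside `sendMessage`, showing `onToken` firing as chunks arrive.

```typescript
// Build a Response whose body streams a few chunks, standing in for /api/chat.
function fakeResponse(tokens: string[]): Response {
  const encoder = new TextEncoder();
  const body = new ReadableStream<Uint8Array>({
    start(controller) {
      for (const t of tokens) controller.enqueue(encoder.encode(t));
      controller.close();
    }
  });
  return new Response(body);
}

// The same reader loop sendMessage uses, with onToken invoked per chunk.
async function readTokens(
  response: Response,
  onToken: (token: string) => void
): Promise<string> {
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let full = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const text = decoder.decode(value, { stream: true });
    full += text;
    onToken(text);
  }
  return full;
}

const received: string[] = [];
const full = await readTokens(fakeResponse(['Sv', 'elte', 'Kit']), (t) =>
  received.push(t)
);
```

Note that chunk boundaries are not guaranteed to survive transport, which is why the UI should treat each callback as "append this text" rather than "this is one token".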
Building the Chat UI
Now let's build the chat interface. This is where Svelte 5's reactivity shines — the UI updates smoothly as tokens stream in.
```svelte
<!-- src/routes/+page.svelte -->
<script lang="ts">
  import { sendMessage } from '$lib/chat';
  import { marked } from 'marked';
  import { tick } from 'svelte';

  interface Message {
    role: 'user' | 'assistant';
    content: string;
  }

  let messages = $state<Message[]>([]);
  let input = $state('');
  let isLoading = $state(false);
  let chatContainer = $state<HTMLElement>();

  async function scrollToBottom() {
    await tick();
    if (chatContainer) {
      chatContainer.scrollTop = chatContainer.scrollHeight;
    }
  }

  async function handleSubmit(e: Event) {
    e.preventDefault();
    if (!input.trim() || isLoading) return;

    const userMessage: Message = { role: 'user', content: input.trim() };
    messages.push(userMessage);
    input = '';
    isLoading = true;

    // Add an empty assistant message that we will fill with streaming tokens.
    // Read it back from the $state array so we hold the reactive proxy:
    // mutating the original object literal would bypass reactivity.
    messages.push({ role: 'assistant', content: '' });
    const assistantMessage = messages[messages.length - 1];
    await scrollToBottom();

    try {
      await sendMessage(
        messages.slice(0, -1).map((m) => ({ role: m.role, content: m.content })),
        (token) => {
          assistantMessage.content += token;
          scrollToBottom();
        }
      );
    } catch (err) {
      assistantMessage.content = 'Sorry, something went wrong. Please try again.';
    } finally {
      isLoading = false;
    }
  }

  function renderMarkdown(content: string): string {
    return marked.parse(content, { async: false }) as string;
  }
</script>

<div class="chat-app">
  <div class="chat-messages" bind:this={chatContainer}>
    {#if messages.length === 0}
      <div class="empty-state">
        <h2>How can I help you?</h2>
        <p>Send a message to start a conversation.</p>
      </div>
    {/if}

    {#each messages as message}
      <div class="message {message.role}">
        <div class="message-avatar">
          {message.role === 'user' ? 'You' : 'AI'}
        </div>
        <div class="message-content">
          {#if message.role === 'assistant'}
            {@html renderMarkdown(message.content)}
          {:else}
            <p>{message.content}</p>
          {/if}
        </div>
      </div>
    {/each}

    {#if isLoading && messages.at(-1)?.content === ''}
      <div class="typing-indicator">
        <span></span><span></span><span></span>
      </div>
    {/if}
  </div>

  <form class="chat-input" onsubmit={handleSubmit}>
    <input
      type="text"
      bind:value={input}
      placeholder="Type a message..."
      disabled={isLoading}
    />
    <button type="submit" disabled={isLoading || !input.trim()}>
      Send
    </button>
  </form>
</div>

<style>
  .chat-app {
    display: flex;
    flex-direction: column;
    height: 100vh;
    max-width: 800px;
    margin: 0 auto;
  }
  .chat-messages {
    flex: 1;
    overflow-y: auto;
    padding: 20px;
  }
  .message {
    display: flex;
    gap: 12px;
    margin-bottom: 20px;
  }
  .message-avatar {
    width: 36px;
    height: 36px;
    border-radius: 50%;
    display: flex;
    align-items: center;
    justify-content: center;
    font-size: 0.75rem;
    font-weight: 600;
    flex-shrink: 0;
    background: #333;
    color: #fff;
  }
  .message.user .message-avatar {
    background: #0A84FF;
  }
  .message-content {
    flex: 1;
    line-height: 1.6;
  }
  .chat-input {
    display: flex;
    gap: 8px;
    padding: 16px 20px;
    border-top: 1px solid #333;
  }
  .chat-input input {
    flex: 1;
    padding: 12px 16px;
    border: 1px solid #333;
    border-radius: 8px;
    background: transparent;
    color: inherit;
    font-size: 1rem;
  }
  .chat-input button {
    padding: 12px 24px;
    background: #0A84FF;
    color: white;
    border: none;
    border-radius: 8px;
    font-weight: 600;
    cursor: pointer;
  }
  .chat-input button:disabled {
    opacity: 0.5;
    cursor: not-allowed;
  }
  .empty-state {
    text-align: center;
    padding: 60px 20px;
    color: #888;
  }
  .typing-indicator {
    display: flex;
    gap: 4px;
    padding: 8px 16px;
  }
  .typing-indicator span {
    width: 8px;
    height: 8px;
    border-radius: 50%;
    background: #666;
    animation: bounce 1.4s infinite ease-in-out;
  }
  .typing-indicator span:nth-child(2) { animation-delay: 0.2s; }
  .typing-indicator span:nth-child(3) { animation-delay: 0.4s; }
  @keyframes bounce {
    0%, 80%, 100% { transform: translateY(0); }
    40% { transform: translateY(-6px); }
  }
</style>
```
Notice how Svelte 5's $state makes the streaming experience seamless. We push to the messages array and append to the assistant message's content on each token, and the deep reactivity proxy tracks these mutations and updates the DOM automatically. One caveat: mutations are only tracked when they go through the proxy, so the assistant message must be referenced via the $state array (for example, messages[messages.length - 1]) rather than through the original object literal.
Storing Conversation History
The chatbot above stores messages in memory — they disappear on page refresh. For persistent conversations, store them in a database.
With Supabase
```sql
CREATE TABLE conversations (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  user_id UUID REFERENCES auth.users(id),
  title TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE chat_messages (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  conversation_id UUID REFERENCES conversations(id) ON DELETE CASCADE,
  role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
  content TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```
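Conversation history is always fetched by conversation and ordered chronologically, so a composite index on that pair keeps loads fast as messages accumulate. This index is a common addition, not something the schema above requires; the index name is arbitrary.

```sql
-- Speeds up loading one conversation's messages in chronological order.
CREATE INDEX idx_chat_messages_conversation
  ON chat_messages (conversation_id, created_at);
```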
Save messages after each exchange:
```typescript
// In your API route or a separate server function
import type { SupabaseClient } from '@supabase/supabase-js';

async function saveMessage(
  supabase: SupabaseClient,
  conversationId: string,
  role: string,
  content: string
) {
  await supabase.from('chat_messages').insert({
    conversation_id: conversationId,
    role,
    content
  });
}
```
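The save order matters: persist the user turn before calling the model (so it survives a failed generation), and persist the assistant turn only after the stream has finished. A minimal sketch of that sequencing, where `save` stands in for the saveMessage helper above and `runModel` for the streaming call (both names are illustrative):

```typescript
// Persist both sides of one chat exchange in the right order.
async function persistExchange(
  save: (role: 'user' | 'assistant', content: string) => Promise<void>,
  userContent: string,
  runModel: () => Promise<string>
): Promise<string> {
  // Save the user turn first so it survives even if the model call fails.
  await save('user', userContent);
  // Generate (or fully stream) the assistant reply.
  const reply = await runModel();
  // Persist the completed reply only once streaming has finished.
  await save('assistant', reply);
  return reply;
}
```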
Load conversation history when the page loads:
```typescript
// src/routes/chat/[id]/+page.server.ts
export async function load({ params, locals: { supabase } }) {
  const { data: messages } = await supabase
    .from('chat_messages')
    .select('role, content, created_at')
    .eq('conversation_id', params.id)
    .order('created_at', { ascending: true });

  return { messages: messages ?? [], conversationId: params.id };
}
```
Deploying Your AI Chatbot
Deploy to Vercel with the standard SvelteKit adapter:
```bash
npm install @sveltejs/adapter-vercel
```
Set your AI provider API key as an environment variable in the Vercel dashboard. The server endpoint runs as a serverless function, and the streaming response works natively with Vercel's Edge and Serverless runtimes.
For production, consider adding:
- Rate limiting to prevent API cost abuse
- Input validation to sanitize user messages
- Sanitization of rendered markdown (marked does not sanitize its output, so run AI responses through a library such as DOMPurify before injecting them with {@html})
- Error boundaries to handle API failures gracefully
- Token counting to manage context window limits
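The last point deserves a sketch. Models have a fixed context window, and a long conversation will eventually overflow it. A crude but effective guard is to trim old messages before each request, keeping the most recent turns under a character budget (characters as a rough proxy for tokens, on the order of 4 characters per token for English text). The function name and budget below are illustrative, not from any library:

```typescript
interface Message {
  role: 'user' | 'assistant';
  content: string;
}

// Keep the most recent messages whose combined length fits the budget.
// Walks backwards so the newest turns always survive.
function trimHistory(messages: Message[], maxChars: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const len = messages[i].content.length;
    // Stop once the budget is exceeded, but always keep at least one message.
    if (used + len > maxChars && kept.length > 0) break;
    kept.unshift(messages[i]);
    used += len;
  }
  return kept;
}
```

Call something like trimHistory(messages, 12000) before posting to the API; since the system prompt is added server-side in the endpoints above, it is never trimmed away.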
Building an AI chatbot with SvelteKit is straightforward once you understand the streaming pattern. If you want to skip the boilerplate and have an AI build the chatbot for you, Teta can generate the entire chat interface, API route, and database schema from a description — which is, admittedly, a chatbot building a chatbot.
FAQ
Can I build a chatbot with SvelteKit?
Yes, SvelteKit is well-suited for building AI chatbots. Its server endpoints (+server.ts) handle the API communication with AI providers, its streaming support delivers responses token by token, and Svelte's reactive UI makes the chat interface update smoothly. The full-stack nature of SvelteKit means your frontend, API routes, and streaming logic all live in one project.
Which AI API should I use?
The most popular choices are OpenAI's GPT-4o and Anthropic's Claude. Both offer excellent quality and streaming support. OpenAI has a larger ecosystem and more third-party integrations. Claude tends to produce longer, more thoughtful responses and handles complex instructions well. Both have similar pricing and TypeScript SDKs that work cleanly with SvelteKit server endpoints.
How do I stream responses in SvelteKit?
Create a +server.ts endpoint that returns a ReadableStream. Inside the stream, iterate over the AI provider's streaming response and enqueue each token. On the client, use response.body.getReader() to read chunks as they arrive and update the UI progressively. This gives the chatbot a "typing" effect that feels natural and responsive.
Can I add memory to my chatbot?
Yes, there are two approaches. For short-term memory, include the full conversation history in each API request — the AI model sees all previous messages and maintains context. For long-term memory across sessions, store conversations in a database (Supabase works well) and load them when the user returns. You can also implement summarization to compress long conversation histories and stay within the model's context window.