Build an AI Chatbot with Next.js and OpenAI
Step-by-step guide to building a streaming AI chatbot with Next.js App Router, Vercel AI SDK, and OpenAI GPT — with chat UI, message history, and rate limiting.
In this tutorial, you will build a fully functional AI chatbot using Next.js App Router, the Vercel AI SDK, and OpenAI's GPT models. By the end, you will have a streaming chat interface that displays responses token by token, maintains conversation history, handles errors gracefully, and includes rate limiting to protect your API key. The complete application runs on a single Next.js deployment with no additional backend required.
TL;DR
Use the Vercel AI SDK's useChat hook on the client and streamText on the server to build a streaming chatbot with minimal boilerplate. The SDK handles message state, streaming, and error recovery. Add rate limiting with a simple middleware to protect your OpenAI API costs in production.
Prerequisites
Before starting, make sure you have the following:
- Node.js 18 or later installed
- An OpenAI API key (sign up at platform.openai.com)
- Basic knowledge of React and the Next.js App Router
- A code editor such as VS Code
Step 1: Initialize the Next.js Project
Start by creating a fresh Next.js application with TypeScript and the App Router.
npx create-next-app@latest ai-chatbot --typescript --tailwind --app --src-dir
cd ai-chatbot

Install the Vercel AI SDK and the OpenAI provider:

npm install ai @ai-sdk/openai

The ai package provides the core SDK with React hooks and streaming utilities. The @ai-sdk/openai package is the provider that connects to OpenAI's API.

Create a .env.local file in the project root with your API key:

OPENAI_API_KEY=sk-your-api-key-here

Never commit this file to version control. The .gitignore generated by create-next-app already excludes .env.local.
Step 2: Create the API Route Handler
The API route is where the server communicates with OpenAI and streams the response back to the client. Create a new route handler at src/app/api/chat/route.ts.
// src/app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const maxDuration = 30;

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4o"),
    system:
      "You are a helpful assistant. Be concise and clear in your responses.",
    messages,
  });

  return result.toDataStreamResponse();
}

Here is what each piece does. The streamText function sends the conversation history to OpenAI and returns a streaming result object. The system parameter sets the AI's behavior for the entire conversation. The toDataStreamResponse() method converts the stream into a format the Vercel AI SDK client can parse. The maxDuration export tells Vercel's serverless runtime to allow up to 30 seconds for the response, which is important because AI responses can take time to generate.
The messages array contains the full conversation history. Each message has a role (either "user" or "assistant") and content (the text). The SDK automatically manages this array on the client side.
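Concretely, the request body grows like this over the course of a conversation. The sketch below is illustrative (the SDK's actual Message type also carries an id and metadata), and the validation helper is a hypothetical mirror of the body check added in Step 6:

```typescript
// Simplified shape of the history the client posts to /api/chat.
// (Illustrative: the SDK's real Message type also includes an id.)
type ChatMessage = {
  role: "user" | "assistant";
  content: string;
};

// Example history after two turns plus a new user message.
const history: ChatMessage[] = [
  { role: "user", content: "What is streaming?" },
  { role: "assistant", content: "It sends the response token by token." },
  { role: "user", content: "Why does that matter?" },
];

// Hypothetical validation helper: the body must be an array of
// { role, content } objects with known roles and string content.
function isValidHistory(value: unknown): value is ChatMessage[] {
  return (
    Array.isArray(value) &&
    value.every(
      (m) =>
        typeof m === "object" &&
        m !== null &&
        ["user", "assistant"].includes((m as ChatMessage).role) &&
        typeof (m as ChatMessage).content === "string"
    )
  );
}
```

Validating the shape on the server matters because the endpoint is publicly reachable; nothing guarantees the body came from your own client.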
Step 3: Build the Chat UI Component
Create a chat component that uses the useChat hook to manage the conversation state and render messages. Create src/components/Chat.tsx:
// src/components/Chat.tsx
"use client";

import { useChat } from "ai/react";
import { useRef, useEffect } from "react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, error } =
    useChat({
      api: "/api/chat",
    });

  const messagesEndRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    messagesEndRef.current?.scrollIntoView({ behavior: "smooth" });
  }, [messages]);

  return (
    <div className="flex flex-col h-[600px] max-w-2xl mx-auto border rounded-lg">
      {/* Message List */}
      <div className="flex-1 overflow-y-auto p-4 space-y-4">
        {messages.length === 0 && (
          <div className="text-center text-gray-500 mt-8">
            <p className="text-lg font-medium">Welcome to AI Chat</p>
            <p className="text-sm">Send a message to start the conversation.</p>
          </div>
        )}
        {messages.map((message) => (
          <div
            key={message.id}
            className={`flex ${
              message.role === "user" ? "justify-end" : "justify-start"
            }`}
          >
            <div
              className={`max-w-[80%] rounded-lg px-4 py-2 ${
                message.role === "user"
                  ? "bg-blue-600 text-white"
                  : "bg-gray-100 text-gray-900"
              }`}
            >
              <p className="text-sm whitespace-pre-wrap">{message.content}</p>
            </div>
          </div>
        ))}
        {isLoading && (
          <div className="flex justify-start">
            <div className="bg-gray-100 rounded-lg px-4 py-2">
              <div className="flex space-x-1">
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-100" />
                <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce delay-200" />
              </div>
            </div>
          </div>
        )}
        <div ref={messagesEndRef} />
      </div>

      {/* Error Display */}
      {error && (
        <div className="px-4 py-2 bg-red-50 border-t border-red-200">
          <p className="text-sm text-red-600">
            Something went wrong. Please try again.
          </p>
        </div>
      )}

      {/* Input Form */}
      <form
        onSubmit={handleSubmit}
        className="border-t p-4 flex items-center gap-2"
      >
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Type your message..."
          className="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
          disabled={isLoading}
        />
        <button
          type="submit"
          disabled={isLoading || !input.trim()}
          className="bg-blue-600 text-white px-4 py-2 rounded-lg hover:bg-blue-700 disabled:opacity-50 disabled:cursor-not-allowed"
        >
          Send
        </button>
      </form>
    </div>
  );
}

The useChat hook does the heavy lifting here. It manages the messages array, handles form submission, sends requests to your API route, parses the streaming response, and updates the UI in real time. The isLoading flag lets you show a typing indicator while the AI generates its response.
Step 4: Add the Chat to Your Page
Update your main page to render the Chat component:
// src/app/page.tsx
import Chat from "@/components/Chat";

export default function Home() {
  return (
    <main className="min-h-screen p-8">
      <h1 className="text-3xl font-bold text-center mb-8">AI Chatbot</h1>
      <Chat />
    </main>
  );
}

At this point, you can run npm run dev and test the chatbot. Messages stream in token by token, and the conversation history is maintained automatically.
Step 5: Add Message History Persistence
The useChat hook stores messages in React state, which means they disappear on page refresh. To persist conversations, store them in localStorage and restore them on mount.
// src/hooks/useChatWithHistory.ts
"use client";

import { useChat } from "ai/react";
import { useEffect, useCallback } from "react";

const STORAGE_KEY = "chat-messages";

export function useChatWithHistory() {
  const chatHelpers = useChat({
    api: "/api/chat",
    initialMessages: getStoredMessages(),
  });

  // Persist messages whenever they change
  useEffect(() => {
    if (chatHelpers.messages.length > 0) {
      localStorage.setItem(STORAGE_KEY, JSON.stringify(chatHelpers.messages));
    }
  }, [chatHelpers.messages]);

  const clearHistory = useCallback(() => {
    localStorage.removeItem(STORAGE_KEY);
    chatHelpers.setMessages([]);
  }, [chatHelpers]);

  return { ...chatHelpers, clearHistory };
}

function getStoredMessages() {
  if (typeof window === "undefined") return [];
  try {
    const stored = localStorage.getItem(STORAGE_KEY);
    return stored ? JSON.parse(stored) : [];
  } catch {
    return [];
  }
}

Replace the useChat call in your Chat component with useChatWithHistory, and add a "Clear History" button to let users start fresh:

const { messages, input, handleInputChange, handleSubmit, isLoading, error, clearHistory } =
  useChatWithHistory();

For production applications where you need server-side persistence, store messages in a database keyed by user session. The Vercel AI SDK supports an onFinish callback in the route handler where you can save completed messages.
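As a rough sketch of that server-side approach (here getSessionId and saveConversation are hypothetical placeholders for your own session and database code, not SDK APIs):

```typescript
// Hypothetical: persist the finished exchange via streamText's onFinish.
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

// Placeholders for your own auth and persistence layers.
declare function getSessionId(req: Request): string;
declare function saveConversation(
  sessionId: string,
  messages: { role: string; content: string }[]
): Promise<void>;

export async function POST(req: Request) {
  const { messages } = await req.json();
  const sessionId = getSessionId(req);

  const result = streamText({
    model: openai("gpt-4o"),
    messages,
    // Runs once after the full assistant response has been generated.
    onFinish: async ({ text }) => {
      await saveConversation(sessionId, [
        ...messages,
        { role: "assistant", content: text },
      ]);
    },
  });

  return result.toDataStreamResponse();
}
```

Because onFinish fires after streaming completes, the database write does not delay the tokens the user sees.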
Step 6: Implement Error Handling
Robust error handling covers network failures, API errors, and invalid responses. Update the route handler to catch specific error types:
// src/app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export const maxDuration = 30;

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    if (!messages || !Array.isArray(messages)) {
      return new Response("Invalid request body", { status: 400 });
    }

    // Limit conversation length to control costs
    const trimmedMessages = messages.slice(-20);

    const result = streamText({
      model: openai("gpt-4o"),
      system:
        "You are a helpful assistant. Be concise and clear in your responses.",
      messages: trimmedMessages,
    });

    return result.toDataStreamResponse();
  } catch (error: unknown) {
    if (error instanceof Error) {
      if (error.message.includes("rate_limit")) {
        return new Response("Rate limit exceeded. Please wait and try again.", {
          status: 429,
        });
      }
      if (error.message.includes("invalid_api_key")) {
        return new Response("Server configuration error.", { status: 500 });
      }
    }
    return new Response("An unexpected error occurred.", { status: 500 });
  }
}

The trimmedMessages line is important: it limits the conversation context sent to OpenAI to the last 20 messages. Without this, long conversations send increasingly large payloads, which increases latency and cost. In production, you might implement a more sophisticated approach that summarizes older messages instead of discarding them.
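One hedged sketch of that idea: keep the most recent messages verbatim and collapse everything older into a single summary message. The summarize parameter below is a placeholder for whatever summarizer you choose (a cheap model call, for instance):

```typescript
type ChatMessage = {
  role: "user" | "assistant" | "system";
  content: string;
};

// Sketch: instead of discarding older messages, fold them into one
// system message so the model retains some long-range context.
function compactHistory(
  messages: ChatMessage[],
  keepRecent: number,
  summarize: (older: ChatMessage[]) => string
): ChatMessage[] {
  if (messages.length <= keepRecent) return messages;
  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(-keepRecent);
  return [
    {
      role: "system",
      content: `Summary of earlier conversation: ${summarize(older)}`,
    },
    ...recent,
  ];
}
```

In the route handler, you would call compactHistory(messages, 20, yourSummarizer) in place of the plain slice; the tradeoff is one extra (cheaper) model call whenever the history overflows.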
On the client side, the useChat hook provides an error object you can display, and an onError callback for custom handling:
const { messages, input, handleInputChange, handleSubmit, isLoading, error } =
  useChat({
    api: "/api/chat",
    onError: (error) => {
      console.error("Chat error:", error);
    },
  });

Step 7: Add Rate Limiting
Rate limiting protects your API key from abuse. Here is a simple in-memory rate limiter suitable for single-instance deployments. For multi-instance production setups, use Redis instead.
// src/lib/rate-limit.ts
const requests = new Map<string, { count: number; resetTime: number }>();

const WINDOW_MS = 60 * 1000; // 1 minute
const MAX_REQUESTS = 10; // 10 requests per minute

export function rateLimit(identifier: string): {
  allowed: boolean;
  remaining: number;
} {
  const now = Date.now();
  const record = requests.get(identifier);

  if (!record || now > record.resetTime) {
    requests.set(identifier, { count: 1, resetTime: now + WINDOW_MS });
    return { allowed: true, remaining: MAX_REQUESTS - 1 };
  }

  if (record.count >= MAX_REQUESTS) {
    return { allowed: false, remaining: 0 };
  }

  record.count++;
  return { allowed: true, remaining: MAX_REQUESTS - record.count };
}

Integrate the rate limiter into your route handler:
// src/app/api/chat/route.ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { rateLimit } from "@/lib/rate-limit";
import { headers } from "next/headers";

export const maxDuration = 30;

export async function POST(req: Request) {
  const headersList = await headers();
  const ip = headersList.get("x-forwarded-for") ?? "anonymous";
  const { allowed, remaining } = rateLimit(ip);

  if (!allowed) {
    return new Response("Too many requests. Please wait a moment.", {
      status: 429,
      headers: { "X-RateLimit-Remaining": "0" },
    });
  }

  try {
    const { messages } = await req.json();
    const trimmedMessages = messages.slice(-20);

    const result = streamText({
      model: openai("gpt-4o"),
      system:
        "You are a helpful assistant. Be concise and clear in your responses.",
      messages: trimmedMessages,
    });

    return result.toDataStreamResponse({
      headers: { "X-RateLimit-Remaining": remaining.toString() },
    });
  } catch {
    return new Response("An unexpected error occurred.", { status: 500 });
  }
}

For a production deployment, replace the in-memory Map with a Redis-backed solution such as @upstash/ratelimit, which works well with serverless functions where instances can spin up and down.
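A minimal sketch of that swap, assuming an Upstash Redis database with its standard UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN environment variables set (the filename is hypothetical):

```typescript
// src/lib/rate-limit-redis.ts (hypothetical filename)
// Redis-backed limiter that holds state across serverless instances.
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  // Sliding window: 10 requests per minute per identifier.
  limiter: Ratelimit.slidingWindow(10, "1 m"),
});

export async function checkRateLimit(identifier: string) {
  const { success, remaining } = await ratelimit.limit(identifier);
  return { allowed: success, remaining };
}
```

The route handler change is small: await checkRateLimit(ip) instead of calling the in-memory rateLimit, keeping the same { allowed, remaining } shape.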
Step 8: Polish the Chat UI
Add a few finishing touches to make the chatbot feel production-ready. Add markdown rendering for AI responses using react-markdown:
npm install react-markdown

In your Chat component, replace the plain text rendering for assistant messages:

import ReactMarkdown from "react-markdown";

// Inside the message rendering:
{message.role === "assistant" ? (
  <div className="prose prose-sm max-w-none">
    <ReactMarkdown>{message.content}</ReactMarkdown>
  </div>
) : (
  <p className="text-sm whitespace-pre-wrap">{message.content}</p>
)}

This lets the AI format responses with headings, code blocks, lists, and other markdown elements, which significantly improves readability for technical answers. Note that recent versions of react-markdown no longer accept a className prop, so the typography classes go on a wrapper div.
The Complete Architecture
Here is how the pieces fit together:
- The user types a message in the Chat component
- The useChat hook adds the message to the local state and sends the full message history to /api/chat
- The route handler checks rate limits, trims the message history, and calls OpenAI via streamText
- OpenAI generates tokens one at a time, which stream back through the route handler as server-sent events
- The useChat hook receives each token and updates the assistant message in real time
- Message history is persisted to localStorage so it survives a page refresh
The streaming architecture means users see the first tokens within a few hundred milliseconds, even though the full response might take several seconds to generate.
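To make the streaming path concrete, here is a minimal sketch of consuming a streamed Response body with the Fetch API. It approximates what useChat does under the hood (the SDK's data stream protocol adds its own framing on top of the raw bytes):

```typescript
// Read a streamed HTTP response chunk by chunk, invoking a callback
// with each decoded piece of text as it arrives.
async function readStream(
  res: Response,
  onChunk: (text: string) => void
): Promise<void> {
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true }));
  }
}
```

The { stream: true } option matters: it keeps multi-byte characters intact when they happen to be split across chunk boundaries.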
Next Steps
Once your chatbot is working, consider these enhancements:
- Authentication: Add user authentication with Clerk or NextAuth to associate conversations with user accounts
- Database storage: Replace localStorage with a database such as PostgreSQL to persist conversations server-side
- Multiple conversations: Let users create, switch between, and delete separate chat threads
- System prompt customization: Let users or admins configure the AI's personality and knowledge boundaries
- File uploads: Use OpenAI's vision capabilities to let users upload images for the AI to analyze
- Function calling: Give the AI the ability to call functions, look up data, or perform actions on behalf of the user
FAQ
What is the Vercel AI SDK?
The Vercel AI SDK is an open-source TypeScript library that provides React hooks and server utilities for building AI-powered streaming interfaces. It abstracts away the complexity of handling server-sent events, token streaming, and message state management, giving you hooks like useChat that handle the entire chat lifecycle.
How does streaming work in an AI chatbot?
Streaming allows the AI response to appear token by token in real time instead of waiting for the entire response to generate. The server sends partial responses as server-sent events, and the client renders each chunk as it arrives. This dramatically improves perceived performance since users see the response forming immediately.
Can I use models other than OpenAI with the Vercel AI SDK?
Yes, the Vercel AI SDK supports multiple providers including Anthropic Claude, Google Gemini, Mistral, Cohere, and any OpenAI-compatible API. You swap the provider in your route handler while keeping the same client-side useChat hook, making it easy to switch or test different models.
How do I handle rate limiting for an AI chatbot?
Rate limiting prevents abuse and controls API costs. A common approach uses an in-memory store or Redis to track requests per user per time window. In Next.js, you implement this in your API route handler, returning a 429 status code when the limit is exceeded, and the client displays an appropriate message.
How much does it cost to run an AI chatbot with OpenAI?
Costs depend on the model and usage. GPT-3.5-turbo is significantly cheaper than GPT-4. A typical chat message with context costs a few cents with GPT-4 and fractions of a cent with GPT-3.5-turbo. Implementing message trimming, caching frequent responses, and rate limiting helps control costs in production.