Resume RAG Chatbot
Resume RAG Chatbot is a production-focused software project built by Sadam Hussain using Next.js, React, TypeScript, and related technologies.
An AI-powered portfolio assistant built into a Next.js site that answers visitor questions about Sadam's background using retrieval-augmented generation. The system parses resume content into semantic chunks, stores embeddings in PostgreSQL with pgvector, retrieves the most relevant context for each question, and streams grounded responses through the Vercel AI SDK. The implementation combines a polished chat widget, typed server routes, Gemini-powered embeddings and generation, and guardrails that keep responses tied to documented experience while still translating that experience into useful team and product outcomes.
Tech Stack

Status: Production Ready
Type: Portfolio AI feature
Last Updated: April 18, 2026
Built a portfolio-native RAG chatbot that turns a static resume into an interactive assistant, allowing visitors to ask natural-language questions about experience, projects, technical depth, and delivery strengths directly from the site.
Designed a semantic ingestion pipeline that parses `resume.mdx` into structured chunks by heading boundaries, preserving contextual meaning while keeping chunk size appropriate for vector retrieval.
Implemented embeddings generation with Gemini and stored vectors in PostgreSQL using pgvector, enabling fast similarity search without adding a separate vector database to the architecture.
Integrated a typed AI SDK chat route that validates `UIMessage[]`, performs retrieval for the latest user query, converts validated messages to model messages, and streams responses back to the client in real time.
Added response guardrails so answers remain grounded in the resume while still explaining what Sadam has done and how that experience can help teams and products.
Built a floating chat widget with a contained scrollable panel, starter questions, streaming UI states, markdown rendering, and error handling to make the assistant feel like a product feature rather than a demo.
Introduced caching and graceful fallback behavior in the retrieval layer so the assistant can degrade safely when embeddings or retrieval fail, instead of collapsing the entire chat experience.
Structured the implementation for maintainability with clear separation between chunking, embeddings, retrieval, API orchestration, and UI rendering.
Resume RAG Chatbot
Overview
This project transforms a personal portfolio from a static presentation layer into an interactive AI experience. Instead of forcing visitors to scan a long resume or infer relevance from project cards alone, the site now includes a retrieval-augmented chatbot that can answer focused questions about Sadam's technical background, project history, and engineering strengths in conversational language.
The core idea was simple: make the portfolio easier to explore without sacrificing factual grounding. Rather than using a generic chatbot prompt and hoping the model responds accurately, the system retrieves relevant resume context for each question and constrains the answer to documented experience. The result is a more helpful visitor experience that still respects the boundaries of what's actually on the resume.
Business Context: Turning Portfolio Content into a Searchable Product Surface
A traditional portfolio asks visitors to do too much interpretation work. Recruiters, hiring managers, founders, and collaborators often have very specific questions:
- Has he led full-stack projects?
- Does he have backend depth or mostly frontend experience?
- What kinds of systems has he built?
- How might this background help a product team right now?
When those answers are buried across a resume page, project pages, and blog posts, the portfolio becomes a discovery problem. The chatbot solves that by acting as a guided interface over the most important experience content.
The core product problem: Static portfolio content is informative, but not queryable. Visitors need a faster way to map Sadam's experience to their own team or product needs.
What success looks like:
- Visitors can ask direct, role-relevant questions in natural language.
- Responses stay grounded in documented experience instead of generic AI filler.
- Answers connect technical background to likely delivery outcomes for teams and products.
- The chatbot feels integrated into the portfolio UX, not bolted on as a novelty feature.
What I Built
1. Resume Ingestion and Semantic Chunking
- MDX-Based Source of Truth: Reused the site's existing resume content (`public/resume/resume.mdx`) as the knowledge source instead of introducing a second content system.
- Heading-Aware Chunking: Built a parser that splits the resume into semantic chunks using section and subsection headings. This preserves meaning better than arbitrary fixed-size slicing because work experience, skills, and project summaries remain grouped by topic.
- Chunk Validation Rules: Enforced chunk-size guardrails so retrieval remains effective and chunks aren't too sparse or too bloated to be useful in prompts.
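A minimal sketch of what heading-aware chunking with size guardrails might look like. The `ResumeChunk` shape, the minimum and maximum size thresholds, and the function names are illustrative assumptions, not the project's actual values:

```typescript
// Hypothetical heading-aware chunker for an MDX resume. Assumes standard
// markdown "#"-style headings; thresholds are illustrative.

interface ResumeChunk {
  heading: string; // nearest section/subsection heading
  content: string; // text grouped under that heading
}

function chunkByHeadings(mdx: string, maxChars = 1500): ResumeChunk[] {
  const chunks: ResumeChunk[] = [];
  let heading = "Introduction";
  let buffer: string[] = [];

  const flush = () => {
    const content = buffer.join("\n").trim();
    // Guardrail: skip chunks too sparse to be useful retrieval context.
    if (content.length > 40) {
      // Split oversized sections so no chunk exceeds the prompt budget.
      for (let i = 0; i < content.length; i += maxChars) {
        chunks.push({ heading, content: content.slice(i, i + maxChars) });
      }
    }
    buffer = [];
  };

  for (const line of mdx.split("\n")) {
    const match = /^#{1,3}\s+(.*)/.exec(line);
    if (match) {
      flush(); // close out the previous section before starting a new one
      heading = match[1].trim();
    } else {
      buffer.push(line);
    }
  }
  flush();
  return chunks;
}
```

Because chunks stay aligned to section boundaries, a retrieved chunk carries its own topical context (its heading) into the prompt, which fixed-size slicing would lose.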
2. Vector Retrieval Layer
- Embeddings with Gemini: Generated embeddings using Google's embedding model and stored them as 3072-dimensional vectors.
- pgvector in PostgreSQL: Used pgvector in the existing Postgres setup so embeddings live alongside the rest of the application data, avoiding the operational cost of a dedicated vector database.
- Similarity Search: Implemented cosine similarity retrieval that returns the top relevant chunks for the current user question with score thresholds to reduce noisy context injection.
- Caching: Added an in-memory cache for repeated embeddings requests to reduce redundant provider calls.
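The two pure building blocks of this layer, cosine similarity and an embedding cache, can be sketched as below. The names (`cosineSimilarity`, `cachedEmbed`, the table and column names in the SQL comment) are illustrative assumptions; in the real system Postgres computes the distance itself via pgvector's operators:

```typescript
// Cosine similarity over two embedding vectors. In production pgvector
// does this in-database; a query might look roughly like:
//   SELECT content, 1 - (embedding <=> $1) AS score
//   FROM resume_chunks
//   ORDER BY embedding <=> $1
//   LIMIT 5;
// (table/column names hypothetical)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// In-memory cache keyed by input text so repeated questions skip the
// embedding provider call entirely.
const embedCache = new Map<string, number[]>();

async function cachedEmbed(
  text: string,
  embed: (t: string) => Promise<number[]>, // provider call, injected
): Promise<number[]> {
  const hit = embedCache.get(text);
  if (hit) return hit;
  const vector = await embed(text);
  embedCache.set(text, vector);
  return vector;
}
```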
3. AI SDK Chat Pipeline
- Typed Chat Transport: Built the server route around the Vercel AI SDK's `UIMessage[]` contract, validating incoming chat history before converting messages for model execution.
- Retrieval-Augmented Prompting: For each incoming user message, the route extracts the latest user text, retrieves the most relevant resume chunks, injects them into the system prompt, and streams the answer back through `toUIMessageStreamResponse()`.
- Retry Control: Disabled default SDK retries for embeddings and generation during debugging to avoid repeated provider calls and make failure behavior predictable.
- Guardrails: Added instruction design that keeps the assistant grounded in resume evidence while encouraging answers to explain both what Sadam has done and what that experience suggests for teams or products.
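The orchestration the route performs can be sketched as two pure helpers: one that pulls the latest user question out of the chat history, and one that composes the grounded system prompt. The message shape here is a simplified stand-in for the AI SDK's `UIMessage`, and the prompt wording is illustrative; the real route would hand the result to the SDK's streaming call:

```typescript
// Simplified stand-in for the AI SDK message contract.
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
}

// Retrieval runs against the newest user question only, not the
// whole conversation history.
function latestUserText(messages: ChatMessage[]): string | undefined {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") return messages[i].text;
  }
  return undefined;
}

// Compose a grounded system prompt from the retrieved resume chunks.
// Instruction wording is a hypothetical example of the guardrails.
function buildSystemPrompt(chunks: string[]): string {
  return [
    "You are a portfolio assistant. Answer ONLY from the resume context below.",
    "Connect documented experience to likely team and product outcomes.",
    "If the context does not cover the question, say so instead of guessing.",
    "",
    "Resume context:",
    ...chunks.map((c, i) => `[${i + 1}] ${c}`),
  ].join("\n");
}
```

In the actual route, the composed prompt and validated messages would be passed to the SDK's text-streaming call and returned via `toUIMessageStreamResponse()`, so the client receives tokens as they are generated.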
4. Chat Experience in the Portfolio UI
- Floating Widget Pattern: Implemented a bottom-corner widget that opens into a dedicated chat panel, making the assistant accessible without interrupting the rest of the portfolio browsing flow.
- Contained Scroll Behavior: Refined the panel layout so scrolling remains isolated to the chat window rather than leaking to the page underneath.
- Streaming Responses: The UI shows the response as it arrives, making the interaction feel responsive and alive.
- Markdown Rendering: Added markdown support in assistant messages so lists, emphasis, inline code, and formatted answers render cleanly instead of appearing as raw text.
- Starter Questions and Error States: Included suggested prompts, loading affordances, and graceful error messaging to reduce friction for first-time users.
Architecture Highlights
Retrieval as a Grounding Layer, Not a Sidecar
The chatbot is not just "an LLM on top of a prompt." The retrieval layer is the core mechanism that keeps the assistant useful and trustworthy. Instead of asking the model to improvise from general knowledge, the system fetches the most relevant resume chunks at request time and uses them as the grounding context for generation.
That design choice matters because it improves both accuracy and usefulness:
- Accuracy improves because responses are tied to real content rather than generic inference.
- Usefulness improves because the system can answer narrow, role-specific questions from the most relevant experience slice.
Content Reuse Through Existing MDX
Rather than creating a separate admin panel or knowledge base interface, the implementation reuses the existing resume MDX as the canonical source. That keeps the content workflow simple: update the resume, regenerate embeddings, and the assistant reflects the latest experience narrative.
Small-System, Production-Minded Architecture
The architecture is intentionally compact:
- Resume content in MDX
- Semantic chunking
- Embeddings generation
- pgvector similarity retrieval
- AI SDK route for orchestration
- React chat UI for delivery
This keeps the system easy to understand and maintain while still showcasing the full lifecycle of a RAG feature in a real product environment.
Technical Deep Dive: Portfolio-Native RAG Flow
The request lifecycle is straightforward but purposeful:
- A visitor submits a question in the chat panel.
- The latest user message is extracted from the AI SDK `UIMessage[]` payload.
- The retrieval layer embeds the query and performs cosine similarity search against resume chunk embeddings stored in PostgreSQL.
- The top relevant chunks are assembled into a context block with relevance hints.
- The chat route composes a system prompt that instructs the model to answer only from that context and to connect the documented experience to potential team or product outcomes.
- Gemini generates the response and the AI SDK streams it back to the client.
- The client renders the answer with markdown support inside the chat panel.
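The "context block with relevance hints" step in this lifecycle can be sketched as a small pure function. The `ScoredChunk` shape, the `0.5` threshold, and the hint format are illustrative assumptions about how scored retrieval results might be filtered and ordered:

```typescript
// A retrieved chunk with its similarity score, as the retrieval layer
// might return it (shape and field names are hypothetical).
interface ScoredChunk {
  heading: string;
  content: string;
  score: number; // cosine similarity, higher is more relevant
}

// Assemble retrieved chunks into a context block: drop low-relevance
// noise below a threshold, order by relevance, and annotate each chunk
// with a hint the model can weigh.
function buildContextBlock(chunks: ScoredChunk[], minScore = 0.5): string {
  return chunks
    .filter((c) => c.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .map((c) => `### ${c.heading} (relevance: ${c.score.toFixed(2)})\n${c.content}`)
    .join("\n\n");
}
```

The threshold is the mechanism behind "score thresholds to reduce noisy context injection": a weakly related chunk is more likely to mislead the model than to help it.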
The important product insight is that the chatbot is not only answering factual questions. It is also helping visitors interpret relevance. A static resume says what happened. This assistant helps explain why it matters.
Key Challenges & Solutions
- Challenge: Static portfolio content is difficult to query conversationally. Approach: Built a retrieval-first assistant over resume content so visitors can ask direct questions in natural language. Why: This reduces the cognitive load of navigating a portfolio and increases the chances that the right strengths are discovered quickly.
- Challenge: LLM answers can become generic or drift beyond the source material. Approach: Retrieved only the most relevant resume chunks and added strict prompt instructions to stay grounded in documented experience. Why: A portfolio assistant needs credibility more than creativity. Grounding matters more than flourish.
- Challenge: Chat UI quality often breaks the illusion of product polish. Approach: Added streaming states, starter prompts, markdown rendering, contained scrolling, and tighter panel behavior. Why: If the assistant feels clunky, users treat it like a demo. If it feels polished, they use it as part of the portfolio experience.
- Challenge: RAG systems can introduce unnecessary infrastructure complexity. Approach: Used Postgres + pgvector and reused existing MDX content rather than introducing separate CMS and vector infrastructure. Why: The goal was to demonstrate product-quality AI integration, not accumulate avoidable system overhead.
Outcomes
- More Queryable Portfolio Experience: Visitors can now interrogate the portfolio directly instead of manually assembling context from multiple pages.
- Better Experience-to-Outcome Translation: Responses explain not just what Sadam has built, but how that background can support product delivery, engineering quality, and cross-functional execution.
- Real Product Showcase: The chatbot demonstrates applied AI engineering inside the portfolio itself, making the site a live example of the kind of features Sadam can build.
- Grounded Responses: Resume-based retrieval keeps the system anchored to documented experience rather than vague LLM generalities.
Engineering Takeaways
This project reinforced a useful product principle: AI features are most convincing when they sit directly inside a real workflow. A standalone demo is easy to dismiss. An assistant embedded in a portfolio that improves discovery, interpretation, and engagement is much more compelling.
Patterns I'd reuse:
- Retrieval over existing content sources rather than creating redundant admin workflows
- pgvector as the first vector layer when Postgres is already in the stack
- AI SDK chat abstractions for building typed, streaming UI experiences quickly
- Guardrail prompts that combine evidence grounding with user-centered interpretation
What I'd improve next:
- Expand the retrieval corpus beyond the resume to include selected projects and case studies
- Add source citations or linked references in responses so visitors can jump directly to supporting content
- Introduce evaluation prompts or test fixtures to benchmark answer quality across common visitor questions
Trade-offs acknowledged:
- Chose a small, focused knowledge base (resume only) to keep grounding strong, accepting that some broader portfolio questions may not yet have enough context
- Chose pgvector over a specialized retrieval stack to keep architecture simple and integrated
- Chose prompt-level outcome framing rather than explicit scoring/ranking heuristics, which keeps the system flexible but makes response style more dependent on prompt quality