PROJECT
Lead Full Stack Developer

Preplify: AI Interview Prep

Preplify: AI Interview Prep is a production-focused software project built by Sadam Hussain using Next.js, TypeScript, Node.js, and related technologies.

A production-grade, cloud-native AI interview preparation platform built on Next.js and TypeScript, designed to serve thousands of concurrent users with real-time, personalized interview experiences. The platform integrates multiple LLM providers (OpenAI and Gemini) through a unified orchestration layer, combining GraphQL for complex data relationships with REST endpoints for high-throughput AI workflows. Deployed via Docker-based CI/CD pipelines with GitHub Actions on scalable AWS infrastructure, the system features WebSocket-driven real-time communication, Zustand-powered state management for multi-step interview flows, and a mature component library that accelerates frontend feature delivery.

Tech Stack

Next.js, TypeScript, Node.js, REST APIs, GraphQL, LLM APIs, OpenAI, Gemini, WebSockets, Zustand, Tailwind CSS, Clerk Auth, Jest, React Testing Library, Playwright, GitHub Actions, Docker, AWS

Status: Production Ready
Type: Enterprise platform
Last Updated: April 18, 2026

Architected and delivered a scalable headless Next.js + TypeScript platform for AI-driven interview preparation, supporting thousands of concurrent users across a high-traffic, production-grade environment.

Designed a multi-provider LLM orchestration layer integrating OpenAI and Gemini APIs with provider-agnostic abstractions, enabling seamless failover and model switching without client-side changes.

Architected a hybrid API layer combining GraphQL (for complex relational queries like user history, session analytics, and role-based content) with REST endpoints (for high-throughput, stateless AI inference requests), each optimized for its specific access pattern.

Implemented strict API versioning and comprehensive error handling across all service boundaries, including retry logic with exponential backoff for LLM provider rate limits and transient failures.

Engineered 30+ reusable atomic UI components with Tailwind CSS following a systematic design token approach, establishing a consistent design system that accelerated frontend feature delivery across the platform.

Built real-time mock interview communication using WebSockets with automatic reconnection, heartbeat monitoring, and graceful degradation to polling when persistent connections are unavailable.

Implemented multi-step interview state management using Zustand with middleware for session persistence, enabling users to resume interrupted practice sessions with full context restoration.

Integrated Clerk Auth for secure authentication with role-based access control, session management, and OAuth provider support across the platform.

Optimized rendering performance through strategic use of SSR for SEO-critical pages, ISR for semi-dynamic content, and client-side rendering for interactive interview flows, with code-splitting at the route level.

Established a comprehensive testing pyramid with Jest and React Testing Library for unit/integration coverage, Playwright for end-to-end critical path validation, and GitHub Actions for automated CI/CD with Docker-based deployments to AWS.

Preplify: AI Interview Prep

Overview

Preplify is a production-grade, cloud-native AI interview preparation platform designed to make career readiness more realistic, personalized, and actionable. The platform's core technical differentiator is its multi-provider LLM orchestration layer—rather than binding to a single AI provider, Preplify abstracts provider-specific APIs behind a unified interface, enabling dynamic model selection, failover, and A/B testing of different LLM backends without any client-side awareness.

By combining real-time LLM integration with an interactive simulation environment, the platform transforms traditional static study into a high-fidelity coaching experience. The system handles thousands of concurrent practice sessions, each maintaining complex multi-step state while streaming AI responses so that feedback feels immediate despite variable provider latency.

Business Context: The Future of Career Readiness

Traditional interview preparation often relies on static question banks that fail to mirror the dynamic, unpredictable nature of real conversations. Candidates practice with fixed questions and generic answers, which doesn't build the adaptive thinking that interviewers actually evaluate.

Preplify was built to solve this by providing a "living" preparation environment where every session is unique and contextually tailored.

The core problem: Static preparation creates a false sense of readiness. Candidates memorize answers instead of developing the ability to think on their feet under pressure.

Key product objectives:

  • Dynamic Contextualization: Moving beyond static lists to generate role-specific, contextual prompts in real-time based on the candidate's target role, experience level, and practice history.
  • Immediate Feedback Loops: Providing candidates with actionable, AI-driven critiques on their performance—not just "right or wrong" but structured analysis of communication clarity, technical depth, and reasoning approach.
  • Session Continuity: Ensuring that practice progress, strengths, and areas for improvement persist across sessions, building a longitudinal learning profile.
  • Enterprise-Grade Stability: Maintaining high reliability for session-critical practice flows where an interruption mid-interview would degrade the user experience significantly.

Constraints that shaped the architecture:

  • LLM APIs have unpredictable latency and rate limits, requiring resilient orchestration rather than simple request-response patterns.
  • Interview sessions are stateful and long-lived, requiring careful state management that survives page refreshes and network interruptions.
  • The platform needed to support multiple AI providers to avoid vendor lock-in and enable cost optimization across different question types.

What I Built

1. Multi-Provider LLM Orchestration Layer

  • Provider-Agnostic Abstraction: Designed a unified interface that wraps OpenAI and Gemini APIs behind a common contract, allowing the backend to route requests to different providers based on question type, latency requirements, or cost considerations—without any changes to the client.
  • Prompt Management Pipeline: Built a structured prompt engineering pipeline where system prompts, context injection (user history, role requirements), and response format instructions are composed programmatically rather than hardcoded, enabling rapid iteration on AI behavior without code deployments.
  • Resilient Request Handling: Implemented retry logic with exponential backoff, circuit breaker patterns for provider outages, and automatic failover between LLM providers when one exceeds latency thresholds or returns errors. A simplified sketch of the provider contract and fallback chain appears below.
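
To make the contract concrete, here is a minimal sketch of what a provider-agnostic abstraction with retry and fallback could look like. The names (LLMProvider, withRetry, generateWithFallback) and the backoff values are illustrative assumptions, not the production implementation.

```typescript
// Hypothetical provider contract and fallback chain; names and values are illustrative.
interface LLMRequest {
  prompt: string;
  maxTokens?: number;
}

interface LLMResponse {
  text: string;
  provider: string;
  tokensUsed?: number;
}

interface LLMProvider {
  name: string;
  generate(req: LLMRequest): Promise<LLMResponse>;
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry a single provider with exponential backoff for rate limits and transient failures.
async function withRetry(
  provider: LLMProvider,
  req: LLMRequest,
  maxAttempts = 3
): Promise<LLMResponse> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await provider.generate(req);
    } catch (err) {
      lastError = err;
      await sleep(2 ** attempt * 500); // 500ms, 1s, 2s
    }
  }
  throw lastError;
}

// Walk the provider chain: if the primary fails after retries, fall back to the next one.
async function generateWithFallback(
  providers: LLMProvider[],
  req: LLMRequest
): Promise<LLMResponse> {
  let lastError: unknown;
  for (const provider of providers) {
    try {
      return await withRetry(provider, req);
    } catch (err) {
      lastError = err; // provider exhausted its retries; try the next in the chain
    }
  }
  throw new Error(`All LLM providers failed: ${String(lastError)}`);
}
```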

2. Hybrid API Architecture (GraphQL + REST)

  • GraphQL for Relational Data: Used GraphQL for queries that traverse complex relationships—user profiles linked to session histories, performance analytics aggregated across practice areas, and role-based content filtering. GraphQL's declarative nature eliminated over-fetching in these data-heavy views.
  • REST for AI Workflows: Kept AI inference endpoints as REST services for simplicity and cacheability. These stateless endpoints handle high-throughput question generation and feedback requests where the request/response pattern is straightforward and doesn't benefit from GraphQL's query flexibility.
  • API Versioning Strategy: Implemented URL-based versioning for REST and schema evolution for GraphQL, ensuring backward compatibility as the AI capabilities expanded without breaking existing client integrations. A sketch of a versioned REST inference route appears below.
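
As a rough illustration of the REST side, the sketch below shows a URL-versioned Next.js route handler for question generation with basic input validation and upstream error mapping. The route path, payload shape, and the generateQuestion stub are assumptions, not the actual endpoint.

```typescript
// app/api/v1/questions/route.ts (hypothetical path; payload shape is illustrative)
import { NextResponse } from "next/server";

// Stand-in for the orchestration layer described above.
async function generateQuestion(input: { role: string; level?: string }): Promise<string> {
  return `Describe a challenge you faced as a ${input.role}.`;
}

export async function POST(request: Request) {
  const body = await request.json();

  // Validate input before spending tokens on an LLM call.
  if (typeof body?.role !== "string") {
    return NextResponse.json(
      { error: { code: "INVALID_REQUEST", message: "role is required" } },
      { status: 400 }
    );
  }

  try {
    const question = await generateQuestion({ role: body.role, level: body.level });
    return NextResponse.json({ version: "v1", question });
  } catch {
    // Map provider failures to a stable error contract for clients.
    return NextResponse.json(
      { error: { code: "UPSTREAM_FAILURE", message: "AI provider unavailable" } },
      { status: 502 }
    );
  }
}
```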

3. Real-Time Interactive Interview Engine

  • WebSocket Communication Layer: Built persistent WebSocket connections for mock interview sessions, enabling real-time back-and-forth between the candidate and AI interviewer. Implemented automatic reconnection with state synchronization to handle network interruptions transparently. A client-side reconnection sketch appears after this list.
  • Session State Machine: Modeled interview sessions as explicit state machines (idle → in-progress → reviewing → completed) using Zustand, with middleware that persists state to localStorage for crash recovery and cross-tab consistency.
  • Streaming AI Responses: Implemented server-sent streaming for LLM responses so candidates see the AI's feedback appearing in real-time rather than waiting for complete generation, significantly improving perceived responsiveness.
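
The sketch below shows one way a client-side WebSocket wrapper could handle heartbeats, exponential-backoff reconnection, and fallback to polling, using only standard browser APIs. The class name, message shapes, and thresholds are illustrative, not the actual implementation.

```typescript
// Browser-side sketch; URL, message shapes, and thresholds are assumptions.
type MessageHandler = (data: unknown) => void;

class InterviewSocket {
  private ws: WebSocket | null = null;
  private heartbeatTimer: ReturnType<typeof setInterval> | null = null;
  private attempts = 0;

  constructor(
    private url: string,
    private onMessage: MessageHandler,
    private onFallbackToPolling: () => void
  ) {}

  connect(): void {
    this.ws = new WebSocket(this.url);

    this.ws.onopen = () => {
      this.attempts = 0;
      // Heartbeat: ping periodically so dead connections are detected quickly.
      this.heartbeatTimer = setInterval(() => {
        this.ws?.send(JSON.stringify({ type: "ping" }));
      }, 15_000);
    };

    this.ws.onmessage = (event) => this.onMessage(JSON.parse(event.data));

    this.ws.onclose = () => {
      if (this.heartbeatTimer) clearInterval(this.heartbeatTimer);
      this.attempts += 1;
      if (this.attempts > 5) {
        // Graceful degradation: stop retrying and let the caller switch to polling.
        this.onFallbackToPolling();
        return;
      }
      // Reconnect with exponential backoff.
      setTimeout(() => this.connect(), 2 ** this.attempts * 1000);
    };
  }
}
```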

4. Component Library & Design System

  • Atomic Design Architecture: Built 30+ components following atomic design principles—atoms (buttons, inputs, badges), molecules (form fields, stat cards), organisms (interview panels, feedback displays)—with Tailwind CSS design tokens ensuring visual consistency.
  • Accessible by Default: All interactive components are built with proper ARIA attributes, keyboard navigation, and focus management, ensuring the platform is usable for candidates relying on assistive technologies. A minimal button atom illustrating both conventions appears below.
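
A minimal example of an atom in this style, assuming Tailwind utility classes and standard ARIA attributes; the variant names and classes are placeholders rather than the real design tokens.

```tsx
// Illustrative atom; variants and classes are assumptions, not the actual design system.
import type { ButtonHTMLAttributes } from "react";

type Variant = "primary" | "secondary";

interface ButtonProps extends ButtonHTMLAttributes<HTMLButtonElement> {
  variant?: Variant;
  loading?: boolean;
}

const variantClasses: Record<Variant, string> = {
  primary: "bg-indigo-600 text-white hover:bg-indigo-700",
  secondary: "bg-slate-100 text-slate-900 hover:bg-slate-200",
};

export function Button({ variant = "primary", loading = false, children, ...rest }: ButtonProps) {
  return (
    <button
      // aria-busy lets assistive technologies announce the in-flight state.
      aria-busy={loading}
      disabled={loading || rest.disabled}
      className={`rounded-md px-4 py-2 font-medium focus-visible:ring-2 ${variantClasses[variant]}`}
      {...rest}
    >
      {loading ? "Loading…" : children}
    </button>
  );
}
```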

5. Testing & Deployment Pipeline

  • Testing Pyramid: Unit tests with Jest for business logic and utility functions, integration tests with React Testing Library for component behavior, and Playwright end-to-end tests covering critical user journeys (signup → start interview → receive feedback → review history). A sketch of the Playwright critical-path test appears after this list.
  • CI/CD with GitHub Actions: Automated pipeline that runs linting, type checking, unit tests, and E2E tests on every PR. Successful merges trigger Docker image builds and deployment to AWS infrastructure.
  • Environment Parity: Docker-based local development environment mirrors production configuration, reducing "works on my machine" deployment issues.
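
A hedged sketch of the Playwright critical-path test; the routes, labels, and test IDs are assumptions standing in for the real selectors.

```typescript
// Critical-path E2E sketch; selectors and copy are illustrative assumptions.
import { test, expect } from "@playwright/test";

test("candidate can complete a practice interview", async ({ page }) => {
  // Assumes a seeded test account and a /sign-in route.
  await page.goto("/sign-in");
  await page.getByLabel("Email").fill("e2e-user@example.com");
  await page.getByLabel("Password").fill("test-password");
  await page.getByRole("button", { name: "Sign in" }).click();

  // Start a mock interview and answer the first question.
  await page.getByRole("button", { name: "Start interview" }).click();
  await expect(page.getByTestId("question-prompt")).toBeVisible();
  await page.getByRole("textbox", { name: "Your answer" }).fill("I would start by clarifying requirements.");
  await page.getByRole("button", { name: "Submit answer" }).click();

  // Feedback should stream in, and the session should appear in history.
  await expect(page.getByTestId("feedback-panel")).toBeVisible({ timeout: 30_000 });
  await page.goto("/history");
  await expect(page.getByText("Practice session")).toBeVisible();
});
```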

Architecture Highlights

AI Orchestration as a First-Class Concern

The LLM integration isn't a bolt-on feature—it's a core architectural concern with its own abstraction layer, error handling strategy, and monitoring. The orchestration layer sits between the API routes and the LLM providers, handling prompt composition, response parsing, token usage tracking, and provider health monitoring. This separation means the rest of the application treats AI capabilities as a reliable internal service rather than dealing with the complexity of external API integration directly.

Rendering Strategy by Route Purpose

Not all pages are rendered the same way. The architecture uses a deliberate per-route rendering strategy:

  • SSR for landing pages and SEO-critical content (interview guides, blog posts) where search engine indexing matters.
  • ISR with revalidation for semi-dynamic content like question category pages that change weekly but don't need real-time freshness.
  • Client-side rendering for the actual interview interface where interactivity and state management take priority over initial load time.

This approach avoids the common Next.js anti-pattern of defaulting everything to SSR, which would add unnecessary server load for highly interactive pages. A sketch of the ISR and client-only patterns appears below.
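
In Next.js terms, the per-route strategy can be expressed roughly as follows; the routes, component path, and revalidation window are hypothetical.

```tsx
// app/guides/[slug]/page.tsx (hypothetical ISR route): regenerate at most once a week.
export const revalidate = 604800; // 7 days, in seconds

// --- separate file ---

// app/interview/page.tsx (hypothetical CSR route): render the interview client-side only.
"use client";

import dynamic from "next/dynamic";

// Defer the heavy interactive component and skip server rendering entirely.
const InterviewRoom = dynamic(() => import("@/components/InterviewRoom"), {
  ssr: false,
  loading: () => <p>Preparing your interview…</p>,
});

export default function InterviewPage() {
  return <InterviewRoom />;
}
```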

Authentication & Authorization with Clerk

Clerk Auth provides the identity layer with OAuth support (Google, GitHub), session management, and JWT-based API authentication. Role-based access control gates premium features and admin functionality. The integration is designed so that authentication state is available both server-side (for SSR pages) and client-side (for interactive flows) without duplication.

State Management Philosophy

Zustand was chosen over Redux for its minimal boilerplate and natural fit with React's hook-based patterns. The state architecture separates concerns into discrete stores:

  • Session Store: Active interview state, question progress, timer state.
  • User Store: Profile, preferences, practice history.
  • UI Store: Modal states, sidebar visibility, theme preferences.

Each store operates independently with its own persistence strategy—session state uses localStorage for crash recovery, while user data syncs with the backend. A sketch of the session store with persistence appears below.
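
A minimal sketch of the session store using Zustand's persist middleware (which writes to localStorage by default); the field names, actions, and storage key are illustrative, not the production store.

```typescript
// Illustrative session store; statuses mirror the state machine described earlier.
import { create } from "zustand";
import { persist } from "zustand/middleware";

type SessionStatus = "idle" | "in-progress" | "reviewing" | "completed";

interface SessionState {
  status: SessionStatus;
  questionIndex: number;
  answers: string[];
  startSession: () => void;
  submitAnswer: (answer: string) => void;
  completeSession: () => void;
}

export const useSessionStore = create<SessionState>()(
  persist(
    (set) => ({
      status: "idle",
      questionIndex: 0,
      answers: [],
      startSession: () => set({ status: "in-progress", questionIndex: 0, answers: [] }),
      submitAnswer: (answer) =>
        set((state) => ({
          status: "reviewing",
          answers: [...state.answers, answer],
          questionIndex: state.questionIndex + 1,
        })),
      completeSession: () => set({ status: "completed" }),
    }),
    { name: "preplify-session" } // key under which the slice is persisted to localStorage
  )
);
```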

Technical Deep Dive: LLM Orchestration Pipeline

The most architecturally interesting challenge in Preplify is the LLM orchestration pipeline—making AI feel "native" rather than bolted-on.

The Problem: LLM APIs are inherently unreliable compared to traditional REST services. They have variable latency (sometimes 200ms, sometimes 8 seconds), strict rate limits, occasional downtime, and different response formats across providers. Building a production interview platform on top of this requires treating LLM integration as a distributed systems problem, not a simple API call.

The Architecture: The orchestration layer is structured as a pipeline with distinct stages:

  1. Context Assembly: Before any LLM call, the system assembles the full context—user's target role, experience level, practice history, current session state, and any previous answers in the current interview. This context is structured as a typed object, not a raw string, ensuring consistency.

  2. Prompt Composition: System prompts are templated and versioned. Each question type (behavioral, technical, system design) has its own prompt template with slots for dynamic context. This allows non-engineering team members to iterate on prompt quality by modifying templates without touching application code.

  3. Provider Routing: A routing layer selects the appropriate LLM provider based on configurable rules—question complexity, current provider health, and cost considerations. Simple follow-up questions might route to a faster, cheaper model, while initial question generation uses a more capable one.

  4. Response Handling: LLM responses are parsed, validated against expected schemas, and normalized into a provider-agnostic format before being passed to the application layer. Malformed responses trigger automatic retries with adjusted prompts.

  5. Fallback Chain: If the primary provider fails, the system automatically retries with the secondary provider using equivalent prompts. The candidate never sees provider switching—they just experience a slightly longer response time.

Why this approach: Wrapping LLM calls in a simple try/catch and hoping for the best might work for a demo, but production traffic with real users requires the same rigor applied to any critical external dependency. The pipeline approach means each concern (context, prompting, routing, error handling) can be tested and evolved independently.
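
The sketch below illustrates how stages 1–3 (context assembly, prompt composition, provider routing) can fit together; the types, templates, and routing rules are simplified assumptions, not the production pipeline.

```typescript
// Simplified pipeline sketch; templates and routing rules are illustrative assumptions.
type QuestionType = "behavioral" | "technical" | "system-design";

// Stage 1: context is assembled as a typed object, not a raw string.
interface InterviewContext {
  targetRole: string;
  experienceLevel: "junior" | "mid" | "senior";
  previousAnswers: string[];
}

// Stage 2: versioned prompt templates with slots for the typed context.
const promptTemplates: Record<QuestionType, (ctx: InterviewContext) => string> = {
  behavioral: (ctx) =>
    `You are interviewing a ${ctx.experienceLevel} ${ctx.targetRole}. ` +
    `Ask one behavioral question that builds on: ${ctx.previousAnswers.join(" | ") || "none"}.`,
  technical: (ctx) =>
    `Ask a ${ctx.experienceLevel}-level technical question for a ${ctx.targetRole}.`,
  "system-design": (ctx) =>
    `Pose a system design scenario appropriate for a ${ctx.experienceLevel} ${ctx.targetRole}.`,
};

// Stage 3: route by question type and current provider health.
function selectProvider(type: QuestionType, providerHealth: Record<string, boolean>): string {
  if (type === "behavioral" && providerHealth["gemini"]) return "gemini"; // cheaper, fast enough
  if (providerHealth["openai"]) return "openai"; // more capable default
  return "gemini"; // last-resort fallback
}

function composeRequest(type: QuestionType, ctx: InterviewContext) {
  return {
    provider: selectProvider(type, { openai: true, gemini: true }),
    prompt: promptTemplates[type](ctx),
  };
}
```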

Key Challenges & Solutions

  • Challenge: LLM response latency varied wildly (200ms to 10+ seconds), making the interview experience feel inconsistent.
    Approach: Implemented streaming responses via server-sent events so users see feedback appearing in real-time (a sketch of the streaming endpoint follows this list), plus optimistic UI patterns where the interface transitions to the "reviewing" state immediately while the AI generates its response.
    Why: Perceived performance matters more than actual latency. Streaming makes a 5-second response feel interactive rather than broken.

  • Challenge: Interview sessions are long-lived and stateful, but users frequently switch tabs, lose network connectivity, or close the browser accidentally.
    Approach: Built session state persistence using Zustand middleware that serializes to localStorage on every state change, with a reconciliation mechanism that merges persisted state with server state on reconnection.
    Why: Losing 20 minutes of interview practice to an accidental tab close would be a major UX failure. The cost of persistence (minimal localStorage writes) is negligible compared to the value of session continuity.

  • Challenge: GraphQL and REST coexist in the same application, creating potential confusion about which to use for new endpoints.
    Approach: Established a clear convention: GraphQL for reads that involve relationships or flexible field selection, REST for writes and stateless AI operations. Documented the decision boundary so it stays consistent across the team.
    Why: Hybrid API architectures become messy without clear conventions. The key is making the choice obvious for each new endpoint rather than debating it every time.

  • Challenge: Different rendering strategies (SSR, ISR, CSR) across routes created complexity in data fetching and authentication patterns.
    Approach: Created wrapper components that abstract the rendering strategy from the page content. Each page declares its data requirements, and the wrapper handles whether that data comes from server props, incremental regeneration, or client-side fetching.
    Why: Without this abstraction, every page component would need to know whether it is server-rendered or client-rendered, leaking infrastructure concerns into UI code.
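
For the streaming approach in the first challenge, a minimal route handler using a standard ReadableStream might look like the sketch below; the chunk source (streamFeedback) is a stand-in for the provider's streaming API.

```typescript
// SSE streaming sketch; the feedback generator is a placeholder for the real LLM stream.
async function* streamFeedback(answer: string): AsyncGenerator<string> {
  // In the real system, chunks would come from the LLM provider's streaming API.
  yield "Your answer covered the core trade-offs well. ";
  yield "Consider quantifying the impact of your decision.";
}

export async function POST(request: Request): Promise<Response> {
  const { answer } = await request.json();
  const encoder = new TextEncoder();

  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const chunk of streamFeedback(answer)) {
        // Server-sent events frame each chunk as a `data:` line followed by a blank line.
        controller.enqueue(encoder.encode(`data: ${JSON.stringify({ chunk })}\n\n`));
      }
      controller.enqueue(encoder.encode("data: [DONE]\n\n"));
      controller.close();
    },
  });

  return new Response(stream, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    },
  });
}
```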

Outcomes

  • Performance: Significantly improved page load times across key user journeys through strategic rendering (SSR/ISR/CSR per route) and aggressive code-splitting at the route level.
  • Development Velocity: The shared component library and design system dramatically reduced time-to-feature for new interview modes and feedback views.
  • Production Reliability: The LLM orchestration layer with failover and circuit breakers maintained consistent AI availability even during provider incidents.
  • Architectural Flexibility: The provider-agnostic AI layer enabled switching between OpenAI and Gemini models for different use cases without client-side changes, providing both cost optimization and capability flexibility.

Engineering Takeaways

Preplify highlighted the technical maturity required to build AI-assisted production products. The key insight is that the AI model is the easy part—the hard part is everything around it: the orchestration layer that handles failure gracefully, the state management that keeps multi-step sessions coherent, and the delivery pipeline that ensures reliability at scale.

Patterns I'd reuse:

  • The provider-agnostic LLM abstraction layer. Every AI-integrated product should treat provider APIs as interchangeable backends behind a stable internal interface.
  • The hybrid GraphQL + REST approach with clear conventions for when to use each. This avoids the "GraphQL for everything" trap while still leveraging its strengths for relational data.
  • Zustand with persistence middleware for long-lived user sessions. The simplicity-to-power ratio is exceptional compared to Redux for this use case.

What I'd reconsider:

  • The prompt templating system could benefit from a more structured format (like a DSL or configuration-driven approach) rather than string templates, as prompt complexity grew faster than anticipated.
  • WebSocket connections for every interview session create server-side resource pressure at scale. A future iteration might explore server-sent events for the feedback direction (server → client) while keeping WebSockets only for bidirectional communication where truly needed.

Trade-offs acknowledged:

  • Chose Zustand over Redux for developer ergonomics, accepting that Redux DevTools and its ecosystem are more mature for debugging complex state flows.
  • Chose Clerk over custom auth to accelerate development, accepting the vendor dependency and per-user pricing in exchange for not building and maintaining auth infrastructure.
  • Chose a hybrid API architecture over pure GraphQL, accepting the maintenance cost of two API paradigms in exchange for using each where it naturally fits.