Design a URL Shortener: A Complete Walkthrough
Step-by-step system design for a URL shortener like Bitly. Covers hashing algorithms, base62 encoding, database schema, caching layer, and analytics tracking.
This post applies concepts from the System Design from Zero to Hero series.
TL;DR
A URL shortener maps long URLs to short keys using base62 encoding or hash functions, backed by a key-value store with caching for high-throughput read-heavy workloads. The core challenge is generating globally unique short codes at scale without collisions, while serving redirects with single-digit millisecond latency. This post walks through the design from requirements to a production-ready architecture.
Requirements
Functional Requirements
- Shorten URL — Given a long URL, generate a unique short URL (e.g., `short.ly/a3Bc9x`).
- Redirect — When a user visits the short URL, redirect them to the original long URL.
- Custom aliases — Users can optionally provide a custom short code (e.g., `short.ly/my-brand`).
- Expiration (TTL) — URLs can have an expiration date after which they stop redirecting.
- Analytics — Track click counts, referrer, geographic location, and device type per short URL.
Non-Functional Requirements
- High availability — Redirects must work 99.99% of the time. A broken short link is a broken user experience.
- Low latency — Redirects should complete in under 10ms at p99.
- Read-heavy — The read-to-write ratio is approximately 100:1. Reads dominate.
- Scalability — Must handle billions of stored URLs and thousands of redirects per second.
- Non-guessable — Short codes should not be trivially sequential, to prevent enumeration.
Back-of-Envelope Estimation
Assume 100 million new URLs created per month and a 100:1 read-to-write ratio:
- Writes: 100M / (30 days * 24 hours * 3600 seconds) ≈ 40 writes/second
- Reads: 40 * 100 = ~4,000 redirects/second on average, with peaks around 20,000/second
- Storage per record: short code (7 bytes) + long URL (200 bytes average) + metadata (100 bytes) ≈ 300 bytes
- Storage over 5 years: 100M * 12 * 5 * 300 bytes ≈ 1.8 TB total
A 7-character base62 code gives us 62^7 ≈ 3.5 trillion possible combinations, which is more than sufficient.
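The capacity math above is easy to sanity-check in a few lines of Python, using the same assumptions (100M new URLs/month, 300 bytes/record, 5-year horizon):

```python
# Back-of-envelope check for the 7-character base62 keyspace and
# total storage, using the assumptions stated above.
combinations = 62 ** 7                   # possible 7-char codes
records = 100_000_000 * 12 * 5           # 100M/month over 5 years
storage_bytes = records * 300            # ~300 bytes per record

print(f"{combinations:,} possible codes")        # ~3.52 trillion
print(f"{records:,} records")                    # 6 billion
print(f"~{storage_bytes / 1e12:.1f} TB total")   # ~1.8 TB
```

Six billion records consume well under 0.2% of the keyspace, so collisions in a well-distributed code space stay negligible.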
High-Level Design
The system consists of the following components:
```
Client → Load Balancer → API Servers → Cache (Redis) → Database (KV Store)
                              ↓
              Analytics Service → Message Queue → Analytics DB
```
Write path: The client sends a long URL to an API server. The server generates a unique short code, stores the mapping in the database, and returns the short URL.
Read path: The client requests a short URL. The API server first checks the Redis cache. On a cache hit, it returns the redirect immediately. On a miss, it queries the database, populates the cache, and redirects the user.
Analytics path: Every redirect event is published to a message queue asynchronously. A separate analytics service consumes these events and writes them to a columnar analytics database.
Detailed Design
Short Code Generation
This is the most critical design decision. There are three main approaches:
Approach 1: MD5/SHA256 Hash + Base62 Encoding
Hash the long URL, take the first 7 characters of the base62-encoded hash. The problem is collisions: two different URLs could produce the same 7-character prefix. You must check the database for conflicts and rehash with a salt if a collision occurs.
```python
import hashlib
import base64

def generate_short_code(long_url: str) -> str:
    # Hash the URL, base64-encode the digest, then keep the first 7
    # alphanumeric characters. (Dropping '+', '/', and '=' approximates
    # a base62 alphabet.)
    digest = hashlib.md5(long_url.encode()).digest()
    encoded = base64.b64encode(digest).decode()
    return ''.join(c for c in encoded if c.isalnum())[:7]
```

Downside: Collision checking adds latency and complexity. The same URL always produces the same hash, which can be a feature (deduplication) or a bug (different users may want separate short URLs for the same long URL).
Approach 2: Auto-Increment Counter + Base62 Encoding
Use a globally unique auto-incrementing counter. Convert the counter value to base62. This guarantees zero collisions.
```python
BASE62_CHARS = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base62(num: int) -> str:
    if num == 0:
        return BASE62_CHARS[0]
    result = []
    while num > 0:
        result.append(BASE62_CHARS[num % 62])
        num //= 62
    return ''.join(reversed(result))
```

Downside: Codes are sequential and predictable. An attacker can enumerate all URLs. Mitigation: shuffle the bits of the counter before encoding.
Approach 3: Pre-Generated Key Ranges (Recommended)
A dedicated Key Generation Service (KGS) pre-generates random 7-character codes and stores them in a database with two tables: unused_keys and used_keys. When an application server needs a key, it fetches a batch (say 1,000 keys) from unused_keys and moves them to used_keys. Each app server holds its batch in memory for fast assignment.
This approach eliminates collision checking at write time, avoids sequential predictability, and scales horizontally because each server operates on its own key range.
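The app-server side of the KGS can be sketched as a small batching client. This is a minimal sketch: `fetch_batch` stands in for the RPC that atomically moves rows from `unused_keys` to `used_keys`, and the fake generator below exists only to make the example runnable.

```python
import itertools
import threading
from collections import deque

class KeyBatchClient:
    """Holds a local batch of pre-generated keys from the KGS and
    hands them out one at a time without touching the database."""

    def __init__(self, fetch_batch, batch_size: int = 1000):
        self._fetch_batch = fetch_batch    # RPC to the KGS (assumed)
        self._batch_size = batch_size
        self._pool = deque()
        self._lock = threading.Lock()      # app servers are multithreaded

    def next_key(self) -> str:
        with self._lock:
            if not self._pool:
                # Refill: the KGS atomically reserves these keys for us.
                self._pool.extend(self._fetch_batch(self._batch_size))
            return self._pool.popleft()

# Usage with an in-memory stand-in for the KGS:
_counter = itertools.count()
fake_kgs = lambda n: [f"k{next(_counter):06d}" for _ in range(n)]
client = KeyBatchClient(fake_kgs, batch_size=3)
```

Because each server drains its own reserved batch, the only coordination cost is one KGS round trip per thousand writes.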
301 vs 302 Redirects
This is a common interview discussion point:
- 301 (Permanent Redirect): The browser caches the redirect. Subsequent visits bypass your server entirely. Better for latency, but you lose analytics data for repeat visits.
- 302 (Temporary Redirect): The browser does not cache the redirect. Every visit hits your server. Slightly higher latency, but you capture every click for analytics.
Recommendation: Use 302 if analytics matter to your business. Use 301 if raw performance is the priority and you can rely on client-side analytics (like UTM parameters) instead.
Custom Aliases
When a user provides a custom alias like short.ly/my-brand, the system must check if the alias is already taken. Custom aliases bypass the code generation logic and go directly to a uniqueness check against the database. Reserve a set of blocked words and system paths (e.g., /api, /admin, /health) that cannot be used as custom aliases.
TTL and Expiration
Store an expires_at timestamp alongside each URL mapping. During redirect:
- Check if `expires_at` is set and has passed.
- If expired, return a 410 Gone response.
- A background cleanup job periodically deletes expired entries and recycles their short codes back to the key pool.
Caching Hot URLs
The read path is heavily skewed: a small percentage of URLs receive the majority of traffic (Zipf distribution). A caching layer absorbs this hot traffic. For caching strategies in depth, see Part 5: Caching Strategies.
Cache strategy:
- Cache-aside pattern: On a read, check Redis first. On a miss, query the database, then populate Redis.
- TTL on cache entries: Set a cache TTL of 24–48 hours. Hot URLs stay cached; cold URLs naturally evict.
- Cache capacity: If 20% of URLs handle 80% of traffic, caching ~20% of the dataset is sufficient. With 6 billion records at 300 bytes each, 20% = ~360 GB. A Redis cluster can handle this.
GET short.ly/a3Bc9x
→ Check Redis: key="a3Bc9x"
→ Cache HIT: return long_url, log analytics event
→ Cache MISS: query DB, SET Redis key with TTL, return long_url
Analytics Tracking
Every redirect publishes an event to a message queue (Kafka) with:
```json
{
  "short_code": "a3Bc9x",
  "timestamp": "2025-08-05T14:23:00Z",
  "referrer": "https://twitter.com",
  "user_agent": "Mozilla/5.0...",
  "ip_address": "203.0.113.42",
  "country": "US"
}
```

A stream processing consumer aggregates these events and writes them to a columnar database (ClickHouse) for dashboard queries. This keeps the redirect path fast: analytics writes are asynchronous and never block the redirect response.
Data Model
URL Mapping Table (Primary Store — DynamoDB or similar KV store)
| Column | Type | Description |
|---|---|---|
| short_code | VARCHAR(7) | Primary key, the short URL code |
| long_url | TEXT | The original URL |
| user_id | VARCHAR | Creator of the short URL |
| created_at | TIMESTAMP | Creation time |
| expires_at | TIMESTAMP | Optional expiration time |
| is_custom | BOOLEAN | Whether this is a custom alias |
For database selection guidance, see Part 4: Database Deep Dive.
Analytics Events Table (ClickHouse)
| Column | Type | Description |
|---|---|---|
| short_code | VARCHAR | Foreign key to URL mapping |
| clicked_at | DATETIME | Timestamp of the click |
| referrer | VARCHAR | HTTP referrer header |
| country | VARCHAR | Geo-IP resolved country |
| device_type | VARCHAR | mobile / desktop / tablet |
| browser | VARCHAR | Parsed from user agent |
Key Pool Tables (Used by KGS)
| Table | Column | Type |
|---|---|---|
| unused_keys | short_code | VARCHAR(7) |
| used_keys | short_code | VARCHAR(7) |
Scaling Considerations
Database sharding: Shard the URL mapping table by short code. Since KGS-issued short codes are random, traffic distributes evenly across shards, which means even simple range-based sharding on short-code prefixes works well here.
Read replicas: Given the 100:1 read-write ratio, deploy multiple read replicas. The cache absorbs most read traffic, so replicas handle cache misses.
Load balancing: Use a stateless API server design behind a load balancer. Any server can handle any request since all state lives in the database and cache.
CDN for redirects: For extremely popular URLs, a CDN edge can cache the 302 redirect response itself. The CDN checks the short_code against a cached mapping and redirects without hitting the origin. This drops latency to single-digit milliseconds globally.
Rate limiting: Apply rate limits on URL creation to prevent abuse (spam link generation). The redirect endpoint should have generous limits since blocking redirects degrades user experience.
Trade-offs and Alternatives
| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Code generation | Hash-based | Pre-generated keys | Pre-generated keys for zero collision overhead |
| Redirect type | 301 permanent | 302 temporary | 302 if analytics needed, 301 otherwise |
| Database | Relational (PostgreSQL) | KV store (DynamoDB) | KV store for simple access patterns; add analytics DB separately |
| Analytics | Synchronous writes | Async via queue | Async to keep redirect path fast |
| Cache eviction | LRU | TTL-based | TTL with LRU as fallback |
Why not a relational database? The access pattern is exclusively key-value: given a short code, return the long URL. There are no joins, no complex queries on the URL mapping itself. A KV store like DynamoDB offers single-digit millisecond reads at any scale with minimal operational overhead. If you need relational features for user management or billing, run a separate PostgreSQL instance for those concerns.
Why not use UUIDs? A UUID is 36 characters, defeating the purpose of a short URL. Even a shortened UUID (first 7 characters) has collision probability issues similar to hash truncation without the deduplication benefit.
FAQ
How do URL shorteners generate unique short codes?
Common approaches include auto-incrementing IDs converted to base62, MD5/SHA256 hash truncation with collision detection, or pre-generated key ranges distributed to application servers for zero-collision writes. The pre-generated key approach is preferred at scale because it eliminates the collision check from the write path entirely. Each application server grabs a batch of unused keys on startup and assigns them locally, making writes extremely fast.
How would you handle billions of URLs in a URL shortener?
Use database sharding by short-code hash, cache popular URLs in Redis, store analytics asynchronously via message queues, and use a CDN for redirect responses to reduce origin server load. The key insight is that URL shorteners are read-heavy, so the architecture should optimize the read path aggressively. A multi-tier caching strategy — CDN at the edge, Redis at the application layer, and read replicas at the database layer — ensures that the vast majority of redirects never touch the primary database.
What database is best for a URL shortener?
A key-value store like DynamoDB or a NoSQL database works well due to simple access patterns. For analytics and reporting features, pair it with a columnar database like ClickHouse for aggregations. The URL mapping itself is a textbook key-value workload: one key (short code) maps to one value (long URL). There is no need for relational joins or complex queries on this data. Keep the analytics concern separate in a purpose-built OLAP store.