URL Shortener · System Design Deep Dive

Hard · System Design · Distributed Systems · Caching · Event Streaming · Meta · Google · Amazon · Twitter

Build a service like bit.ly: take a long URL, return a short alias, redirect on click. The classic system design interview problem.

We use real production components: nginx, Node.js, Redis, PostgreSQL, Kafka, and ClickHouse.

Where This Is Actually Used

URL shorteners are everywhere. Most major platforms run one in-house, both as a product and as internal plumbing for analytics, sharing, and tracking.

Real services running this exact design

Service	Owner	Purpose
`t.co`	Twitter / X	Every link posted on Twitter is wrapped in `t.co` for click tracking and abuse filtering.
`bit.ly`	Bitly Inc.	The flagship commercial shortener. ~12B clicks/month. Powers branded short links for thousands of businesses.
`youtu.be`	YouTube	Short alias for `youtube.com/watch?v=…`. Used in shares, embeds, mobile apps.
`amzn.to`	Amazon	Affiliate-friendly product links. Hides long ASIN URLs in influencer posts.
`wa.me`	WhatsApp / Meta	Direct-message a phone number (e.g. `wa.me/15551234`). Powers "Click to chat" buttons everywhere.
`lnkd.in`	LinkedIn	Profile and post links in InMail, ads, mobile sharing.
`fb.me`	Meta	Facebook's short URL for posts, pages, events. Heavily used in SMS share flows.
`git.io`	GitHub (retired 2022)	Short links for GitHub URLs. Shut down because of abuse — exactly the failure mode we discuss in Reliability.
`goo.gl`	Google (sunset 2018)	Standalone shortener. Discontinued: hard to monetize on its own, Google rolled the feature into Firebase Dynamic Links.
`tinyurl.com`	Independent	The OG, launched 2002. Still alive, still serving.

Where short URLs actually show up

Use case	Why a short URL matters
Social media posts	Twitter's 280-char limit, Instagram bio (1 link), TikTok captions. Every character saved is room for commentary.
SMS marketing	160-char SMS. A long URL eats half the message. Short URL = more pitch room + better CTR.
QR codes	Shorter URL → smaller QR code with lower density → scans reliably from further away, prints smaller on a poster.
Print advertising	People type URLs from magazines, billboards, business cards. Nobody types `https://store.example.com/spring-sale-2026?utm_source=billboard&utm_id=42`.
Email marketing	Cleaner CTAs, no long URL line-wrapping, easy to swap targets without changing the link.
UTM tracking hidden	50+ char UTM parameters tucked behind a short URL. Marketers get the analytics; users see a clean link.
Affiliate links	Influencer codes (`amzn.to/3xY9`). Referral attribution survives copy-paste.
Branded short links	Companies pay for `your-brand.co/promo` instead of `bit.ly/3xy9`. The short domain is a brand asset.
Internal go-links	Engineering teams run private shorteners: `go/oncall`, `go/runbook`. Famously used at Google, Facebook, Stripe. Same architecture, internal-only.
App deep links	One short URL routes to the iOS app, the Android app, or a web fallback. Firebase Dynamic Links, Branch.io, Adjust.

The hidden value: the short URL itself is almost free to generate. What users pay for is click analytics, link management, and branded domains. Bitly's entire business is built on the analytics pipeline (the Kafka → ClickHouse layer in this design), not the encoding.

Abuse risk: shorteners are catnip for phishing because they hide the destination. Production services run URL safety scanning on creation (Google Safe Browsing, PhishTank), rate-limit aggressively per IP and per account, and many add interstitial warnings for suspicious destinations. git.io was shut down precisely because abuse exceeded what was sustainable.

Architecture · Trace a Real Request

Three flows: create a short URL, redirect on cache hit (~2 ms), redirect on cache miss (~11 ms).

[Interactive architecture diagram with a wire-level trace per flow. Layers: Edge, Application, Data, Analytics. Components: Browser (user agent), nginx (reverse proxy + load balancer), Node.js API (app server), Redis (cache), PostgreSQL (primary DB), Kafka (event stream), Worker (consumer), ClickHouse (analytics OLAP).]

Requirements

Functional

Short URL generation: generate a unique short alias for any long URL.
Redirection: visiting the short URL redirects to the original.
Custom short links: users can request their own alias (e.g. sho.rt/my-blog).
Deletion: authorized users can delete a short URL.
Update: authorized users can change the long URL behind an existing short link.
Expiry: a default TTL applies, but users can set a custom expiration. Expired URLs are deleted after 5 years even if unused so the datastore index does not grow unbounded.
Per-URL click analytics (count, geo, referrer).

Non-Functional

Availability: 99.99% (4.3 min downtime / month). The domain is part of every generated URL, so downtime breaks every link.
Scalability: horizontal scaling at every layer (LB, API, cache, DB).
Readability: short links must be easy to type. No visually ambiguous characters.
Latency: p99 redirect < 100ms.
Unpredictability: short URLs must not be guessable. Sequential IDs leak link existence (just decrement to find your neighbor's link).
Read-heavy: 1:100 write-to-read ratio. Cache must absorb most traffic.

Back-of-Envelope Estimation

Size first, design second. These numbers drive every later decision.

Assumptions

200 million new URLs per month.
Write-to-read ratio 1:100 (heavily read-dominated).
Each row uses 500 bytes on disk.
Each URL retained for 5 years unless explicitly deleted.
100 million daily active users (DAU).

Metric	Calculation	Result
Total URLs over 5 years	200M × 12 × 5	12 billion entries
Storage	12B × 500 bytes	~6 TB
Writes / sec	200M ÷ 2,628,288 sec/month	76 writes/s
Reads / sec	76 × 100 (read ratio)	7,600 redirects/s
Incoming bandwidth (writes)	76 × 500B × 8	304 Kbps
Outgoing bandwidth (reads)	7.6K × 500B × 8	30.4 Mbps
Cache memory (80/20 rule)	20% × 7.6K × 86,400 × 500B	~66 GB
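App servers (worst case: every DAU sends one request in the same second)	100M requests ÷ 64K RPS per server	~1,600 servers (an upper bound; the 7,600 redirects/s average needs far fewer)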
Short code space	58⁷ (7-char Base58)	2.2 trillion codes

Key insight: 7,600 redirects/sec is heavy but cacheable. By the 80/20 rule, ~20% of URLs absorb ~80% of traffic. Caching that hot set fits in ~66 GB and lets Redis serve most queries in < 1ms. PostgreSQL only sees the long tail.
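The table above can be reproduced in a few lines. This is a minimal sketch that only re-runs the stated assumptions; the constant and field names are illustrative, not part of the design.

```typescript
// Assumptions copied from the list above; names are illustrative.
const NEW_URLS_PER_MONTH = 200_000_000;
const READS_PER_WRITE = 100;
const ROW_BYTES = 500;
const RETENTION_YEARS = 5;
const SECONDS_PER_MONTH = 2_628_288;
const SECONDS_PER_DAY = 86_400;

function estimate() {
  const totalUrls = NEW_URLS_PER_MONTH * 12 * RETENTION_YEARS;
  const writesPerSec = Math.floor(NEW_URLS_PER_MONTH / SECONDS_PER_MONTH);
  const readsPerSec = writesPerSec * READS_PER_WRITE;
  return {
    totalUrls,
    storageTB: (totalUrls * ROW_BYTES) / 1e12,
    writesPerSec,
    readsPerSec,
    inboundKbps: (writesPerSec * ROW_BYTES * 8) / 1e3,
    outboundMbps: (readsPerSec * ROW_BYTES * 8) / 1e6,
    // 80/20 rule: cache one day's worth of the hottest 20% of requests.
    cacheGB: (0.2 * readsPerSec * SECONDS_PER_DAY * ROW_BYTES) / 1e9,
    codeSpace: 58 ** 7, // 7-char Base58 codes
  };
}

console.log(estimate());
```

Changing a single assumption (say, the read ratio) and re-running shows which downstream numbers move with it.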

System APIs

Three REST endpoints expose the service. Every endpoint requires an api_dev_key for rate-limit accounting and abuse tracking.

POST /shorten · create a short URL

POST /shorten
{
  "api_dev_key":  "...",         // user identifier (required)
  "original_url": "https://...", // long URL to shorten (required)
  "custom_alias": "my-blog",     // optional user-chosen code
  "expiry_date":  "2031-05-14"   // optional, default 5 years
}

→ 201 Created
{ "short_url": "https://sho.rt/2JjVS" }

GET /:url_key · redirect

GET /:url_key?api_dev_key=...

→ 302 Found
Location: https://example.com/article/long-path

DELETE /:url_key · remove a short URL

DELETE /:url_key
{ "api_dev_key": "..." }   // must match owner

→ 200 OK
{ "message": "URL removed" }

Update is the same as create with the same custom_alias: the API replaces the long URL behind the existing short code (subject to ownership check).

nginx · Edge Gateway

nginx is the entry point. Every request hits it first. Three jobs:

TLS termination: decrypts HTTPS once at the edge.
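Rate limiting: blocks abuse via limit_req. nginx implements this as a leaky bucket; with burst and nodelay it behaves much like the token bucket pictured in this section.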
Load balancing: spreads traffic across stateless API replicas using least_conn.

nginx limit_req · Per-IP Token Bucket

[Interactive demo: three swim lanes, one per client IP (198.51.100.7, 203.0.113.42, 192.0.2.18). Each IP has its own independent bucket of 5 tokens, refilled at 1 token every 1500ms. Bursting 9 requests from one IP drains only that IP's bucket; the other two stay full. Requests that pass the gate go to api-1.svc, api-2.svc, or api-3.svc; the rest are rejected with 429.]

How nginx limit_req works (as modeled in the demo): Each client IP gets its own bucket of 5 tokens. Every request consumes 1 token from that IP's bucket. When the bucket is empty, nginx rejects the request; the status is 429 Too Many Requests only because the config sets limit_req_status 429 (the default is 503). Other IPs are unaffected. Allowed requests are forwarded to the least-loaded API replica.
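The per-IP isolation described above can be sketched as a small in-memory limiter. This models the demo (5 tokens, 1 token per 1500ms), not nginx's internals, which use a leaky-bucket algorithm; the class name and the injected clock are assumptions made so the example stays deterministic.

```typescript
type Bucket = { tokens: number; lastRefillMs: number };

class PerKeyTokenBucket {
  private buckets = new Map<string, Bucket>();

  constructor(private capacity: number, private refillIntervalMs: number) {}

  // Returns true if the request is allowed, false if it should be rejected (429).
  allow(key: string, nowMs: number): boolean {
    const b = this.buckets.get(key) ?? { tokens: this.capacity, lastRefillMs: nowMs };
    // Credit whole tokens for the time elapsed since the last refill.
    const earned = Math.floor((nowMs - b.lastRefillMs) / this.refillIntervalMs);
    if (earned > 0) {
      b.tokens = Math.min(this.capacity, b.tokens + earned);
      b.lastRefillMs += earned * this.refillIntervalMs;
    }
    this.buckets.set(key, b);
    if (b.tokens === 0) return false;
    b.tokens -= 1;
    return true;
  }
}

const limiter = new PerKeyTokenBucket(5, 1500);
for (let i = 0; i < 9; i++) {
  console.log("198.51.100.7", limiter.allow("198.51.100.7", 0)); // 5 allowed, then rejected
}
console.log("203.0.113.42", limiter.allow("203.0.113.42", 0));   // separate bucket, still full
```

Keying the map by IP is what gives the isolation: draining one entry never touches another.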

Production config

http {
    # limit_req is a leaky-bucket limiter: 5 req/s per IP, burst up to 10
    limit_req_zone $binary_remote_addr zone=urlapi:10m rate=5r/s;

    upstream api {
        least_conn;
        server api-1.svc:3000 max_fails=2 fail_timeout=10s;
        server api-2.svc:3000 max_fails=2 fail_timeout=10s;
        server api-3.svc:3000 max_fails=2 fail_timeout=10s;
        keepalive 64;                       # idle upstream connections kept per worker
    }

    server {
        listen 443 ssl http2;
        ssl_certificate     /etc/ssl/cert.pem;
        ssl_certificate_key /etc/ssl/key.pem;

        location / {
            limit_req zone=urlapi burst=10 nodelay;
            limit_req_status 429;           # default would be 503
            proxy_http_version 1.1;         # upstream keepalive needs HTTP/1.1
            proxy_set_header Connection ""; # and a cleared Connection header
            proxy_pass http://api;
        }
    }
}

Node.js API · Base58 Encoding

The API is stateless. Any replica can serve any request. The interesting bit is how it generates short codes. Three approaches:

Approach	Verdict	Why
MD5 hash truncation	✗	Collisions need retry logic.
Random Base58 string	✗	Expensive UNIQUE check on every insert.
Counter + Base58	✓	Atomic counter, collision-free by construction.

id   = redis.INCR("url:counter")    // atomic, globally unique
code = base58_encode(id)             // collision-free

Base58 uses 58 URL-safe characters: 1-9, A-Z without I/O, a-z without l. A 64-bit ID fits in at most 11 Base58 characters. 58⁷ ≈ 2.2 trillion 7-char codes, far more than we need. Step through the encoding below:

Base58 Encoder · divmod loop

Why Base58? Like converting to binary, but with 58 symbols. Repeatedly divide by 58. Each remainder is a digit. Digits come out least-significant first, so we prepend each character. Base58 deliberately drops 0 O I l + / so a hand-typed short URL is never misread.

[Interactive step table, one row per division. Columns: step, n (input), q = n ÷ 58 (next n), remainder r, ALPHABET[r], result (prepended).]

ALPHABET · 58 chars (1-9, A-Z minus I/O, a-z minus l)

Why Base58, not Base62 or Base64? Base64 includes + and / which break URL parsing. Base62 still has visually ambiguous characters: 0 vs O, I vs l vs 1. Base58 drops 0, O, I, l, +, / so a short URL is never misread when someone types it from a printed flyer or reads it aloud.
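The divmod loop described above can be sketched as follows. This is an illustrative version that uses plain numbers, so it is only exact up to 2^53 − 1; a full 64-bit sequencer would need BigInt. The function name is an assumption.

```typescript
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

function base58Encode(id: number): string {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  if (id === 0) return ALPHABET.charAt(0);
  let n = id;
  let out = "";
  while (n > 0) {
    const r = n % 58;               // remainder = least-significant digit
    out = ALPHABET.charAt(r) + out; // digits arrive low-order first, so prepend
    n = Math.floor(n / 58);         // quotient becomes the next n
  }
  return out;
}

console.log(base58Encode(14776337)); // "2JjVS"
```

Encoding 14,776,337 yields remainders 25, 28, 42, 17, 1, which map to S, V, j, J, 2 and read back as 2JjVS after prepending.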

Sequencer lifetime: 64 bits gives 1.8 × 10¹⁹ IDs. At 2.4 billion new URLs per year, the counter takes ~7.7 billion years to overflow. We will not run out.

Unpredictability: a pure sequential counter would leak link existence (just decrement to find your neighbor's link). Fix: each app server is assigned a range of IDs and picks randomly within its range per request. The underlying counter is still monotonic and collision-free, but the generated short codes look random.
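One way to pick randomly within an assigned range without ever repeating an ID is a sparse Fisher-Yates shuffle. The sketch below is an assumption about how the per-server picker could work, not something the design specifies; the class name and the injectable random source are hypothetical.

```typescript
class RandomRangeAllocator {
  // Sparse view of a shuffled array: position -> offset, identity when absent.
  private swapped = new Map<number, number>();
  private remaining: number;

  constructor(
    private start: number,
    size: number,
    private rng: () => number = Math.random
  ) {
    this.remaining = size;
  }

  // Returns an unused ID from [start, start + size), or null when exhausted.
  next(): number | null {
    if (this.remaining === 0) return null;
    const i = Math.floor(this.rng() * this.remaining);
    const last = this.remaining - 1;
    const picked = this.swapped.get(i) ?? i;
    const lastVal = this.swapped.get(last) ?? last;
    this.swapped.set(i, lastVal); // move the tail element into the hole
    this.swapped.delete(last);
    this.remaining = last;
    return this.start + picked;
  }
}

const range = new RandomRangeAllocator(1_000_000_000, 1000);
console.log(range.next(), range.next(), range.next());
```

Memory grows only with the number of IDs handed out, and each pick is O(1). When the range is exhausted the server would ask the sequencer for a fresh one.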

ID Management · Custom URLs · Scope

Scope of the short URL generator

A 64-bit sequencer can produce numbers from 1 to 2⁶⁴ − 1. We need to constrain that range so every code is between 6 and 11 characters:

Minimum length 6: start the sequencer at 1,000,000,000 (1 billion), a round number above 58⁵ = 656,356,768, which is the smallest value that encodes to 6 Base58 chars. Anything below 58⁵ would encode to fewer than 6.
Maximum length 11: a 64-bit value uses up to log₂(2⁶⁴) / log₂(58) ≈ 10.9 Base58 digits. Round up to 11. The longest possible short URL is 11 chars.

Bits-per-digit math: base-10 packs log₂(10) ≈ 3.32 bits per digit. Base-58 packs log₂(58) ≈ 5.85 bits per digit. So a 64-bit ID takes ~20 decimal digits but only ~11 Base58 chars. That is the readability win, made tangible.

Custom short links

A user can ask for sho.rt/my-blog instead of an auto-generated alias. The flow:

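Validate format: only Base58 chars, length 6 to 11. (The running example my-blog is a placeholder: its hyphen and lowercase l are outside the Base58 alphabet, so a strict check would reject it.)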
Decode to base-10: compute the numeric ID the alias would correspond to.
Check availability: look up that ID in the database.
Reserve the ID: if free, insert the row and mark the ID as used so the auto-sequencer never re-issues it.

Used / Unused ID tracking

The sequencer generates IDs into an "unused" pool. Once an ID is allocated (either by the sequencer or by a custom alias claim), it moves to the "used" pool. This guarantees a one-to-one mapping between numeric IDs and short URLs and prevents collisions.

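// On custom-alias request (alias has already passed the format check).
// pg is assumed to be a node-postgres Pool; query() resolves to { rows, ... }.
const id = base58_decode(alias);                  // numeric ID the alias maps to
const { rows } = await pg.query(
  "SELECT 1 FROM urls WHERE id = $1 OR short_code = $2",
  [id, alias]
);
if (rows.length > 0) return { error: "alias unavailable" };

// The PRIMARY KEY and UNIQUE constraints still catch a concurrent claim
// that slips in between the SELECT and this INSERT.
await pg.query(
  "INSERT INTO urls (id, short_code, long_url, user_id) VALUES ($1, $2, $3, $4)",
  [id, alias, longUrl, userId]
);
// id is now in the "used" pool. The sequencer must skip it when it reaches that value.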

Base58 → base-10 (decoding)

Decoding the alias back into a numeric ID is straightforward positional notation: each char's index × 58^position, summed.

// "2JjVS" → 14,776,337
//
//   2  → idx  1  × 58⁴ = 11,316,496
//   J  → idx 17  × 58³ =  3,316,904
//   j  → idx 42  × 58² =    141,288
//   V  → idx 28  × 58¹ =      1,624
//   S  → idx 25  × 58⁰ =         25
//   ─────────────────────────────────
//                    sum = 14,776,337  ✓
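The format check and the positional decode above can be sketched together. As before, this illustrative version uses plain numbers and is exact only up to 2^53 − 1, so the longest 11-char codes would need BigInt; both function names are assumptions.

```typescript
const BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Format check from the custom-alias flow: Base58 chars only, length 6 to 11.
function isValidAlias(alias: string): boolean {
  if (alias.length < 6 || alias.length > 11) return false;
  for (let i = 0; i < alias.length; i++) {
    if (BASE58_ALPHABET.indexOf(alias.charAt(i)) === -1) return false;
  }
  return true;
}

function base58Decode(code: string): number {
  let n = 0;
  for (let i = 0; i < code.length; i++) {
    const idx = BASE58_ALPHABET.indexOf(code.charAt(i));
    if (idx === -1) throw new Error("invalid Base58 character: " + code.charAt(i));
    n = n * 58 + idx; // Horner's rule: same sum as idx × 58^position
    if (!Number.isSafeInteger(n)) {
      throw new RangeError("value exceeds Number.MAX_SAFE_INTEGER");
    }
  }
  return n;
}

console.log(base58Decode("2JjVS"));   // 14776337
console.log(isValidAlias("my-blog")); // false: '-' and 'l' are not Base58
```

Multiplying the running total by 58 at each step avoids computing powers explicitly and gives the same result as the table above.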

Deletion + expiry

Deletion is a soft remove first (so click analytics aren't orphaned), hard delete after a grace window. Expired short URLs are purged after 5 years even if never accessed: the datastore index would otherwise grow without bound, blowing up query latency.

-- Daily cleanup job
DELETE FROM urls
WHERE expires_at IS NOT NULL
  AND expires_at < now() - INTERVAL '5 years';

Redis · In-Memory Cache

Redis sits between the API and PostgreSQL. For our 1:100 write-to-read ratio, ~90% of redirects should hit Redis. That is the difference between a healthy system and a database on fire.

What Redis actually is

Redis is a fast in-memory data structure server, not just a KV cache. It supports strings, hashes, lists, sets, sorted sets, and pub/sub. For us it is mostly a string store: SET url:00000Q "https://...". Sub-millisecond reads because everything lives in RAM.

Sharding via hash slots

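A single Redis node handles on the order of 100K ops/sec, which covers our 7,600 redirects/s on throughput alone. The pressure comes from memory and availability: the ~66 GB hot set plus failover headroom is more than we want on one box. Redis Cluster splits the keyspace into 16,384 hash slots. Each shard owns a range of slots (contiguous in this example). The slot for any key is CRC16(key) mod 16384. Resharding moves slots without downtime.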

Redis Cluster · 16,384 Hash Slots

slot = CRC16(key) mod 16384 · each shard owns a contiguous slot range

CRC16("url:00000Q") = 3724

3724 mod 16384 = slot 3724

slot 3724 → shard-1 (range 2730–5459)

Why 16,384? Redis chose 2¹⁴ slots as a balance: enough granularity for fine-grained resharding (you can move one slot's worth of keys at a time), but small enough that the cluster bus message stays tiny (16,384 bits = 2KB per cluster heartbeat). When a shard's primary goes down, Redis Cluster's built-in failover promotes one of its replicas (the remaining primaries vote) and updates the slot map. Clients learn the new owner via MOVED redirects.
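The slot formula can be sketched directly. Redis Cluster uses the XMODEM variant of CRC16 (polynomial 0x1021, initial value 0), whose standard check value for "123456789" is 0x31C3. This sketch assumes single-byte (ASCII) keys and ignores hash tags ({...}); the function names are illustrative.

```typescript
// CRC16/XMODEM, bit by bit. Assumes every character fits in one byte.
function crc16(key: string): number {
  let crc = 0;
  for (let i = 0; i < key.length; i++) {
    crc ^= (key.charCodeAt(i) & 0xff) << 8;
    for (let bit = 0; bit < 8; bit++) {
      crc = (crc & 0x8000) !== 0
        ? ((crc << 1) ^ 0x1021) & 0xffff
        : (crc << 1) & 0xffff;
    }
  }
  return crc;
}

function hashSlot(key: string): number {
  return crc16(key) % 16384;
}

console.log(crc16("123456789").toString(16)); // "31c3"
console.log(hashSlot("url:2JjVS"));
```

A real client library builds a slot-to-node table from the cluster's slot map and refreshes it whenever it receives a MOVED reply.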

Persistence: RDB vs AOF

Redis can survive restarts via two persistence modes (you can combine them):

Mode	How	Trade-off
RDB (snapshots)	Periodic `fork()` writes entire dataset to disk.	Fast restart. Lose up to N minutes on crash.
AOF (append log)	Append every write command to a log. Replay on restart.	At most about 1 second of data loss with `appendfsync everysec`. Slower restart.

Failover via Sentinel

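Each primary has 1 to 2 replicas. In a non-clustered deployment, Redis Sentinel monitors primaries: if a quorum agrees a primary is dead, Sentinel promotes a replica and updates clients. Redis Cluster has the equivalent failover built in and does not use Sentinel. Either way, failover happens in seconds.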

Cache code (cache-aside pattern)

// Assumes an ioredis-style client (redis) and a node-postgres Pool (pg).
async function getLongUrl(code: string): Promise<string | null> {
  const cached = await redis.get(`url:${code}`);
  if (cached !== null) return cached;                       // HIT: ~90% of traffic

  const { rows } = await pg.query(
    "SELECT long_url FROM urls WHERE short_code = $1",
    [code]
  );
  if (rows.length === 0) return null;                       // unknown code

  await redis.setex(`url:${code}`, 3600, rows[0].long_url); // populate for 1h
  return rows[0].long_url;
}

Cache stampede: when a hot key expires, thousands of requests miss simultaneously and hammer PostgreSQL. Fixes: jittered TTLs, single-flight locking, probabilistic early refresh.
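Two of the fixes named above, single-flight and jittered TTLs, can be sketched in a few lines. This is an in-process version under stated assumptions: it only deduplicates within one API replica (a Redis lock would be needed across replicas), and all names are hypothetical.

```typescript
const inFlight = new Map<string, Promise<string | null>>();

// Concurrent callers for the same key share one load instead of each hitting the DB.
function singleFlight(
  key: string,
  load: () => Promise<string | null>
): Promise<string | null> {
  const existing = inFlight.get(key);
  if (existing) return existing;
  const p = load().finally(() => inFlight.delete(key));
  inFlight.set(key, p);
  return p;
}

// TTL drawn uniformly from [base - spread, base + spread] so keys do not expire in lockstep.
function jitteredTtl(
  baseSec: number,
  spreadSec: number,
  rng: () => number = Math.random
): number {
  return Math.round(baseSec - spreadSec + rng() * 2 * spreadSec);
}

let dbCalls = 0;
const loadFromDb = async (): Promise<string | null> => {
  dbCalls++;
  return "https://example.com/article/long-path";
};
const a = singleFlight("url:2JjVS", loadFromDb);
const b = singleFlight("url:2JjVS", loadFromDb);
console.log(a === b, dbCalls, jitteredTtl(3600, 600));
```

In the cache-aside function, the PostgreSQL lookup would be wrapped in singleFlight and the fixed 3600 would be replaced by jitteredTtl(3600, 600).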

PostgreSQL · Source of Truth

Redis is fast but volatile. PostgreSQL is durable. Every URL mapping lives here.

Schema

CREATE TABLE urls (
  id          BIGSERIAL    PRIMARY KEY,
  short_code  VARCHAR(11)  NOT NULL UNIQUE,  -- codes are 6 to 11 chars
  long_url    TEXT         NOT NULL,
  user_id     BIGINT,
  created_at  TIMESTAMPTZ  NOT NULL DEFAULT now(),
  expires_at  TIMESTAMPTZ,
  click_count BIGINT       NOT NULL DEFAULT 0
);

-- No separate CREATE INDEX is needed: the UNIQUE constraint already builds a B-tree on short_code.

B-tree index lookup

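The UNIQUE constraint creates a B-tree index. Lookups take O(log n) page reads. For a 100M-row table, that is roughly 3 to 4 page reads, ~9ms on a cold cache. The 12 billion rows estimated earlier add only a level or two, because depth grows logarithmically.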

PostgreSQL · B-tree Index Lookup

SELECT long_url FROM urls WHERE short_code = '2JjVS' · O(log n)

B-tree index on (short_code)

Level	Entries
ROOT	2NVjA · 2W5GH · 4GqPx
INTERNAL	2JJ5K · 2JjVS · 2JqM3
LEAF	2J2Aa → p12/s4 · 2JJ5K → p12/s7 · 2JV2K → p12/s9 · 2JjVS → p12/s12 · 2Jq1k → p13/s2 · 2JqM3 → p13/s5 · …

Leaf entries point to heap tuples as page/slot. The match for 2JjVS is fetched from heap page 12.

Streaming WAL Replication

Node	Role	Notes
PRIMARY	leader	accepts writes · 1 instance
REPLICA 1	read-only	~10ms lag
REPLICA 2	read-only	~12ms lag

wal_sender (primary) → wal_receiver (each replica)

How writes propagate: Each INSERT appends to the WAL (Write-Ahead Log) on the primary. A wal_sender process streams each record to every replica's wal_receiver, which replays them in order. Replicas serve SELECT traffic, scaling reads for our 1:100 write-to-read ratio.

Read replicas via streaming replication

One primary cannot absorb 7,600 reads/sec at sustained load. PostgreSQL streams every WAL record to replicas in near-real-time (~10ms lag). The API routes SELECT queries to replicas and INSERT to the primary.

Sharding when one primary is not enough: partition by hash(short_code) into N shards. Each shard handles 1/N of writes. PostgreSQL does not do this natively; use Citus or app-level sharding.

Kafka · Async Click Analytics

The redirect path must be fast. Doing UPDATE click_count++ on every redirect would create a hot row, lock contention, and tank latency. Solution: the API publishes a click event to Kafka and returns the 302 immediately. A worker pool consumes events and writes batched rollups to ClickHouse.

Partitioned log

The clicks topic has 3 partitions (production: dozens). Same key always lands in the same partition. This preserves per-URL ordering. Each partition is an immutable append-only log with its own consumer offset.

Kafka · Topic 'clicks' · 3 partitions

partition = hash(short_code) mod 3 · replication factor 3 · consumer group tracks offset per partition

API Server → produce(key=short_code, value=click_event)

[Interactive partition view: partition-0, partition-1, and partition-2, each showing its head offset, consumer offset, and lag (all 0 at start).]

Worker (consumer group: analytics) → commits offset after batch insert to ClickHouse

Why partitions? Three concurrent workers can each own one partition and run fully in parallel. Same-key messages always go to the same partition, so click events for one URL are processed in order. Replication factor 3 means each partition lives on 3 brokers; if the leader's broker dies, the controller promotes a follower. Because leader election is per partition, a broker failure only affects the partitions it was leading, which contains the blast radius.
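The key-to-partition rule can be illustrated with a stand-in hash. Kafka's Java client hashes keys with murmur2; the FNV-1a function below is a simpler substitute chosen for brevity, so the partition numbers it prints will not match a real cluster. All names are hypothetical.

```typescript
// FNV-1a, 32-bit. A stand-in for the murmur2 hash Kafka's Java client uses.
function fnv1a(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return h;
}

function partitionFor(shortCode: string, numPartitions: number): number {
  return fnv1a(shortCode) % numPartitions;
}

for (const code of ["2JjVS", "2JqM3", "2JjVS"]) {
  console.log(code, "-> partition", partitionFor(code, 3));
}
```

The first and third lines print the same partition, which is the property that preserves per-URL ordering. Note that changing the partition count remaps keys, so ordering guarantees only hold while the count is stable.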

Why not Redis pub/sub or a DB queue?

Redis pub/sub: at-most-once delivery. Worker restart loses events.
DB-as-queue: fine at low scale. Caps out around 5K msg/s.
Kafka: durable, replayable, partitioned. Millions of msg/sec.

ClickHouse · Analytics OLAP

PostgreSQL is a row store. ClickHouse is columnar. Row stores are great for SELECT * on one row, terrible at SUM(clicks) GROUP BY day over billions. Columnar reads only the columns it needs and compresses them 10 to 100 times.

Query	PostgreSQL (row)	ClickHouse (column)
`SELECT * WHERE id=42`	~1 ms	~50 ms (overhead)
`SUM(clicks) GROUP BY day` over 1B rows	Minutes	~100 ms
`INSERT 1 row`	Fast	Slow (batch only)

The worker batches click events (e.g. 1000 per insert) from Kafka into ClickHouse. Dashboards query ClickHouse, never PostgreSQL. This is the standard split: OLTP for the hot path, OLAP for analytics.
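The batching step can be sketched as a small buffer that flushes when it reaches a size threshold. The flush callback stands in for the ClickHouse insert plus Kafka offset commit; the event shape and class name are assumptions, and a production worker would also flush on a timer so a quiet topic does not hold events indefinitely.

```typescript
type ClickEvent = { shortCode: string; ts: number };

class ClickBatcher {
  private buffer: ClickEvent[] = [];

  constructor(
    private maxBatch: number,
    private flush: (batch: ClickEvent[]) => void
  ) {}

  add(event: ClickEvent): void {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxBatch) this.drain();
  }

  // Flush whatever is buffered; also called on shutdown or by a timer.
  drain(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flush(batch);
  }
}

const batcher = new ClickBatcher(1000, (batch) => {
  console.log(`would insert ${batch.length} rows into ClickHouse`);
});
for (let i = 0; i < 2500; i++) batcher.add({ shortCode: "2JjVS", ts: i });
batcher.drain(); // flush the remaining 500
```

Committing the Kafka offset only after the flush succeeds gives at-least-once delivery, which is why the inserts need to tolerate duplicates.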

Reliability · What Goes Wrong at Scale

At scale, failure is not the exception. It is a daily event. With 10,000 disks across the fleet and an MTTF of 10 years per disk, expect ~3 disk failures every single day (10,000 ÷ 3,650 days ≈ 2.7). Reliable systems do not try to eliminate failure. They survive it.

Hardware faults vs systematic faults

Type	Behavior	How we handle it
Hardware	Usually independent. One disk fails, others do not.	Replicas (PostgreSQL streaming, Kafka RF=3, Redis Sentinel).
Systematic	Correlated. One bug crashes every replica at once.	Far more dangerous. Canary deploys, feature flags, gradual rollouts.

Three real failure modes

Scenario 1: Redis dies entirely. Every redirect hits PostgreSQL: the full 7,600 reads/sec instead of the ~760 that normally miss the cache at a 90% hit rate. Connection pool fills. Timeouts cascade backward. nginx returns 504s.

Mitigation: circuit-breaker on PostgreSQL calls. Return a stale copy when latency exceeds the threshold (for example from a small in-process cache, since Redis itself is gone). Better degraded than down.
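A circuit breaker for this mitigation can be sketched as a two-state machine. The thresholds, the class name, and the injected clock are assumptions for illustration; production libraries usually add an explicit half-open state and latency-based tripping.

```typescript
type BreakerState = "closed" | "open";

class CircuitBreaker {
  private failures = 0;
  private openedAtMs = 0;
  private state: BreakerState = "closed";

  constructor(private maxFailures: number, private coolDownMs: number) {}

  // False means: skip PostgreSQL and serve the degraded response instead.
  canRequest(nowMs: number): boolean {
    if (this.state === "open" && nowMs - this.openedAtMs >= this.coolDownMs) {
      // Cool-down elapsed: let traffic through again, but stay one failure from re-opening.
      this.state = "closed";
      this.failures = this.maxFailures - 1;
    }
    return this.state === "closed";
  }

  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) {
      this.state = "open";
      this.openedAtMs = nowMs;
    }
  }
}

const breaker = new CircuitBreaker(3, 1000);
for (let i = 0; i < 3; i++) breaker.recordFailure(0); // three timeouts in a row
console.log(breaker.canRequest(10));   // false: breaker is open
console.log(breaker.canRequest(1000)); // true: cool-down elapsed
```

While the breaker is open the API fails fast, which keeps the connection pool from filling and stops timeouts from cascading back to nginx.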

Scenario 2: A bad deploy ships to all API replicas at once. Knight Capital's 2012 incident shows the stakes: one botched rollout lost $440M in 45 minutes.

Mitigation: canary deploys. Ship to 1 replica, watch error rate for 5 minutes, then 10%, then 50%, then 100%. Make rollback instant.

Scenario 3: Cache stampede. A celebrity tweets a short URL right as the cache key expires. 50K requests miss simultaneously and stampede PostgreSQL.

Mitigation: jitter TTLs (3600 ± 600s) so keys do not expire in lockstep. Use a Redis lock so only one request refills.

Reliability principles

Replicate everything. PostgreSQL streaming, Kafka RF=3, Redis Sentinel.
GSLB (Global Server Load Balancing). Route traffic to the nearest healthy data center. On regional failure, GSLB drains the bad region automatically. Eventual consistency between regions is fine here because a brand-new short URL is typically not accessed for a few seconds, leaving time for replication.
Daily backups to durable object storage (e.g. S3). Worst-case recovery loses URLs created since the last snapshot. RPO measured in hours, not days.
Idempotent writes. A worker may consume the same Kafka message twice. Inserts must tolerate that.
Async over sync. Click tracking is async. A Kafka outage does not break redirects.
Health checks. nginx removes unhealthy API replicas via max_fails.
Rate limit per api_dev_key. Fixed-window counter is enough at this scale. Protects against DoS and accidental loops.
Observability. Trace every request, alert on p99, dashboard error rates per endpoint.
Blast-radius limits. One bad replica should not take down the cluster.
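The fixed-window counter mentioned under "Rate limit per api_dev_key" can be sketched as below. This is an in-memory, single-process illustration; the limit, window size, and class name are assumptions, and a multi-replica deployment would keep the counters in a shared store.

```typescript
class FixedWindowLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  // limit is assumed to be at least 1.
  constructor(private limit: number, private windowMs: number) {}

  allow(apiDevKey: string, nowMs: number): boolean {
    // All timestamps in the same window share one counter.
    const windowStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    const entry = this.counts.get(apiDevKey);
    if (!entry || entry.windowStart !== windowStart) {
      this.counts.set(apiDevKey, { windowStart, count: 1 }); // new window, reset
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

const perKey = new FixedWindowLimiter(100, 60_000); // 100 creates per minute per key
console.log(perKey.allow("dev-key-1", 0));
```

The known weakness is the window boundary: a client can send a full quota at the end of one window and another at the start of the next. A sliding window or token bucket smooths that out if it becomes a problem.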

Requirements compliance summary

Requirement	How we meet it
Availability	Replication at every layer, GSLB across regions, daily S3 backups, rate limiters at the edge.
Scalability	Stateless API replicas, Redis Cluster hash slots, PostgreSQL read replicas + hash-sharding when needed, Kafka partitioning.
Readability	Base58 removes 0/O/I/l ambiguity and avoids URL-unsafe +/.
Latency	Redis cache absorbs ~90% of traffic at <1ms. PostgreSQL B-tree at ~9ms on miss. Encoding is O(1).
Unpredictability	Random ID selection within each server's assigned range. Optional salt before encoding hardens against guessing.

Interview Follow-ups

Q1: How do you prevent collisions in Base58 codes?

By construction. We use a counter (INCR), not a hash. Atomic INCR is globally unique. Hash/random needs retry-on-collision, which does not scale.

Q2: 301 or 302 for redirects?

302. A 301 is permanent: browsers and CDNs may cache it indefinitely and never hit our service again, so we lose analytics. A 302 is temporary and not cached by default, so every click reaches us.

Q3: How do you handle 10× more traffic?

nginx: add LB instances behind a TCP load balancer.
API: already stateless. Just add replicas.
Redis: add shards (move hash slots).
PostgreSQL: add read replicas, then shard by hash(short_code).
Kafka: increase partition count, add consumers.

Q4: What if Kafka is down?

Redirects still work. Kafka is off the hot path. The producer buffers events in memory. If Kafka stays down past the buffer limit, we drop events. Click counts go stale, the core service stays up. That is async's whole point.

Q5: A user generates 100K URLs in a minute. What happens?

nginx limit_req caps them at 5 req/s per IP. They get 429s. For distributed attackers, add per-user-account limits at the API level and CAPTCHA on signup.

Q6: Support custom aliases like `sho.rt/my-blog`?

Same table, just INSERT the user-supplied short_code. The UNIQUE constraint protects against collisions. Reserve a namespace prefix (e.g. generated codes start with a digit) so user aliases cannot collide with the counter sequence.

Q7: What is the trickiest failure mode?

Silent data corruption. If you write to Redis first and PostgreSQL fails, the user sees a working short URL that was never persisted. Always write to PostgreSQL first (source of truth), then populate cache. Never the other way around.
