System Design Interview Questions
System design interviews test your ability to think through the architecture of large-scale distributed systems — the kind that serve millions of users. Unlike coding rounds, there is no single correct answer. Interviewers are evaluating your structured thinking, your awareness of trade-offs, and whether you can drive a conversation toward a coherent design.
The most common mistake candidates make is jumping straight into components before understanding requirements. Before drawing a single box, clarify scope, estimate scale, and define constraints. Everything flows from that.
Key Concepts
Requirements & Scope Clarification
Always start by asking questions. What are the functional requirements? What is the scale (requests per second, data volume, user count)? What are the non-functional requirements — availability, consistency, latency targets? Spending 3-5 minutes here prevents 20 minutes of designing the wrong system.
Capacity Estimation
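Back-of-envelope calculations signal that you think about scale concretely. Estimate QPS (queries per second), storage requirements, and bandwidth. Rule of thumb: 1M DAU × 10 requests/day is 10M requests/day, or roughly 100 QPS on average (10M / 86,400 s ≈ 116); budget a multiple of that for peak traffic. A tweet-sized object of about 300 bytes (280 bytes of text plus a little metadata) × 500M tweets/day ≈ 150 GB/day of new data to write and store.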
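The estimates above can be reproduced in a few lines. This is a minimal sketch: the function names are made up for illustration, traffic is assumed to be spread evenly over the day, and the 300-byte object size is an assumption (280 bytes of text plus roughly 20 bytes of metadata).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(daily_active_users: int, requests_per_user_per_day: float) -> float:
    """Average queries per second, assuming traffic is spread evenly over the day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def daily_storage_gb(objects_per_day: int, bytes_per_object: int) -> float:
    """Raw daily write volume in decimal gigabytes (1 GB = 10**9 bytes)."""
    return objects_per_day * bytes_per_object / 10**9


qps = average_qps(1_000_000, 10)
storage = daily_storage_gb(500_000_000, 300)  # 300 bytes/object is an assumption

print(f"average QPS: {qps:.0f}")         # average QPS: 116
print(f"storage/day: {storage:.0f} GB")  # storage/day: 150 GB
```

In an interview, rounding 116 down to "about 100 QPS" is fine; the order of magnitude is what drives the design.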
Load Balancing & Horizontal Scaling
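A single server has a ceiling. Load balancers distribute traffic across a fleet of stateless application servers. Horizontal scaling (more machines) usually beats vertical scaling (a bigger machine) on availability, because no single machine failure takes the service down, and on cost once you outgrow commodity hardware. Consistent hashing limits cache misses and data movement when nodes are added or removed: only about 1/N of the keys change owner, rather than nearly all of them as with a plain hash-mod-N scheme.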
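To make the consistent hashing point concrete, here is a minimal sketch of a hash ring with virtual nodes. The class name `HashRing`, the node names, and the choice of 100 virtual nodes per server are illustrative assumptions, not a reference implementation.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring; each node owns several points on the ring."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []  # sorted hash positions on the ring
        self._owners = {}     # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        # Use the first 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._owners[pos] = node
            bisect.insort(self._positions, pos)

    def remove(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def lookup(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # Walk clockwise to the first position after the key, wrapping around.
        idx = bisect.bisect_right(self._positions, self._hash(key))
        return self._owners[self._positions[idx % len(self._positions)]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.add("cache-d")
moved = [k for k in keys if ring.lookup(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new node")
```

Every key that moves lands on the newly added node; keys owned by the other servers stay put, which is why their caches remain warm.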
Databases: SQL vs NoSQL
SQL databases (Postgres, MySQL) give you ACID transactions and relational joins, which makes them ideal for complex queries and financial data. NoSQL stores (Cassandra, DynamoDB, MongoDB) typically trade some consistency guarantees and join support for horizontal write scalability and flexible schemas. The choice depends on your access patterns, not a blanket preference.
Caching
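A cache sits in front of your database and absorbs read traffic. Redis and Memcached are the standard choices. Cache-aside (read from cache; on a miss, read the DB and populate the cache) is the most common pattern. Know how entries stay fresh: TTL expiry and explicit invalidation on write, plus the write policies write-through (update cache and database together) and write-back (update the cache first, flush to the database later). Cache stampede, where many requests miss the same hot key at once and all hit the database, is a failure mode worth mentioning.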
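The cache-aside pattern with a TTL can be sketched as below. A plain dictionary stands in for Redis or Memcached, and the names `CacheAside` and `load_from_db` are hypothetical; a production version would also need eviction and stampede protection.

```python
import time


class CacheAside:
    """Cache-aside with TTL expiry; a dict stands in for Redis/Memcached."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db  # callable: key -> value (the "database")
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]        # cache hit
        value = self._load(key)    # miss or expired: read the database
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call after writing to the database so the next read is fresh."""
        self._entries.pop(key, None)


db = {"user:1": "Ada"}
db_reads = []


def load_from_db(key):
    db_reads.append(key)
    return db[key]


cache = CacheAside(load_from_db, ttl_seconds=30)
cache.get("user:1")
cache.get("user:1")
print(len(db_reads))  # 1: the second read was served from the cache
```

Until the TTL expires or `invalidate` is called, readers can see stale data; that window is the price of offloading the database.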
Message Queues & Async Processing
Decoupling producers from consumers with a queue (Kafka, SQS, RabbitMQ) improves resilience and lets you absorb traffic spikes. Use queues for work that doesn't need a synchronous response: sending emails, processing uploads, fan-out notifications. Kafka's log-based model also enables event sourcing and replay.
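The decoupling idea can be illustrated in-process with the standard library. This is only a stand-in: `queue.Queue` is neither durable nor distributed the way Kafka, SQS, or RabbitMQ are, and the email "sending" here is a hypothetical placeholder.

```python
import queue
import threading


def run_worker(jobs: queue.Queue, handled: list) -> None:
    """Consumer: pull jobs until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        handled.append(f"sent email to {job}")  # placeholder for slow work
        jobs.task_done()


jobs = queue.Queue(maxsize=100)  # bounded: a full queue applies backpressure
handled = []
worker = threading.Thread(target=run_worker, args=(jobs, handled))
worker.start()

# Producer: put() returns as soon as the job is enqueued
# (it blocks only if the queue is full).
for address in ["a@example.com", "b@example.com", "c@example.com"]:
    jobs.put(address)

jobs.put(None)  # tell the worker to stop
jobs.join()     # wait until every queued item has been processed
worker.join()
print(handled)
```

The producer never waits for an email to be sent; if the worker slows down, jobs accumulate in the queue instead of failing the request.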
Sample Interview Questions
How would you design a URL shortener like bit.ly?
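Answer: The core components are an API service that generates a short code (base-62 encoding of a counter, or a truncated hash with collision checks), a key-value store mapping short code → long URL, and a redirect service that looks up the code (cache first, then database) and returns a 301 or 302. A 301 is permanent and cacheable by browsers, which cuts load but hides repeat clicks from analytics; a 302 sends every click through the service. For scale: cache popular URLs in Redis, use a CDN for the redirect layer, and partition the key-value store by short code prefix.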
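The "base-62 encode a counter" step might look like the sketch below. The alphabet ordering (digits, then lowercase, then uppercase) is an arbitrary assumption; any fixed ordering of the 62 characters works as long as encoder and decoder agree.

```python
import string

# 62 characters: 0-9, a-z, A-Z (the ordering is an arbitrary choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short code."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(encode_base62(125))   # 21
print(decode_base62("21"))  # 125
print(BASE ** 6)            # 56800235584 distinct codes with 6 characters
```

Six characters already give about 56.8 billion codes, which is why short links can stay short. Sequential counters produce guessable codes, so a real design may shuffle or offset the counter first.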
Why it matters: This is a common warm-up problem. It tests whether you can scope a simple system end-to-end and then reason about where it breaks under load.
What is the CAP theorem and what does it mean in practice?
Answer: The CAP theorem says a distributed system can guarantee at most two of three properties: Consistency (every read sees the most recent write), Availability (every request to a working node gets a response), and Partition Tolerance (the system keeps working despite network splits). Since network partitions are unavoidable, the real choice during a partition is between CP (consistent but may reject requests) and AP (available but may return stale data).
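Why it matters: CAP sets up the trade-off conversation between stores that favour consistency (a SQL database with synchronous replication is usually treated as CP) and eventual-consistency NoSQL stores (usually AP). These labels are simplifications, since many systems let you tune the behaviour per query. Interviewers want to see you apply this to a concrete design decision.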
How does a CDN improve performance and what are its limits?
Answer: A CDN caches static assets (images, JS, CSS) at edge nodes geographically close to users, reducing latency and offloading origin traffic. Limits: CDNs only help with cacheable content; dynamic, personalised responses must still hit the origin. Cache invalidation across hundreds of edge nodes takes time, which matters for time-sensitive content.
Why it matters: Shows you understand caching at the infrastructure level and can identify where it doesn't apply.