Load balancing distributes incoming requests across multiple servers — enabling horizontal scaling, improving availability, and preventing any single server from being overwhelmed. It's a fundamental component of scalable, reliable systems.
What a load balancer does
A LOAD BALANCER sits in front of multiple servers and distributes requests among them:
Client → LOAD BALANCER → ┬→ Server 1
├→ Server 2
└→ Server 3
→ spreads load → no single server is overwhelmed (enables HORIZONTAL scaling)
→ routes around FAILED servers (health checks) → high AVAILABILITY
→ a single entry point for clients
Why load balancing matters
✓ SCALABILITY → distributes load across many servers → handle more traffic by adding servers
✓ AVAILABILITY → if a server fails, route to healthy ones → no single point of failure
✓ PERFORMANCE → prevents any server from being overloaded → consistent response times
✓ Enables ZERO-DOWNTIME deploys, maintenance (take servers out of rotation)
→ a foundation of scalable, highly-available systems.
