Load Balancing: Why It Exists and When You Need It. (1/n)
Definition: Load balancing is the process of distributing incoming workload across multiple servers to improve performance, reliability, and scalability.

With the textbook definition out of the way, let’s dive deeper:
Let’s say you start with a single server running your backend service. That works fine while you have a small number of users (say 1 to 1,000), but once you have many thousands of them, the server slows down and gets overwhelmed by the number of requests it receives.
One possible solution is vertical scaling: upgrading that one machine with more CPU, memory, or storage.
Vertical Scaling
But vertical scaling has a few disadvantages:
Hardware limits: there’s a maximum size you can scale to
Single point of failure: if the machine goes down, everything goes down
Often more expensive at higher levels
May require downtime during upgrades
Of these, the single point of failure is the main one we are trying to avoid by using horizontal scaling.
Horizontal Scaling
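Horizontal scaling means adding more servers instead of making one server bigger. Now we need to distribute requests among those servers, and that’s where load balancing comes into the picture. Go back and read the definition again; it should make sense now.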
Let’s take an example. Assume you have 3 servers running a login service for your users. The role of the load balancer is to distribute all incoming traffic among these 3 servers evenly (ideally), so that no single server is overwhelmed. This also removes the backend single point of failure: if one server goes down, traffic can be redirected to the healthy instances (assuming the load balancer itself is highly available).
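To make the 3-server example concrete, here is a minimal sketch in Python of round-robin distribution that skips servers marked as down. The class name, the method names, and the server names (login-1 to login-3) are all made up for illustration; a real load balancer would detect failures through health checks rather than a manual mark_down call.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy round-robin balancer that skips servers marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)  # endless iterator over the servers

    def mark_down(self, server):
        # Hypothetical hook: stands in for a failed health check.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Look at each server at most once per call, skipping unhealthy ones.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["login-1", "login-2", "login-3"])
first_pass = [balancer.next_server() for _ in range(6)]

balancer.mark_down("login-2")  # simulate one server going down
after_failure = [balancer.next_server() for _ in range(4)]

print(first_pass)
print(after_failure)
```

With all three servers up, requests rotate through them in order. Once login-2 is marked down, the remaining traffic alternates between login-1 and login-3 only.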
Breaking the misconception:
Algorithms to implement load balancing
There are several core load balancing algorithms, such as round robin, weighted round robin, least connections, least response time, hash-based, and consistent hashing (the most important), which we can dive into in upcoming blogs.
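As a small preview of consistent hashing, here is a rough sketch in Python. It is only one of many ways to build a hash ring: the class name, the use of MD5 as the hash function, the 100 virtual points per server, and the server and user names are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key):
    # MD5 is used only as a stable, well-spread hash here, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Toy hash ring: each server owns several points on the ring."""

    def __init__(self, servers, replicas=100):
        self.replicas = replicas  # virtual points per server (assumed value)
        self._points = []         # sorted hash positions on the ring
        self._owner = {}          # hash position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            del self._owner[point]
            self._points.remove(point)

    def lookup(self, key):
        if not self._points:
            raise RuntimeError("ring is empty")
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing(["server-a", "server-b", "server-c"])
users = [f"user-{n}" for n in range(1000)]

before = {u: ring.lookup(u) for u in users}
ring.remove("server-b")
after = {u: ring.lookup(u) for u in users}

moved = [u for u in users if before[u] != after[u]]
print(f"{len(moved)} of {len(users)} users changed servers")
```

The point of the sketch: when server-b is removed, only the users that were mapped to server-b get reassigned. Everyone else stays on the same server, which a plain hash(key) % number_of_servers scheme cannot promise.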
Flip side of the coin
A few problems can arise when you introduce a load balancer:
Added System Complexity
Single Point of Failure
Increased Latency
Session Management Problems
Load Balancer as SPOF (Single Point of Failure)
What if the load balancer itself dies? Solution: redundancy.
We can keep a copy of the load balancer on standby that takes over if the current one breaks down: an active load balancer paired with one or more standbys.
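The active/standby idea can be sketched in a few lines of Python. This is a simulation only: the LoadBalancerNode and FailoverGroup classes, the alive flag, and the node names are hypothetical, and a real setup would rely on heartbeats between machines rather than an in-process exception.

```python
class LoadBalancerNode:
    """Stand-in for one load balancer instance."""

    def __init__(self, name):
        self.name = name
        self.alive = True  # flipped to False to simulate a crash

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverGroup:
    """One active node plus standbys, promoted in order when the active fails."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.active_index = 0

    @property
    def active(self):
        return self.nodes[self.active_index]

    def handle(self, request):
        # Try every node at most once before giving up.
        for _ in range(len(self.nodes)):
            try:
                return self.active.handle(request)
            except ConnectionError:
                # The active node failed, so promote the next standby.
                self.active_index = (self.active_index + 1) % len(self.nodes)
        raise RuntimeError("all load balancers are down")


group = FailoverGroup([
    LoadBalancerNode("lb-active"),
    LoadBalancerNode("lb-standby-1"),
    LoadBalancerNode("lb-standby-2"),
])

r1 = group.handle("GET /login")
group.nodes[0].alive = False  # simulate the active load balancer dying
r2 = group.handle("GET /login")

print(r1)
print(r2)
```

The first request is served by lb-active. After it is marked as dead, the same call is served by lb-standby-1 without the caller having to change anything, which is the whole purpose of keeping standbys around.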
My thoughts on whether it is worth it despite its shortfalls
Load balancing trades a small increase in complexity and latency for significantly higher availability, fault tolerance, and scalability. In most production systems, downtime is far more costly than the overhead introduced.
In short:
If availability matters → load balancing is mandatory.
If availability doesn’t matter → don’t over-engineer.
Thank you.
Upcoming
In the upcoming blog we will dive deep into:
Implementing a basic load balancer simulation
Getting in depth with the load balancing algorithms
Concept of Rate limiting