This article introduces distributed systems, the motivations behind their usage, and various aspects of scaling them.
Demystifying Distributed Systems
As Andrew Tanenbaum puts it, ‘A distributed system is a collection of independent computers that appears to its users as a single coherent system.’
Example: google.com is a distributed system: to a user it looks like a single system, even though it’s powered by huge clusters of computers.
Why Distribute a System?
Dealing with distributed systems is a pain! Compared to a single computer, they’re hard to deploy, maintain, debug, and even reason about. So why would one want to tread down this terrible path?
Always remember, systems are distributed by necessity, not by choice!
It would be wonderful to not have to deal with the complexities distributed systems bring. But today’s web-scale and Big Data needs have made them a necessary evil.
The biggest benefit of distributed systems is that they’re inherently more scalable than a single computer.
Now that we’ve talked about scalability, let’s discuss it in a little more detail.
Suppose you’ve been tasked with increasing the request-handling capacity of a database. How would you do it?
One possible way to do it is to upgrade the hardware it’s running on. This process is called scaling vertically, or scaling up.
The same problem can be solved by adding more computing nodes to the system. This is called scaling horizontally, or scaling out.
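To make the two approaches concrete, here is a minimal sketch in Python. The `Cluster` class, its node names, and the per-node throughput figures are all hypothetical; the point is simply that scaling up raises the capacity of each machine, while scaling out raises capacity by adding machines behind a naive round-robin balancer.

```python
from itertools import cycle

class Cluster:
    """Toy model: each node handles a fixed number of requests per second."""

    def __init__(self, node_capacity_rps, node_count):
        self.node_capacity_rps = node_capacity_rps
        self.nodes = [f"node-{i}" for i in range(node_count)]
        self._round_robin = cycle(self.nodes)

    def scale_up(self, new_node_capacity_rps):
        """Vertical scaling: give every existing machine better hardware."""
        self.node_capacity_rps = new_node_capacity_rps

    def scale_out(self, extra_nodes):
        """Horizontal scaling: add more machines to the pool."""
        start = len(self.nodes)
        self.nodes += [f"node-{i}" for i in range(start, start + extra_nodes)]
        self._round_robin = cycle(self.nodes)

    def capacity_rps(self):
        return self.node_capacity_rps * len(self.nodes)

    def route(self):
        """A naive round-robin load balancer picks the next node."""
        return next(self._round_robin)

cluster = Cluster(node_capacity_rps=1000, node_count=2)
print(cluster.capacity_rps())  # 2000
cluster.scale_up(2000)         # vertical: faster machines
print(cluster.capacity_rps())  # 4000
cluster.scale_out(3)           # horizontal: more machines
print(cluster.capacity_rps())  # 10000
```

Note that in the toy model both paths raise capacity; the trade-offs between them are what the rest of this section is about.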
Why not, then, simply scale vertically and avoid the hassles of a distributed system?
There are multiple factors which come into play here:
- There’s a technological limit to how far you can upgrade the hardware of a single computer. And even the latest hardware is insufficient to serve the needs of technology companies with moderate to high workloads.
- After a certain point, scaling horizontally proves to be more economical compared to its vertical counterpart.
- There’s no hard architectural limit on how far a system can scale horizontally: in principle you can keep adding nodes, although coordination overhead grows with cluster size.
- Horizontal scaling makes a system more fault-tolerant, and therefore more available. A cluster of, say, ten machines is naturally more fault-tolerant than a single machine.
- Horizontal scaling can also give a system lower latency. At first sight this might seem counter-intuitive, but consider: the time a network packet takes to travel across the world is physically bounded by the speed of light.
For example, the shortest possible round-trip time for a request over fibre-optic cable between New York & London is roughly 90ms. With a distributed system you can place a node in both cities, letting traffic hit whichever node is closest to it.
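The fault-tolerance point above can also be made concrete with a quick calculation. Under the simplifying assumption that machines fail independently, a cluster is unavailable only when every machine is down at once, so its availability is 1 − (1 − p)ⁿ for per-machine availability p and cluster size n. The numbers below are illustrative, not measurements:

```python
def cluster_availability(per_machine_availability, machine_count):
    """Probability that at least one machine in the cluster is up,
    assuming machines fail independently (a simplification)."""
    all_down = (1 - per_machine_availability) ** machine_count
    return 1 - all_down

# A single machine that is up 99% of the time:
print(cluster_availability(0.99, 1))   # ~0.99

# A cluster of ten such machines is down only when all ten are,
# which is vanishingly unlikely under the independence assumption.
print(cluster_availability(0.99, 10))
```

Real failures are rarely independent (a power cut or bad deploy can take down many nodes together), so treat this as an upper bound on what redundancy buys you.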
That was a short introduction to distributed systems and how to scale them. We’ll cover more advanced material on these topics in upcoming articles. Stay tuned!