What is LOAD BALANCING? ⚖️

Gaurav Sen
21 Apr 2018 · 13:50

Summary

TL;DR: The video script discusses consistent hashing, a technique crucial for scalable systems. It explains how hashing distributes requests evenly across servers, ensuring load balancing. The script highlights the problem of server addition, which can disrupt the distribution, and introduces consistent hashing as a solution to minimize the impact of such changes, maintaining system efficiency and cache utility.

Takeaways

  • 🔑 Consistent hashing is crucial for systems that need to scale to handle large volumes of requests efficiently.
  • 🖥️ A server's role is to process requests and send responses, which is essential for user satisfaction and business success.
  • 📈 When scaling up, load balancing becomes important to distribute the incoming requests evenly across multiple servers.
  • 🔄 Consistent hashing helps in distributing requests uniformly across servers without overloading any single server.
  • 🆔 The request ID, which is often generated randomly, is used to determine which server should handle a particular request by hashing.
  • 💡 If a server is added or removed, standard hashing can cause significant changes in the distribution of requests, leading to inefficiencies.
  • 🔄 Consistent hashing minimizes the impact of adding or removing servers by ensuring only a small subset of keys is remapped to different servers.
  • 📊 The 'pie chart' analogy illustrates how consistent hashing allows for a smooth transition when servers are added or removed, maintaining balance.
  • 🗃️ Caching user-specific information on the server that handles their requests can improve performance, but standard hashing can disrupt this by changing which server a user is directed to.
  • 🔑 Consistent hashing is designed to maintain a minimal change in the cache even when the system scales, preserving the efficiency of the system.
  • 🛠️ Understanding consistent hashing is vital for developers building scalable systems, as it addresses the challenges of traditional hashing methods in a distributed environment.

Q & A

  • What is consistent hashing?

    -Consistent hashing is a technique used in distributed systems to distribute a large number of keys across a set of servers in a way that minimizes the number of keys that need to be remapped when servers are added or removed.

  • Why is consistent hashing important in system design?

    -Consistent hashing is important in system design because it allows systems to scale effectively by evenly distributing load across multiple servers and minimizing reorganization when the number of servers changes.

  • What is the basic concept of hashing objects in the context of consistent hashing?

    -In the context of consistent hashing, hashing objects refers to the process of applying a hash function to a request ID or other data to determine which server should handle the request, ensuring an even distribution of load.

  • What is a server in the context of this script?

    -In this script, a server is a computer that runs a program and serves requests from clients, such as mobile devices, by processing the requests and sending back responses.

  • How does a server handle requests from clients?

    -A server handles requests from clients by receiving the requests, processing them using an algorithm, and then sending back the appropriate response, such as an image with a facial recognition result.

  • What is load balancing and why is it necessary?

    -Load balancing is the process of distributing incoming network traffic across multiple servers to ensure no single server becomes overwhelmed. It is necessary to maintain system performance and reliability as demand increases.

  • How does consistent hashing help with load balancing?

    -Consistent hashing helps with load balancing by assigning requests to servers in a way that minimizes the redistribution of requests when servers are added or removed, thus maintaining an even load across all servers.

  • What happens when a new server is added to a system using standard hashing?

    -When a new server is added to a system using standard hashing, a significant number of requests may need to be reassigned to different servers, causing a large shift in the load distribution and potentially invalidating cached data.

  • Why is it problematic to have a large number of changes in the assignment of requests to servers?

    -A large number of changes in the assignment of requests to servers can be problematic because it can lead to a significant amount of data in the cache becoming obsolete, requiring reprocessing and negatively impacting performance.

  • How does consistent hashing minimize the impact of adding a new server?

    -Consistent hashing minimizes the impact of adding a new server by ensuring that only a small subset of keys (requests) need to be reassigned, maintaining the overall balance of the load and reducing the need to update the cache.

  • What is the significance of the 'pie chart' analogy used in the script?

    -The 'pie chart' analogy in the script is used to visually represent the distribution of requests among servers. It helps to illustrate how consistent hashing allows for minimal changes in the distribution when servers are added or removed; the short sketch that follows this Q&A works through the bucket arithmetic for the plain even-split case.
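
As a concrete companion to the pie-chart answer above, here is a tiny sketch (illustrative, not from the video) that splits the numbers 0 to 99 evenly among the servers and counts how many change owner when a fifth server is added under plain even splitting:

    # How many of the 100 'buckets' change owner when evenly split ranges
    # go from 4 servers to 5 (the standard-hashing case from the video).
    def owner(x, n):
        """Which server owns number x when 0..99 is split evenly among n servers."""
        return x * n // 100

    changed = sum(1 for x in range(100) if owner(x, 4) != owner(x, 5))
    print(changed)
    # 50 numbers move to a different server; the video counts each move twice
    # (once as a loss, once as a gain), which is how it arrives at 100.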

Outlines

00:00

🔑 Introduction to Consistent Hashing

This paragraph introduces the concept of consistent hashing, which is crucial for building scalable systems. It explains that as a system grows, the need to distribute requests evenly across multiple servers arises. The speaker uses the analogy of a single computer serving requests to illustrate the concept of a server and load balancing. The paragraph also touches on the challenges of scaling, such as distributing requests when additional servers are added, and introduces the idea of using a request ID to hash and evenly distribute the load.

05:01

📚 The Mechanics of Hashing and Load Distribution

The second paragraph delves into the mechanics of how hashing works for load distribution. It describes the process of hashing a request ID to determine which server should handle the request. The paragraph provides examples of how different request IDs are hashed and mapped to servers, emphasizing the uniform distribution of load across servers. However, it also highlights the issue that arises when new servers are added to the system, causing a significant redistribution of requests and potentially disrupting the balance of load.
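
Roughly, the mechanics this outline describes look like the following sketch; the server names and the choice of SHA-256 are illustrative assumptions, not details from the video:

    # Hash-mod-N request routing (illustrative sketch).
    import hashlib

    SERVERS = ["S0", "S1", "S2", "S3"]

    def route(request_id):
        digest = int(hashlib.sha256(str(request_id).encode()).hexdigest(), 16)
        return SERVERS[digest % len(SERVERS)]

    print(route("r1"), route("r2"), route("r3"))  # each ID always lands on the same server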

10:05

🔄 Challenges of Scalability and Consistent Hashing

The final paragraph discusses the challenges faced when scaling systems that use standard hashing techniques. It points out that adding new servers can lead to a complete redistribution of requests, which can be problematic if the request IDs are not truly random and instead encapsulate user information. The speaker explains how this can lead to the loss of valuable cached information when users are reassigned to different servers. The paragraph concludes by emphasizing the need for consistent hashing, which aims to minimize the change in the distribution of requests when servers are added or removed from the system.

Keywords

💡Consistent Hashing

Consistent hashing is a technique used in distributed systems to distribute a set of items (such as requests) across a set of servers in a way that minimizes the remapping of items as the system scales or changes. In the video, it is the main concept discussed for load balancing in systems that can scale significantly. The script explains how consistent hashing helps evenly distribute requests across servers, ensuring that adding or removing servers does not significantly disrupt the existing load distribution.

💡Hashing

Hashing refers to a process that maps data of arbitrary size to fixed-size values. In the context of the video, hashing is used to map request IDs to servers. The script mentions that hashing objects is a fundamental concept for understanding consistent hashing, and it is used to determine which server should handle a particular request by hashing the request ID and using the result to select a server.

💡Load Balancing

Load balancing is the distribution of workload across multiple resources to ensure no single resource bears too much demand, which can lead to performance degradation. In the video, load balancing is crucial for distributing incoming requests evenly across servers. The script uses the example of adding a new server to handle increased request volume and how consistent hashing can help in achieving this without causing major disruptions.

💡Server

A server, in the context of the video, is a system that provides services or data to other systems or users. The script discusses servers in the context of handling requests from clients, such as mobile users, and how consistent hashing can be used to direct these requests to the appropriate server to maintain balance and efficiency.

💡Request ID

Request ID is a unique identifier for each request sent to a server. In the video, the script explains that each request has a Request ID, which is used in the hashing process to determine which server should handle that request. The Request ID is assumed to be uniformly random, which helps in evenly distributing the load across servers.

💡Algorithm

An algorithm, in the video, refers to a set of rules or procedures for solving problems or performing tasks. The script uses the example of a facial recognition algorithm that runs on a server to process requests and return results to clients. The efficiency and scalability of such algorithms are important when discussing consistent hashing and load balancing.

💡Scale

Scaling, in the context of the video, refers to the ability of a system to handle an increasing amount of work, or its capacity to expand to accommodate growth. The script discusses the importance of consistent hashing for systems that need to scale to a large extent, emphasizing its role in maintaining performance as the system grows.

💡Hash Function

A hash function is a mathematical process that converts an input (or 'message') into a fixed-size string of characters, which is typically used for indexing data in a hash table. In the video, the hash function is used to map request IDs to servers. The script explains how the hash function works in conjunction with consistent hashing to distribute requests and how changes in the hash function can affect server load.
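
As a quick illustration of the fixed-size, deterministic output described here, using Python's standard hashlib (not something the video itself shows):

    import hashlib

    for msg in ("r1", "a much longer request identifier"):
        digest = hashlib.sha256(msg.encode()).hexdigest()
        print(len(digest), digest[:12])  # always 64 hex characters; same input -> same digest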

💡Uniformly Random

Uniformly random, in the video, refers to the property of a distribution where all outcomes are equally likely. The script mentions that the Request ID is expected to be uniformly random, which is important for the consistent hashing process to ensure a fair distribution of requests across servers.

💡Cache

Caching is the storage of copies of data, usually in a faster storage medium, to reduce the need to repeatedly access the original data source. In the video, the script discusses the use of cache in servers to store relevant information for users, such as profile data, to improve response times. However, it also points out the challenges of maintaining cache effectiveness when the system scales and consistent hashing causes users to be redirected to different servers.

💡Viral

Viral, in the context of the video, refers to the rapid spread of popularity or demand for a service or product, often through word of mouth or social sharing. The script uses the term to describe a scenario where the demand for the server's algorithm increases significantly, necessitating the addition of more servers and the use of consistent hashing to manage the increased load.

Highlights

Introduction to consistent hashing and its relevance to scalable systems.

Explanation of hashing objects and the concept of scaling in system architecture.

The technical challenge of balancing load across multiple servers in a scalable system.

The role of a server in processing requests and the importance of load balancing.

The use of request IDs and their uniform randomness in distributing load.

Hashing request IDs to map them to specific servers for load distribution.

The impact of adding new servers on the distribution of requests and existing load.

The problem of significant changes in server load when scaling with traditional hashing.

The illustration of load distribution using a pie chart to represent server capacity.

The issue of cache invalidation when servers are added or removed in a traditional hashing setup.

The importance of minimizing the change in the range of numbers served by each server.

The introduction of consistent hashing as a solution to the problems of traditional hashing in scalable systems.

Consistent hashing's ability to minimize the overall change in load distribution when scaling.

The practical application of consistent hashing in systems that require high scalability and reliability.

The theoretical underpinnings of consistent hashing and its advantages over traditional methods.

The potential for consistent hashing to improve system performance and user experience in large-scale applications.

Transcripts

00:00

Hi everyone. This is GKCS. We are talking about consistent hashing now. This is, as you can expect, relevant to the concept of hashing objects. Consistent hashing has certain properties, and it is something you need to know if you are building systems, systems which can scale to a large extent. Many of the terms I'm stating right now might be jargon for you. You might be saying, what do you mean by scale, what do you mean by systems?

00:33

So suppose you have a box, just a computer which is running a program, and someone comes to you and says, you know what? I really like your algorithm. I'll pay you money just to use your algorithm. So there's this person over here who has their cell phone, and they can connect to your box, your computer. They want to use your algorithm and get the result, and every time they do that, they pay you some money, maybe at the end of the month, or whatever. But keeping the money matter aside, keeping the sales, marketing, everything aside, we have technical specifications. This algorithm needs to run for the customer to be happy, and for you to be happy in turn.

01:29

So when I say that this is an algorithm running on a computer, this is like a server. That's what it means: a server is something which serves requests. And when that person connects through their mobile, they are technically sending a request. They send a request, your algorithm runs, and you understand what they want. Let's say the algorithm is a facial recognition algorithm which gives them a mustache. Your response is going to have that image sent back. So your server takes requests and sends back responses.

02:14

Now let's say one person comes to you and they're happy, really, really happy. They tell all their friends, and you have thousands and thousands of requests coming in. Your computer cannot handle this anymore. So what you do is add another computer, because you are getting a lot of money from all these people and you can now afford to buy a new computer, a new server. If you do this, there's one problem: where do you send the requests? If there's a second person, should their request go to this server or to that one?

02:57

At the bare minimum, if you have, let's say, N servers, you want in general to balance the load on all of these servers. You can imagine all of these servers actually carrying load; these requests are things they need to process, so each server has load on it. This is pretty important. The concept of taking N servers and trying to balance the load evenly across all of them is called load balancing. So this is our very simple problem: we are going to take these requests and evenly balance the load on all of our N servers, and the concept of consistent hashing will help us do that.

04:00

You want to evenly distribute the load across all servers. You'll get a request ID in each request, and that request ID you can expect to be uniformly random. When the client, the mobile, actually sends you a request, it randomly generates a number from zero to M minus one. This request ID is sent to your server. What you can then do is take this request ID, call it R1, and hash it. When you hash it, you get a particular number, say m1. This number can be mapped to a particular server. How? Because you have N servers, you take that number modulo N, and whatever index you get, you send the request to the corresponding server.

05:05

So let's say you have four servers: S0, S1, S2, and S3. R1 is 10; when you pass it through your hash function, you get the value 3, and 3 mod 4 (N is 4 in our case) is 3. So this request, R1, will go to server S3. Another example: H(20), let's say, somehow gives you 15, and 15 mod 4 again gives you 3, so R2 also goes to S3. Finally, H(35), when hashed, gives you 12, and 12 mod 4 gives you 0, so R3 maps to the first server, S0. In general, because the request IDs are uniformly random and your hash function is uniformly random, you can expect all of the servers to have uniform load. So if there are X requests, each server will carry about X/N of the load, and the load factor is 1/N. So everything's perfect, and that's all we need to do.
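
The mapping just described is hash-then-modulo. Below is a minimal Python sketch of that worked example; the hash values are stubbed out to match the numbers used in the video (H(10) = 3, H(20) = 15, H(35) = 12), and the function name is illustrative, not from the video:

    # Toy hash values chosen to match the example: H(10)=3, H(20)=15, H(35)=12.
    toy_hash = {10: 3, 20: 15, 35: 12}

    def pick_server(request_id, num_servers):
        """Map a request ID to a server index using hash mod N."""
        return toy_hash[request_id] % num_servers

    for rid in (10, 20, 35):
        print(f"R({rid}) -> S{pick_server(rid, 4)}")
    # R(10) -> S3, R(20) -> S3, R(35) -> S0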

06:42

Hmm, except: what happens if you need to add more servers? We said that initially our clients are very happy with us, so we are going viral, or for some reason people are really, really hitting our servers a lot. What happens then? We need to add one more server, S4. But now there's a problem: the requests which were being served here are completely bamboozled. Take the request IDs and pass them through the hash function; the values you were getting, like 3 mod 4, have to change, because you now have five servers in total. 3 mod 5 still sends R1 to server S3, so that one is okay. What about 15 mod 5? This has to change: it now goes to server S0, because 15 mod 5 is 0. And 12 mod 5 is 2, so request R3 again has to change and go to S2.
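
Re-running the same sketch with five servers shows the shift the video describes. The second half of the snippet is an added illustration, not from the video, of how close plain mod-N rehashing comes to moving everything (MD5 here is just a stand-in for any uniform hash):

    # Same three requests, now with five servers instead of four.
    toy_hash = {10: 3, 20: 15, 35: 12}
    for rid in (10, 20, 35):
        print(f"R({rid}) -> S{toy_hash[rid] % 5}")
    # R(10) -> S3, R(20) -> S0, R(35) -> S2

    # Rough measurement: with a uniform hash, going from 4 to 5 servers
    # under hash-mod-N moves roughly 4 out of every 5 keys.
    import hashlib

    def h(key):
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16)

    moved = sum(1 for k in range(10_000) if h(k) % 4 != h(k) % 5)
    print(f"{moved / 10_000:.0%} of keys changed servers")  # close to 80%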

08:01

And this is very clearly illustrated in a pie diagram. This is your pie: initially you had four servers, so the pie was 25% for every server, 25, 25, 25, 25, meaning each had 25 numbers assigned to it. Now, when the fifth server comes in, S0 has to lose five of its buckets, so that is a change of five buckets. S1 has to take those five buckets, so that's plus five, and instead of going up to 50 it now goes up to 40 only, so it lost 10 buckets; that's 10 more buckets changed. When I say buckets, I just mean numbers; those are just the numbers it lost. Then S2 (this is S0, S1, S2) has to take those 10, so that's plus 10 as a change, and instead of going up to 75 it now goes only up to 60, so that's 15 buckets lost.

09:31

Now S2 is done. We go to S3, which is the last server that we had, and this now has to go only up to 80. So it has five buckets remaining in its old section, it lost 20 buckets at the top, and it had to take the 15 buckets that S2 gave up, to end with a total of 20 in S3. That is 15 buckets added to it and 20 buckets lost. And finally, S4 has to actually take those 20 buckets. So the cost of the change, if you add it all up, is 5 plus 5, plus 10 plus 10, plus 15 plus 15, plus 20 plus 20. In total the change was 100, which is actually your entire search space.

10:37

Now why is this such a big deal? You know, okay, 100 changes, the whole thing changed, so what? Well, because in practice the request ID is never random, or it is rarely random. The request ID usually encapsulates some of the information of the user, for example the user ID. So if I hash the user ID, this hash is going to give me the same result again and again and again, because the hash function is constant. If I'm getting the same result, it means that when I take it mod N, it's going to send me to the same server again and again and again. Now, why not use that information? If I'm being sent to the same server every time, and there is something like fetching a profile from a database, why should I do that all the time? Why not store it in the local cache? That seems like a smart thing to do, depending on the user ID. So instead of R1 I could call it U1, but I'll still call it R1. Depending on the user ID, we can send people to specific servers, and once they are sent there, you can store relevant information in the cache for those servers.
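
A rough sketch of that caching idea, assuming a profile-lookup helper and a simple in-process dictionary cache (all names here are illustrative, not from the video):

    # Per-server, in-process cache keyed by user ID.
    profile_cache = {}

    def fetch_profile_from_db(user_id):
        # Stand-in for the expensive database lookup described in the video.
        return {"user_id": user_id, "name": f"user-{user_id}"}

    def get_profile(user_id):
        """Return the user's profile, hitting the database only on a cache miss."""
        if user_id not in profile_cache:
            profile_cache[user_id] = fetch_profile_from_db(user_id)
        return profile_cache[user_id]

This only pays off while hash(user ID) mod N keeps routing the same user to the same server, which is exactly what breaks when N changes, as the next part of the transcript explains.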

11:57

But through this policy, what's going to happen when the number of servers changes is that the entire mapping changes. All users are now sent to different places, and all the useful cache information you had is just dumped; it's useless. Almost all of it is useless now, because the ranges of numbers that you are serving have completely changed. So what you want to avoid is a huge change in the range of numbers that each server is serving. If a server is serving from this point to this point, then what you want is only a small change there, not a huge one.

12:37

So in the new pie chart, if this is the pie and this is the quarter each server had, what I would like to do is take a little from the first server, take a little from the second server, take a little from the third server, and take a little from the fourth server, such that the sum of these areas is going to be 20%. Because you added one server and you now have five servers, you want 20% on each. So the sum of these slices should be 20%, but the overall change should be minimal. That's the general idea of why the old, standard way of hashing doesn't work here. We need a more advanced approach, and that's where consistent hashing comes in. The topic that we are going to be talking about today is relevant to hashing, and it's called consistent hashing.
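
The video stops at motivating the idea; the standard construction it is pointing toward hashes the servers themselves onto the same ring of numbers and sends each request to the nearest server clockwise. Below is a minimal sketch of that technique (my own illustration, not code from the video; real deployments also give each server many virtual points for smoother balance):

    # Minimal consistent-hash ring: servers and requests share one hash space,
    # and a request is handled by the first server clockwise from its hash.
    import bisect
    import hashlib

    M = 2 ** 32

    def h(key):
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % M

    class ConsistentHashRing:
        def __init__(self, servers):
            self.ring = sorted((h(s), s) for s in servers)

        def add_server(self, server):
            bisect.insort(self.ring, (h(server), server))

        def server_for(self, request_id):
            i = bisect.bisect(self.ring, (h(request_id), ""))
            return self.ring[i % len(self.ring)][1]  # wrap around the circle

    ring = ConsistentHashRing(["S0", "S1", "S2", "S3"])
    before = {k: ring.server_for(k) for k in range(10_000)}
    ring.add_server("S4")
    after = {k: ring.server_for(k) for k in range(10_000)}
    moved = sum(1 for k in before if before[k] != after[k])
    print(f"{moved / 10_000:.0%} of keys changed servers")
    # Only the keys on the arc that S4 takes over move, instead of ~80% with mod-N.

Adding one server reassigns only the arc between the new server's position and its predecessor, which keeps the overall change small; virtual nodes are what spread that small change across all the existing servers, matching the "take a little from each slice" picture the video ends on.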
