What are Distributed CACHES and how do they manage DATA CONSISTENCY?

Gaurav Sen
31 Mar 2019 · 13:29

Summary

TL;DR: This video script delves into the concept of caching in system design, explaining its purpose and benefits. It highlights two main use cases: reducing database load and saving network calls for frequently accessed data. The script also covers cache policies like LRU, the challenges of cache eviction and data consistency, and the strategic placement of caches. It concludes by discussing the importance of caching in system design and invites viewers to engage with the content.

Takeaways

  • 😀 Caching is a technique used in system design to store data temporarily to improve response times and reduce database load.
  • 🔍 The primary use cases for caching are to save on network calls and to avoid redundant computations, both aimed at speeding up client responses.
  • 💡 A cache works by storing key-value pairs, where the key represents a unique identifier for the data, and the value is the data itself.
  • 🚀 Caching is beneficial when dealing with commonly accessed data, such as user profiles, to reduce database queries and network traffic.
  • 🔑 The decision to load data into the cache and when to evict data from it is governed by cache policies, which are crucial for cache performance.
  • 📚 Least Recently Used (LRU) is a popular cache policy that evicts the least recently accessed items first when the cache is full.
  • 🔄 Cache size is a critical factor; too large a cache can lead to increased search times, making it less effective and potentially counterproductive.
  • 💡 Predicting future requests is essential for effective caching, as it helps in determining which data should be stored in the cache for quick access.
  • 🔒 Consistency between the cache and the database is vital, especially when updates are made to ensure that clients receive the most current data.
  • 📍 The placement of the cache can vary, with options including local in-memory caches on servers or a global cache that multiple servers can access.
  • 🔄 Write-through and write-back are caching mechanisms for handling updates to cached data, with each having its advantages and disadvantages in terms of performance and consistency.
  • 🌐 Redis is an example of a distributed cache that can be used as an in-memory data structure store, suitable for scenarios requiring fast data access and manipulation.

Q & A

  • What is a cache and why is it used in system design?

    -A cache is a storage layer that temporarily holds data for faster access. It is used in system design to improve performance by reducing the need to fetch data from a slower data source, such as a database, repeatedly.

  • What are the two main scenarios where caching is beneficial?

    -Caching is beneficial in two main scenarios: 1) When querying for commonly used data to save network calls, and 2) When avoiding computations, such as calculating the average age of all users, to reduce load on the database.

  • How does a cache help in reducing network calls?

    -A cache helps in reducing network calls by storing frequently requested data, such as user profiles. When a user requests their profile, the system can retrieve it from the cache instead of making a new request to the database.

  • What is the purpose of caching when it comes to avoiding computations?

    -Caching is used to avoid computations by storing the results of expensive operations, like calculating averages, in the cache. This way, when the same computation is requested again, the result can be served directly from the cache without re-computing it.

  • Why is it not advisable to store everything in the cache?

    -Storing everything in the cache is not advisable because cache hardware, typically SSDs, is more expensive than regular database storage. Additionally, a large cache can lead to increased search times, which can negate the performance benefits of caching.

  • What are the two key decisions involved in cache management?

    -The two key decisions in cache management are when to load data into the cache (cache entry) and when to evict data from the cache (cache exit). These decisions are governed by a cache policy.

  • What is a cache policy and why is it important?

    -A cache policy is a set of rules that determine how and when to load and evict data in the cache. It is important because the performance of the cache, and thus the system, largely depends on the effectiveness of the cache policy.

  • What is the Least Recently Used (LRU) cache policy and how does it work?

    -The Least Recently Used (LRU) cache policy is a popular cache eviction strategy where the least recently accessed items are removed first. It works by keeping recently accessed items at the top of the cache and evicting items from the bottom when the cache reaches capacity.

  • What is thrashing in the context of caching?

    -Thrashing in caching refers to a situation where the cache is constantly being updated with new data without ever serving any requests. This can occur if the cache is too small and cannot hold the data needed between requests, leading to inefficient use of resources.

  • Why is data consistency important when using a cache?

    -Data consistency is important to ensure that the data served from the cache is up-to-date and accurate. Inconsistencies can occur if updates to the database are not reflected in the cache, leading to outdated or incorrect information being served to users.

  • What are the two main strategies for handling cache updates and why are they used?

    -The two main strategies for handling cache updates are write-through and write-back. Write-through ensures data consistency by updating the cache and the database simultaneously, while write-back improves performance by updating the cache immediately and the database later or in bulk, but it can lead to data inconsistency if not managed properly.

  • What is Redis and how is it used in caching?

    -Redis is an in-memory data structure store that can be used as a database, cache, and message broker. It is used in caching to provide fast data retrieval and storage, and it supports various data structures, making it suitable for a wide range of caching use cases.

  • What are the advantages of placing a cache close to the servers?

    -Placing a cache close to the servers can reduce network latency and improve response times, as data can be served directly from the cache without the need to access the database. It also simplifies the implementation of the caching layer.

  • What is the concept of a global cache and what are its benefits?

    -A global cache is a centralized cache that is accessible to all servers in a distributed system. The benefits of a global cache include consistent data access across all servers, reduced load on the database due to shared cache usage, and the ability to scale the cache independently from the application servers.
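
    As a concrete illustration of reading through such a global cache, here is a minimal sketch using the redis-py client. The host name, key format, TTL, and the fetch_profile_from_db function are assumptions for illustration, not details from the video.

```python
import json
import redis  # pip install redis

# Shared global cache that every application server talks to.
r = redis.Redis(host="cache.internal", port=6379, db=0)

def get_profile(user_id, fetch_profile_from_db):
    """Read-through helper: try Redis first, fall back to the database."""
    key = f"profile:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)                 # served from the global cache
    profile = fetch_profile_from_db(user_id)      # hypothetical DB query on a miss
    r.set(key, json.dumps(profile), ex=300)       # cache for 5 minutes
    return profile
```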

Outlines

00:00

🔍 Introduction to Caching Concepts and Use Cases

This paragraph introduces the concept of caching and its importance in system design. It explains caching as a method to store data temporarily to reduce database load and speed up response times. Two main use cases are highlighted: saving network calls by storing commonly accessed data like user profiles, and avoiding computations like finding the average age of all users. The paragraph emphasizes the benefits of caching, such as reduced load on the database and faster client responses, but also notes the limitations, including the higher cost of cache hardware and the potential for decreased performance if the cache is too large or not properly managed.

05:00

📚 Understanding Cache Policies and Consistency

This section delves into the complexities of cache management, focusing on when to load and evict data, which is determined by cache policies. It mentions the Least Recently Used (LRU) policy as a common strategy for cache eviction. The paragraph also addresses the potential problems of poor cache eviction policies, such as thrashing, where the cache is constantly updated without benefiting from its content. Additionally, it raises concerns about data consistency, especially in scenarios where updates to the database are not reflected in the cache, leading to outdated information being served to clients. The placement of the cache, either close to the servers or the database, is discussed, along with the trade-offs of each approach.

10:01

🔧 Cache Placement, Resilience, and Write Policies

The final paragraph discusses the placement of caches, advocating for a global cache for its resilience and consistency, especially in distributed systems. It contrasts the benefits of local memory caches for speed and simplicity with the advantages of a global cache for data integrity and independent scaling. The paragraph also explores the different write policies: write-through, which ensures data consistency but can be problematic with multiple cache instances, and write-back, which can lead to performance issues due to the need for frequent updates. A hybrid approach is suggested to balance the trade-offs, where non-critical data can be updated in the cache first and then batch-updated in the database to reduce database hits.

Keywords

💡Caching

Caching is the process of storing frequently accessed data in a temporary storage area, known as a cache, to improve the performance of systems by reducing the time it takes to access that data. In the video, caching is central to the theme as it discusses how to use caching to enhance system design by speeding up responses to client requests and reducing database load.
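
A minimal local, in-process sketch of this pattern is shown below (cache-aside reads); the plain dict and the fetch_profile_from_db stand-in are hypothetical, not code from the video.

```python
# Cache-aside sketch: check the cache, fall back to the database on a miss,
# then populate the cache so the next request is served without a network call.

profile_cache = {}  # key: user_id, value: profile dict

def fetch_profile_from_db(user_id):
    # Placeholder for a real (slow) database or network call.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_profile(user_id):
    profile = profile_cache.get(user_id)
    if profile is None:                       # cache miss
        profile = fetch_profile_from_db(user_id)
        profile_cache[user_id] = profile      # store for future requests
    return profile                            # cache hit on repeat calls
```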

💡Cache

A cache is a high-speed data storage layer that is closer to the processor than the main database. It is used to store copies of data that are frequently accessed. In the script, the cache is explained as a box that can store data like user profiles or calculated values, such as the average age of users, to avoid redundant database queries.

💡Database

A database is an organized collection of data, typically stored and accessed electronically. In the video, the database is the primary source of data from which the server retrieves information. The script discusses scenarios where caching can reduce the load on the database by storing frequently accessed data.

💡User Request

A user request is an action made by a user to retrieve or send data to a server. In the context of the video, user requests are the triggers for data retrieval operations, and the script explains how caching can be used to quickly fulfill these requests by storing data that is frequently requested.

💡Network Calls

Network calls refer to the communication between different parts of a system, such as between a server and a database. The video script mentions that caching can save network calls by storing data locally, thus reducing the need to repeatedly access the database for common queries.

💡Computations

Computations refer to processes of calculating or deriving values from data. The video explains how caching can be used to avoid expensive computations, such as calculating the average age of all users, by storing the result in the cache after the first computation.
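
A small sketch of this pattern caches a computed aggregate with a time-to-live so it is recomputed only occasionally; the function names and the 60-second TTL are illustrative assumptions.

```python
import time

_cache = {}  # key -> (value, expiry timestamp)

def cached_average_age(fetch_all_ages, ttl_seconds=60):
    """Return the average age, recomputing at most once per TTL window."""
    entry = _cache.get("average_age")
    if entry is not None and entry[1] > time.time():
        return entry[0]                        # serve the cached result
    ages = fetch_all_ages()                    # expensive: pulls every user from the DB
    average = sum(ages) / len(ages)
    _cache["average_age"] = (average, time.time() + ttl_seconds)
    return average

# Usage with a stand-in for the database query:
print(cached_average_age(lambda: [26, 28, 30]))  # computed once, then cached
```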

💡Cache Policy

A cache policy is a set of rules that determine which data should be loaded into the cache and when data should be evicted from the cache. The script discusses the importance of cache policies in determining the performance of a cache, with examples including the Least Recently Used (LRU) policy.

💡LRU (Least Recently Used)

LRU is a cache replacement policy that discards the least recently used items first. The video script uses LRU as an example of a popular cache policy, explaining how it helps maintain a cache that contains the most relevant data according to recent user requests.
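
A compact, textbook-style LRU cache built on Python's OrderedDict is sketched below (the capacity of two is chosen just for illustration); it is a generic implementation of the policy, not code taken from the video.

```python
from collections import OrderedDict

class LRUCache:
    """Keeps the most recently used keys; evicts the least recently used
    entry once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                       # cache miss
        self.items.move_to_end(key)           # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # drop the least recently used entry

cache = LRUCache(capacity=2)
cache.put("x", "profile-x")
cache.put("y", "profile-y")
cache.get("x")                # touch x so it becomes most recently used
cache.put("z", "profile-z")   # evicts y, the least recently used key
```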

💡Eviction Policy

An eviction policy determines how and when data is removed from the cache to make space for new data. In the video, the eviction policy is a critical component of cache management, and the script discusses the consequences of a poor eviction policy, such as increased load on the database.

💡Consistency

Consistency in the context of caching refers to the alignment of data between the cache and the database, ensuring that the data served from the cache is up-to-date. The script addresses the issue of maintaining data consistency as a challenge when using caches, especially in scenarios where updates to the database are not immediately reflected in the cache.
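
One simple mitigation the video mentions is invalidation on write: when the database row changes, the cached copy is evicted first so the next read repopulates it from the source of truth. The sketch below assumes a dict-style cache and a hypothetical db_update_profile function.

```python
def update_profile(user_id, new_profile, cache, db_update_profile):
    """Invalidate-on-write: evict the stale cache entry, then update the DB.
    The next read misses the cache and reloads the fresh row."""
    cache.pop(user_id, None)                  # kick the soon-to-be-stale entry out
    db_update_profile(user_id, new_profile)   # hypothetical write to the database
```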

💡Write-Through Cache

A write-through cache is a type of cache that updates data in both the cache and the underlying storage when a modification is made. The video script explains the write-through approach and its implications for data consistency, noting potential issues when multiple servers are involved.
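
A minimal write-through sketch, assuming a dict-style cache and a hypothetical db_write call: the cache and the backing store are updated on the same write path, so reads never see a stale entry, at the cost of a slower write.

```python
def write_through(key, value, cache, db_write):
    """Write-through: update the cache, then synchronously persist to the DB."""
    cache[key] = value    # clients reading the cache see the new value immediately
    db_write(key, value)  # hypothetical synchronous write to the source of truth
```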

💡Write-Back Cache

A write-back cache, also known as write-behind, is a cache that updates data in the cache first and then asynchronously updates the underlying storage. The script discusses the trade-offs of using a write-back cache, including potential performance benefits and consistency challenges.

💡Hybrid Cache

A hybrid cache combines the characteristics of write-through and write-back caches to optimize performance and consistency. The video script suggests using a hybrid approach to caching as a way to balance the advantages and disadvantages of both methods, depending on the criticality of the data being cached.
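
The hybrid idea of writing non-critical updates to the cache immediately and persisting them to the database in bulk later could look roughly like the sketch below; the flush threshold and the db_bulk_write callback are assumptions for illustration.

```python
class WriteBehindCache:
    """Serve reads and writes from the cache; flush dirty entries to the
    database in one bulk call once enough of them accumulate."""

    def __init__(self, db_bulk_write, flush_after=100):
        self.cache = {}
        self.dirty = {}                 # updates not yet persisted
        self.db_bulk_write = db_bulk_write
        self.flush_after = flush_after

    def put(self, key, value):
        self.cache[key] = value         # readers see the update right away
        self.dirty[key] = value
        if len(self.dirty) >= self.flush_after:
            self.flush()

    def get(self, key):
        return self.cache.get(key)

    def flush(self):
        if self.dirty:
            self.db_bulk_write(dict(self.dirty))  # one network call for many updates
            self.dirty.clear()
```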

Highlights

Caching is introduced as a method to improve system design by enhancing response times and reducing database load.

A cache is defined as a temporary storage used to store data that can be quickly retrieved, such as user profiles.

Two primary use cases for caching are identified: saving network calls and avoiding redundant computations.

The concept of a key-value pair is explained as the fundamental unit of data stored in a cache.

The importance of cache for reducing load on databases, especially in systems with high server traffic, is discussed.

The limitations of caching, including the higher cost of cache hardware and the potential for increased search times with large data volumes, are highlighted.

A cache policy is described as a critical factor in determining the performance of a cache system.

Least Recently Used (LRU) is presented as a popular cache policy that evicts the least recently accessed data first.

The potential issues with poor eviction policies, such as increased database load and the inefficiency of cache usage, are explained.

The phenomenon of thrashing is introduced as a problem where constant cache misses lead to excessive database queries.

Data consistency in caching is identified as a crucial issue, especially when updates to data are not reflected in the cache.

The placement of cache is discussed, with options ranging from local in-memory caches to global distributed caches.

Redis is mentioned as an example of a distributed cache system that can be used for caching in large-scale applications.

The trade-offs between write-through and write-back caching strategies in terms of consistency and performance are examined.

A hybrid caching approach is suggested to balance the benefits of write-through and write-back methods.

The video concludes by emphasizing the extensive use of caching in real-world system design and invites viewer engagement through comments and subscriptions.

Transcripts

play00:05

Hi everyone.

play00:06

This video will be talking about what caching is and how we can use it to design

play00:10

systems. So at the end of this video,

play00:12

you should be able to tell what a cache is and the situations where a cache can

play00:16

be used, and of course how you'll be using it. So what is a cache?

play00:20

Imagine you have this server and you have a database from which you pull out

play00:24

results. So whenever there's a user request to your server,

play00:27

you actually pull out results from the database.

play00:30

There are two scenarios where you might want to use a cache.

play00:32

The first use case is when you're querying for some commonly used data.

play00:35

For example, you have your profile and that is stored in this database.

play00:41

The same user is asking for the profile multiple times and had you saved it in

play00:45

this box, then you would've saved a lot of computation. So within this box,

play00:49

maybe you can keep a cache, an in-memory cache, which is in your memory,

play00:53

it stores the profile per user. So you have a key and a value.

play00:56

And so that is the first reason why you want a cache: to save network calls.

play01:02

The second use case is when you want to be avoiding computations, for example,

play01:05

you want to find the average age of all users that you

play01:10

have in your system. In this case,

play01:11

one of the ways that you can do this is every time a person asks you for the

play01:14

average age, you go to the database, get all the users,

play01:17

find the average of the age, and then return that. Now,

play01:20

this is very expensive of course.

play01:21

So instead what you can do is you can do this once, find the average age,

play01:24

and then store it in the cache. So that's a key and a value.

play01:26

So average and then 26, 28, whatever be the thing. It's a key value.

play01:31

Store it in the cache, the key and the value.

play01:33

Where you want to be using a cache is when you want to avoid load on the

play01:36

database.

play01:37

So if you have a lot of servers and they're all hitting the database to get

play01:40

information, it's going to be putting a lot of load. Instead,

play01:43

you can get some caches, maybe two or three, it's a distributed system, you know,

play01:46

after all. And then you can keep that information in the cache.

play01:49

You can hit the caches, avoid hitting the database.

play01:55

But the first two points are the key ideas of a cache,

play01:57

which is either avoiding a network call or avoiding computations.

play02:01

And both of them are designed to help you speed up responses to your

play02:06

clients.

play02:06

So a client makes a call and you immediately give a response if you have it in

play02:11

cache instead of making a query to the database. So now that you know,

play02:15

the benefits of a cache, your first instinct might be to just store everything in

play02:18

the cache,

play02:19

like cram it up with all the information because the response times are much

play02:22

faster when you're pulling data from the cache.

play02:24

You can't do that for multiple reasons. Firstly,

play02:27

the hardware on which a cache runs is usually much more expensive than that of

play02:31

a normal database. The reason being that this cache usually runs on SSDs,

play02:35

which are expensive compared to commodity hardware that a database runs on.

play02:39

The second thing is that if you store a ton of data in the cache,

play02:44

then the search times will increase. And as the search times keep on increasing,

play02:47

it makes less and less sense to use the cache.

play02:49

Why don't you just go and hit the database? Yeah.

play02:52

So you'll be saving on a network call, yes,

play02:54

but a ton of data in the cache is not just expensive,

play02:57

but also counterproductive.

play03:00

So our problem becomes storing information in the cache such that the database has

play03:04

almost infinite information while the cache needs to have the most relevant

play03:08

information according to the requests which are going to come in future.

play03:11

So we need to predict something. To predict things, we need to ask ourselves

play03:15

just two important questions: when do I make an entry in the cache,

play03:20

which is, when do I load data into the cache, and when do I evict data out of the

play03:24

cache?

play03:24

The way in which you decide on loading or evicting data is called a policy.

play03:30

And because this video is on caching, this is called a cache policy.

play03:33

So a cache's performance almost entirely depends on your cache policy.

play03:37

There's multiple policies you can have.

play03:39

LRU is probably the most popular: least recently used.

play03:43

What this says is that whenever you're pulling in entries to the cache,

play03:46

keep it on the top. And whenever you need to kick out,

play03:49

entries, kick out the bottom-most entries,

play03:51

which means the least recently used entries in the cache.

play03:54

So the top entries are going to be from maybe a few seconds ago.

play03:59

And then you keep going down the list minutes ago, hours ago, years ago.

play04:03

And then you kick out the last entries when you do need to, okay? For example,

play04:07

if there's a celebrity who's made a post,

play04:09

made a comment and everyone wants to pull that comment.

play04:12

So you keep that on top of the cache. And while it is hot,

play04:15

while everyone wants to read that comment, it's on top of the cache.

play04:18

It's performing really well. And when it gets cooler,

play04:20

when people stop asking for that comment,

play04:22

it keeps getting pushed to the end of the cache and then it's kicked out.

play04:26

Alright? So that's the LRU policy.

play04:28

The LFU policy, least frequently used, is not frequently used in the real

play04:33

world, so you can read up on it. Of course, there's a description below.

play04:36

And some policies have started to perform really well.

play04:40

I mean even better than LRU. So they're sliding-window-based policies.

play04:44

One of the Google developers who made Caffeine has implemented this in their

play04:48

caches. Now imagine you have a really poor eviction policy,

play04:50

which means that hitting the cache is of no use. I mean,

play04:52

you ask the cache for some data, it always responds with I don't have it.

play04:56

And then you have to go to the database. Again, the first problem over here,

play05:00

which is with a poor eviction policy,

play05:02

is that it's actually harmful to have a cache because you are making that call,

play05:05

that extra call, which is completely useless if you have a poor eviction policy.

play05:10

So the second problem can occur if you have a very small cache.

play05:15

So users X and Y need their profiles, and this is a profile cache.

play05:19

So X asks for their profile, you get it from the database,

play05:22

you put X in this cache, and let's say it can just store one entry. Y

play05:26

asks for their profile. You have a cache miss. You go and hit the database,

play05:31

get that entry, populate it in the cache, and now X makes another call,

play05:35

which means they need X. So there's a problem here.

play05:38

So this concept is called thrashing,

play05:43

where you are constantly inputting and outputting into the cache without ever

play05:48

using the results. So it's again, hurting you more than helping you.

play05:53

The final and most important problem,

play05:54

I'm sure you guys have already realized, is: what about consistency?

play05:59

So if a different server

play06:02

makes a call to the database for an update and says, update profile X.

play06:07

Yeah, so there's an update request, but the cache doesn't have that entry updated.

play06:12

So server one pulls from the cache,

play06:16

gets the user profile for X, an outdated profile, and then serves it back.

play06:22

So maybe the user has changed the password and you have the old entry,

play06:26

the old password is still working.

play06:28

Maybe there's a hacker who's actually hacked the old password.

play06:30

This becomes an issue of data consistency. Now,

play06:33

let's try to figure out where to place the cache.

play06:35

The cache can be placed close to the database or can be placed close to the

play06:39

servers. There's benefits for both and drawbacks for both.

play06:42

If you place the cache close to the servers, how close can you place it?

play06:46

You can place it in memory itself. Yeah,

play06:48

so you can put the cache in memory in the server.

play06:54

If you do this,

play06:55

what's going to happen is that the amount of memory in your server is going to

play06:58

be used up by your cache. But

play07:01

if the number of results that you're working with is really small and you need

play07:04

to save on these network calls, then you can just keep it in memory.

play07:07

A few other things over here: what if this server fails? If it fails,

play07:11

then this in-memory cache also fails, and we need to take care of that.

play07:15

The second thing is,

play07:17

what if this data over here on S1 and this data on S2 are not consistent,

play07:22

so they are not in sync? In that case, it depends on your application.

play07:26

If it is a user profile, it's really not that big a deal. But if it's passwords,

play07:30

if it's financial information, if it's something critical, critical data,

play07:33

then you can't keep this.

play07:36

So the benefits of actually having it closer to the server are that it's faster here

play07:42

and it's also rather simpler to implement.

play07:43

But the benefits of having it separate are going to be seen.

play07:47

Now you can have something like a global cache,

play07:52

and since we always work with distributed systems,

play07:55

it's going to be a distributed global cache, but I'm just drawing one over here.

play07:59

The benefit of this is that all servers are hitting this global cache.

play08:03

It's limited in size. So if there's a miss,

play08:05

it's going to be querying the database, otherwise it's going to be returning.

play08:10

What's the possible benefit? Well,

play08:11

you are avoiding queries on the database and increasing queries on the cache.

play08:15

But more than anything, actually this is a faster data store,

play08:19

so it's going to be something like Redis.

play08:21

I think a lot of people have actually asked me what this is, what is Redis?

play08:25

And this is one of the common use cases of Redis. It's like a persistent storage,

play08:30

which is like a cache. Yeah. So this is a cache. Basically,

play08:34

you can keep a few boxes, which are just going to be storing data in memory.

play08:37

And whenever a person asks for a key, you return a value. In this case,

play08:40

what happens if S2 crashes? Nothing really,

play08:43

because everyone is going to be querying this global cache anyway.

play08:45

And because it is distributed and it's going to be maintaining that data

play08:48

consistency, we can offload all that information,

play08:51

all that headache to this guy over here. Although it is slightly slower,

play08:55

it's more accurate. In this range of choices for placing the cache,

play08:59

I would rather place it somewhere over here as a global cache.

play09:02

The main reason for this is there's higher accuracy in the sense that if a

play09:07

server crashes, it doesn't take its data to its grave. Yeah. it's,

play09:11

it's more resilient in that way.

play09:12

The second reason is that you can scale this independently. Yeah,

play09:15

you can scale the Redis caches independently while keeping the services as

play09:19

containers running on these boxes.

play09:21

So there are certain boxes which are just running services,

play09:23

which don't have their memory being eaten up by a cache.

play09:27

So my preference would be a global cache. However,

play09:29

for simplicity and for extremely fast response times,

play09:33

you can keep it in the local memory.

play09:35

We are now onto the final important characteristic of a cache,

play09:38

which decides how consistent the data in this is. And that is,

play09:42

whether it's a write through or a write back cache.

play09:47

A write through cache means that when you need to make an update to the data

play09:50

that you have, let's say you are making an update to the profile, profile X,

play09:54

you hit the cache, you see that there is profile X in this.

play09:58

And so there is going to be some data inconsistency

play10:00

if you make an update directly to the database.

play10:03

Remember that person X has changed their password.

play10:05

So I'll make an update to the entry for user X,

play10:10

make that update and then push it to the database.

play10:13

So that's a write through cache: I wrote on the cache before going through it and

play10:18

hitting the database. The second option, of course,

play10:21

is to hit the database directly.

play10:25

And once you have hit the database, make sure to make an entry in the cache.

play10:28

So either the database can tell the cache that this entry is no longer valid,

play10:33

or you hit the cache, you see that there's an entry for X. Just kick it out,

play10:37

evict it, delete that entry. Once it's deleted, update the database.

play10:42

When there is a query on the cache, it's going to come here,

play10:45

it's going to see that X does not exist.

play10:47

And so it's going to pull from the database and then send it back to you.

play10:52

Okay? So this is the write through mechanism.

play10:57

I mean the original one where you hit the cache, make the update,

play11:00

and then send it to the database. And the write back is where you hit the database.

play11:04

And then make sure that the cache has a null entry.

play11:09

What is the possible problem in the write through cache? Well,

play11:11

you do the write over here, you go and write it on the database.

play11:15

But what if there are other servers having this cache in memory? Yeah,

play11:19

the profile entry for X is on S1,

play11:22

and the profile entry for X is also on S2.

play11:25

So this is an in-memory cache and you are doing a write through.

play11:28

So you actually update this entry. But what about these guys? So in

play11:32

that case, basically your write through is a little silly.

play11:34

So to implement this algorithm, you have to go for a write back.

play11:38

However, there's one issue in the write back and that is performance.

play11:42

When you update the store, the real store,

play11:45

the one source of truth that you have,

play11:47

and you keep updating the cache based on that,

play11:49

what's going to happen is there's going to be lots of entries in the cache,

play11:51

which are fine. I mean, your consistency is not so important.

play11:54

So invalidating them is going to be expensive because whenever there's a

play11:57

request,

play11:58

you have to send a response from the database and then update the cache.

play12:01

So this is like the thrashing thing that we have. Again,

play12:03

so write back is expensive. How can we do better? As usual,

play12:08

just use a hybrid sort of mechanism.

play12:11

Write through and write back are both giving you advantages and disadvantages.

play12:14

What you can do is whenever there's an entry that has to be updated on the

play12:18

database, don't write back immediately; write onto the cache.

play12:22

If it's not a critical entry,

play12:23

if it's something like the person has their comment changed, edited,

play12:27

just write onto the cache. Keep serving requests through this box.

play12:32

I mean, there might be data inconsistency, but do you really care?

play12:36

10 seconds have passed, 20 seconds have passed,

play12:37

and when you have actually seen that there is some scope for you to persist

play12:41

entries, take entries in bulk and persist them onto the database.

play12:45

So that's one network call and you are saving on hitting the database

play12:49

constantly. So what we can do is instead of taking one core policy,

play12:54

we take a hybrid and then we take the best of both worlds depending on your

play12:59

scenario. If this is financial data, if this is passwords, you can't,

play13:01

you can't have a write through cache, you need a write back cache.

play13:04

But if it's not critical data,

play13:05

then you can go for saving network calls this way.

play13:08

So caching is extensively used in system design.

play13:10

So that's it for what caching is and what are its use cases.

play13:13

You can use it in system design extensively.

play13:15

It's actually used in real world systems.

play13:17

So if you have any doubts or suggestions on this,

play13:19

of course you can leave them in the comments below. And if you like the video,

play13:22

then make sure to hit the like button.

play13:23

And if you want notifications for further videos like this,

play13:26

hit the subscribe button. I'll see you next time.


Related Tags
Caching, System Design, Data Management, Cache Policies, Performance, Database Load, Network Calls, Data Consistency, In-Memory Cache, Distributed Cache