Redis Deep Dive w/ an Ex-Meta Senior Manager

Hello Interview - SWE Interview Preparation
26 Jun 2024 · 31:00

Summary

TL;DR: In this deep dive, co-founder Stefan explores Redis, a versatile in-memory data structure server, ideal for caching, rate limiting, and more. He discusses Redis' simplicity, its single-threaded nature, and how it operates with keys and slots in a cluster. Stefan covers various use cases, including cache implementation, rate limiting, streams for async job queues, sorted sets for leaderboards, geospatial indexes, and pub/sub for real-time communication. The session aims to equip viewers with the knowledge to leverage Redis effectively in system design.

Takeaways

  • 😀 Redis is a versatile in-memory data structure server used for various purposes such as caching, distributed locks, and message queues.
  • 🔑 Redis operates as a single-threaded system, which simplifies its operation but means it cannot take advantage of multiple cores.
  • 🚀 The in-memory nature of Redis provides high-speed data access, typically in sub-millisecond times for basic operations.
  • 🔒 Redis can be used as a distributed lock to prevent race conditions in multi-threaded or distributed systems.
  • 📈 It supports a range of data structures including strings, hashes, lists, sets, sorted sets with range queries, and geospatial indexes.
  • 👍 Redis is simple to understand conceptually, making it easier to reason about and explain choices and their implications.
  • 🔄 Redis uses an append-only file to log commands, which helps with data recovery after a crash; pairing it with replication improves durability further.
  • 🔑 Keys in Redis are crucial for handling multi-node environments and are used to distribute data across the cluster through key-slot mapping.
  • ♨️ Hot keys, where traffic is unevenly distributed to certain keys, can be mitigated by appending random numbers to keys to distribute load.
  • 🗓 Redis supports various expiration policies for cache entries, including time-to-live (TTL) and least recently used (LRU) eviction.
  • 📊 The stream data structure in Redis is powerful for building async job queues and processing items in order reliably.

Q & A

  • What is the purpose of the deep dives on core technologies series?

    -The purpose of the deep dives on core technologies series is to provide in-depth knowledge necessary for success in system design interviews, focusing on concepts and technologies to ensure a thorough understanding beyond a superficial level.

  • Why is Redis chosen as a starting point for the deep dive sessions?

    -Redis is chosen as a starting point because it is an incredibly versatile technology applicable in various circumstances such as caches, distributed locks, and leaderboards, offering a lot of functionality and being relatively simple to understand.

  • What are the implications of Redis being single-threaded?

    -Being single-threaded simplifies the order of operations in Redis, making it predictable and easier to reason about. However, it also means that it does not utilize multicore systems, which could be a limitation in terms of scalability.

  • How does Redis's in-memory nature affect its performance?

    -Redis's in-memory nature allows it to be lightning fast, operating in sub-millisecond times for simple operations like sets and gets. The trade-off is that durability cannot be fully guaranteed, since data is only periodically persisted to disk (for example, via the append-only file).

  • What is the significance of Redis being a data structure server?

    -As a data structure server, Redis provides a key-value dictionary with values that can be various data structures like sorted sets, hashes, and geospatial indexes. This allows developers to leverage their data structures and algorithms knowledge in a distributed context.
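
    For a concrete feel, here is a minimal sketch of a few of those data structures from a client's point of view, using the Python redis-py client (the video itself uses the CLI; the key names here are illustrative):

        import redis

        r = redis.Redis(host="localhost", port=6379)

        # Strings: plain key-value, with atomic increments.
        r.set("page:views", 0)
        r.incr("page:views")              # INCR stays atomic even with many clients

        # Hashes: a small field/value map stored under a single key.
        r.hset("user:1", mapping={"name": "Stefan", "role": "interviewer"})

        # Sorted sets: members ordered by a numeric score.
        r.zadd("scores", {"player:1": 42})

        # Every command names a key, which is what a cluster shards on.
        print(r.get("page:views"), r.hgetall("user:1"), r.zrange("scores", 0, -1))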

  • How does Redis handle multi-node environments?

    -Redis handles multi-node environments through the use of keys. Each key is associated with a slot, which is a hash of the key, and the nodes in the cluster communicate to ensure that each request is routed to the correct node responsible for the key's slot.
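
    As a rough illustration, a cluster-aware client maps each key to one of 16,384 slots and routes the request to the node that owns that slot. The sketch below assumes the CRC16 (XMODEM) variant that Redis Cluster uses, which Python's binascii.crc_hqx computes, plus a made-up three-node slot table; real clients also honor {hash tag} syntax, omitted here:

        import binascii

        NUM_SLOTS = 16384
        # Hypothetical slot ownership for a three-node cluster.
        NODES = {
            range(0, 5461): "10.0.0.1:6379",
            range(5461, 10923): "10.0.0.2:6379",
            range(10923, 16384): "10.0.0.3:6379",
        }

        def slot_for_key(key: bytes) -> int:
            # Hash the key with CRC16 and take it modulo the slot count.
            return binascii.crc_hqx(key, 0) % NUM_SLOTS

        def node_for_key(key: bytes) -> str:
            slot = slot_for_key(key)
            return next(addr for slots, addr in NODES.items() if slot in slots)

        print(node_for_key(b"foo"))  # send the request for "foo" straight to its owner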

  • What is the hot key problem in the context of Redis caching?

    -The hot key problem refers to a situation where many requests are directed to the same key, potentially overloading a single node in a Redis cluster and causing an uneven distribution of traffic, which can lead to performance issues.
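
    A minimal sketch of the key-suffix trick with redis-py, assuming an illustrative fan-out of 10 copies; each suffixed copy hashes to a (likely) different slot, so reads of the hot key spread across nodes at the cost of writing it several times:

        import random
        import redis

        r = redis.Redis()
        COPIES = 10  # how many suffixed copies of the hot key to maintain

        def write_hot_key(key, value, ttl_seconds=60):
            # Write the same value under every suffixed copy of the key.
            for i in range(COPIES):
                r.set(f"{key}:{i}", value, ex=ttl_seconds)

        def read_hot_key(key):
            # Read a randomly chosen copy, spreading load across the cluster.
            return r.get(f"{key}:{random.randrange(COPIES)}")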

  • How can Redis be used as a rate limiter?

    -Redis can be used as a rate limiter by utilizing its atomic increment command to track the number of requests made within a certain time frame. If the count exceeds the limit, additional requests are either delayed or denied until the limit resets.
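
    A minimal fixed-window sketch of that idea with redis-py; the key scheme and limits are illustrative, and as the video notes it says nothing about fairness or when a rejected caller should retry:

        import redis

        r = redis.Redis()

        def allow_request(client_id, limit=5, window_seconds=60):
            key = f"rate:{client_id}"
            count = r.incr(key)                # atomic; creates the key at 1 if missing
            if count == 1:
                r.expire(key, window_seconds)  # start the window on the first request
            return count <= limit              # False: deny or delay until the key expires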

  • What are Redis Streams and how can they be used in system design?

    -Redis Streams are ordered lists of items, each with a unique ID that is typically derived from a timestamp. They can be used to build async job queues that process items in order and reliably, using consumer groups and the stream claiming mechanism.
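
    A short sketch of an async job queue on top of a stream, assuming redis-py; the stream, group, and field names are made up, and a production worker would also heartbeat and reclaim stuck entries (for example with XAUTOCLAIM):

        import redis

        r = redis.Redis()
        STREAM, GROUP = "jobs", "workers"

        # Producer: append a job; Redis assigns a timestamp-based ID.
        r.xadd(STREAM, {"task": "send_email", "to": "user@example.com"})

        # One-time setup: create the consumer group at the start of the stream.
        try:
            r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)
        except redis.ResponseError:
            pass  # the group already exists

        # Worker: read the next unallocated entry, process it, then acknowledge it.
        for _, messages in r.xreadgroup(GROUP, "worker-1", {STREAM: ">"}, count=1, block=5000):
            for msg_id, fields in messages:
                handle_job(fields)              # hypothetical job handler
                r.xack(STREAM, GROUP, msg_id)   # ack so the entry is not re-delivered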

  • How do sorted sets in Redis help in maintaining a leaderboard?

    -Sorted sets in Redis allow items to be stored with a ranking value and a unique string identifier. Commands like ZADD can be used to add or update items in the sorted set, and ZRANGE can be used to retrieve the top items, making it easy to maintain a leaderboard of the most liked tweets or similar rankings.
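
    Sketching that leaderboard with redis-py (key and member names are illustrative): ZADD adds or updates a tweet's score, ZREMRANGEBYRANK trims everything below the top five, and ZREVRANGE reads the board back highest-first:

        import redis

        r = redis.Redis()
        KEY = "top-tweets:tiger"

        r.zadd(KEY, {"tweet:1": 500})   # e.g., a Tiger Woods tweet with 500 likes
        r.zadd(KEY, {"tweet:2": 1})     # a zoo-tigers tweet with 1 like
        r.zadd(KEY, {"tweet:1": 501})   # same member again: its score is simply updated

        r.zremrangebyrank(KEY, 0, -6)   # drop all but the five highest-ranked members

        top_five = r.zrevrange(KEY, 0, 4, withscores=True)  # highest score first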

  • What is the role of Redis Pub/Sub in system design?

    -Redis Pub/Sub allows servers to publish and subscribe to messages on channels, which can be used for real-time communication between services, such as in a chat application, where messages need to be delivered to users connected to different servers.
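
    A minimal sketch of that chat fan-out with redis-py; the channel naming and the websocket delivery function are hypothetical, and delivery is at-most-once, so anything that must arrive needs a separate mechanism:

        import redis

        r = redis.Redis()

        # Server C: user 3 is connected here, so subscribe to that user's channel.
        pubsub = r.pubsub()
        pubsub.subscribe("chat:user:3")

        # Server A: user 1 sends a message addressed to user 3.
        r.publish("chat:user:3", "hello from user 1")

        # Server C: poll for messages and push them down the open websocket.
        msg = pubsub.get_message(ignore_subscribe_messages=True, timeout=1.0)
        if msg:
            send_over_websocket(msg["data"])  # hypothetical delivery function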

  • What are some considerations when using Redis for geospatial indexing?

    -When using Redis for geospatial indexing, considerations include the frequency of location updates, as static lists might be more efficient if locations don't change often. Additionally, the scalability of the index should be considered, as it is tied to a single key and node, which may require sharding strategies for large datasets.
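
    A short sketch of the geospatial commands with redis-py (GEOSEARCH requires Redis 6.2+); the station names and coordinates are made up:

        import redis

        r = redis.Redis()
        KEY = "bike:stations"

        # Add stations as (longitude, latitude, member) entries.
        r.geoadd(KEY, (-122.4194, 37.7749, "station:sf-downtown"))
        r.geoadd(KEY, (-122.2711, 37.8044, "station:oakland"))

        # Find stations within 10 km of the rider, including distances.
        nearby = r.geosearch(KEY, longitude=-122.41, latitude=37.78,
                             radius=10, unit="km", withdist=True)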

Outlines

00:00

🚀 Introduction to Redis Deep Dive

Stefan, a co-founder of Hello Interview, introduces the first in a series of deep dives into core technologies, focusing on Redis. He outlines the purpose of these sessions to provide depth necessary for system design interviews, emphasizing the importance of understanding technologies beyond a superficial level. Redis is chosen as a starting point due to its versatility and simplicity, which can be used in various scenarios like caching, distributed locks, leaderboards, and as a Kafka alternative. The session aims to cover Redis from a developer's perspective, its internal operations, and its application in system design while highlighting potential pitfalls.

05:00

💡 Redis Basics and Internal Operations

This paragraph delves into the fundamentals of Redis, highlighting its nature as a single-threaded in-memory data structure server. The single-threaded aspect simplifies the order of operations, while being in-memory translates to high-speed performance. Redis is also characterized by its data structure capabilities, allowing for a variety of data types beyond simple strings or numbers. The explanation continues with how Redis operates, including its command-line interface and wire protocol, and touches on the importance of keys in managing multi-node environments and data distribution across a cluster.

10:02

🔑 Redis Use Cases and Design Considerations

The script discusses common use cases for Redis, such as caching to alleviate database load and improve performance, and using it as a rate limiter to protect services from excessive requests. Each use case comes with its own set of considerations, like the 'hot key' problem in caching and the mechanics of enforcing rate limits. The paragraph also introduces Redis' stream feature, which allows for the creation of async job queues and processing items reliably and in order, with a focus on the importance of designing an effective keyspace to ensure even data distribution and avoid traffic bottlenecks.
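
For the caching pattern described above, here is a minimal cache-aside sketch in Python with redis-py; the key scheme, TTL, and db.fetch_user call are illustrative stand-ins:

    import json
    import redis

    r = redis.Redis()

    def get_user(user_id, db, ttl_seconds=60):
        key = f"user:{user_id}"
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)                # hit: fast in-memory path
        row = db.fetch_user(user_id)                 # miss: fall back to the database
        r.set(key, json.dumps(row), ex=ttl_seconds)  # store with a TTL to bound staleness
        return row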

15:03

🛠 Advanced Redis Features and System Design

Advanced Redis features are explored, including its stream data structure, which can be used to create ordered lists of items with unique IDs, and its capabilities for building async job queues and consumer groups. The paragraph explains how Redis streams can be used to distribute work among multiple workers fault-tolerantly. It also touches on other data structures like sorted sets for leaderboards and geospatial indexes for location-based searches, discussing their utility and potential limitations in system design, such as scaling challenges and the need for thoughtful key distribution.

20:04

📍 Geospatial Indexing and PubSub in Redis

This section focuses on Redis' geospatial indexing, which allows for the addition of items with location coordinates to an index and subsequent searches within a specified radius. The capabilities and potential limitations of this feature are discussed, including the need for careful consideration of data distribution across nodes. The paragraph also introduces Redis' pubsub feature, which enables servers to communicate with each other efficiently, such as in a chat room scenario, where messages can be published to specific topics to reach intended recipients, despite being connected to different servers.

25:06

🌟 Wrapping Up the Redis Deep Dive

Stefan concludes the deep dive by summarizing the versatility of Redis across various applications like geospatial indexes, leaderboards, and distributed locks. He emphasizes the importance of understanding Redis internals to troubleshoot and optimize its use in production environments. The session aims to equip viewers with the knowledge to make informed decisions about Redis's role in their system designs. Stefan invites feedback and questions, and mentions that a written version of the deep dive is available for further reading.

Keywords

💡Redis

Redis is an open-source, in-memory data structure store, used as a database, cache, and message broker. It supports various data structures such as strings, hashes, lists, sets, and more. In the video, Redis is the central theme, discussed for its versatility and performance in system design, particularly for use cases like caching, distributed locks, and message passing.

💡Single-threaded

Single-threaded refers to the operation of a program or system that uses only one thread of execution at a time. Redis is described as single-threaded in the script, which simplifies its operation as it processes commands in the order they are received without the complexity of multi-threading, although it does not utilize multi-core systems to its full potential.

💡In-memory

In-memory computing involves processing data directly in the RAM, which is faster than accessing data from a disk. Redis is an in-memory datastore, meaning it keeps all data in RAM, allowing for high-speed data access and operations, as discussed in the script, which is crucial for performance-critical applications.

💡Data structure server

A data structure server is a type of database that focuses on storing and managing specific data structures. Redis is highlighted as a data structure server in the script, emphasizing its ability to handle complex data structures like sorted sets, hashes, and geospatial indexes efficiently.

💡Cache

Caching is a technique used to improve performance by storing copies of frequently accessed data in a faster-to-access storage system. In the script, Redis is mentioned as an ideal caching solution due to its speed and the ability to handle large volumes of read requests quickly.

💡Distributed locks

Distributed locks are a synchronization mechanism used in distributed systems to ensure that a shared resource is accessed by only one process at a time. The script discusses Redis' capability to implement distributed locks, which is crucial for maintaining data consistency in multi-node environments.

💡Pub/Sub

Pub/Sub stands for publish-subscribe, a messaging pattern where the publisher sends messages to a channel without knowing who the subscribers are. Redis Pub/Sub is mentioned in the script as a way for servers to communicate with each other efficiently in scenarios like chat applications.

💡Geospatial index

A geospatial index is a data structure that allows for the efficient querying of items based on their geographical location. Redis provides a geospatial index feature, which is discussed in the script as a powerful tool for applications that require location-based searching and proximity analysis.

💡Stream

In Redis, a stream is an append-only, log-like data structure for storing and processing ordered sequences of messages. The script explains how Redis streams can be used to build asynchronous job queues and handle message passing in a distributed system, providing a way to maintain order and reliability.

💡Key-space

Key-space in Redis refers to the set of all keys in a Redis database. The script discusses the importance of key-space in Redis clustering, where the distribution of keys across different nodes affects performance and scalability.

💡Sharding

Sharding is the process of distributing data across multiple machines. In the context of Redis, sharding is achieved by partitioning the key-space, as mentioned in the script, which allows for horizontal scaling and improved performance under load.

💡Rate limiter

A rate limiter is a tool used to control the amount of incoming and outgoing traffic to prevent overload. The script describes using Redis as a rate limiter by leveraging its atomic increment command to prevent a service from being overwhelmed by too many requests in a given time frame.

💡Sorted set

A sorted set is a data structure that stores unique items in a sorted order. In the script, Redis sorted sets are discussed as a way to maintain a leaderboard, allowing for efficient retrieval of the top items based on a certain score, such as the number of likes on a tweet.

Highlights

Redis is introduced as a versatile technology for various uses such as caching, distributed locks, and leaderboards, with potential as a Kafka replacement.

Redis is praised for its simplicity in understanding and its ability to be reasoned through, unlike complex SQL query planners.

Redis operates as a single-threaded in-memory data structure server, which simplifies operation order and enhances performance.

The in-memory nature of Redis allows for extremely fast operations, particularly beneficial for simple tasks like sets and gets.

Redis functions as a data structure server, providing a key-value dictionary with support for various complex data structures.

Redis can be installed via Homebrew and operated using a CLI, with a simple wire protocol for issuing commands.

The importance of keys in Redis for handling multi-node environments and ensuring data distribution across the cluster is emphasized.

Strategies to avoid the 'hot key' problem in Redis, such as appending random numbers to keys for load distribution, are discussed.

Redis is commonly used as a cache to reduce database load, with considerations for addressing hot keys and expiration policies.

Redis supports different expiration policies, including time-to-live and least recently used mechanisms for cache eviction.

Redis can be effectively used as a rate limiter with atomic increment commands to control the flow of requests to services.

The stream data structure in Redis is highlighted as a powerful primitive for building async job queues and ensuring ordered processing.

Redis streams and consumer groups offer fault-tolerant distribution of work among workers with automatic claiming and reclamation of tasks.

Sorted sets in Redis are useful for maintaining leaderboards and can efficiently manage and retrieve top-ranked items.

Geospatial indexes in Redis allow for location-based searches and are beneficial for services requiring location awareness.

Pubsub in Redis is presented as a solution for server-to-server communication, such as in chat applications, with at-most-once delivery.

The potential system design considerations when using Redis, such as handling failures and ensuring message delivery, are outlined.

Redis is positioned as a valuable tool in system design with its straightforward internals and wide-ranging applicability.

Transcripts

play00:01

so hello and welcome to the first of

play00:03

what we hope will be many deep Dives on

play00:05

core Technologies today we're going to

play00:07

be talking about redis I'm Stefan I am

play00:10

one of the co-founders of hello

play00:11

interview I've conducted north of 2,000

play00:14

interviews in my career and our hope

play00:17

with these sessions is to give you some

play00:18

of the depth necessary to be successful

play00:21

in system design interviews system

play00:23

design is certainly about solving

play00:25

problems from end to end but the

play00:27

interviewers in these sessions are going

play00:30

to be asking you questions to try to

play00:32

test your knowledge to make sure that

play00:34

you don't just have a cursory

play00:35

understanding of the concepts and the

play00:38

technologies that are involved and so

play00:41

what we're trying to do here is to give

play00:42

you some of that Foundation or at least

play00:44

give you a start to that Foundation if

play00:46

you don't have it and Redis is a great

play00:49

starting point so why are we learning

play00:51

about Redis to begin with well Redis is

play00:54

an incredibly versatile technology you

play00:56

can use it in a number of different

play00:57

circumstances from caches distributed locks

play01:00

leaderboards it can be used as a

play01:02

replacement for Kafka in certain

play01:04

instances and so there's a lot of bang

play01:06

for your buck with learning about Redis

play01:08

and how it can apply to various system

play01:10

design problems the second reason that

play01:13

we really prefer Redis in a lot of

play01:15

circumstances is because it's very

play01:16

simple to understand the conceptual

play01:19

model of Redis is actually quite simple

play01:23

and that means that you can both explain

play01:25

your choices but also understand their

play01:27

implications if you need to dive into

play01:29

the details of how a SQL query planner

play01:31

is operating in postgress forget it

play01:34

you've got to learn 30 years of history

play01:37

in order to get there meanwhile if you

play01:39

need to know why a Redis query is slow

play01:42

it's actually pretty straightforward for

play01:43

you to reason through and so Redis and

play01:46

particularly this session is hopefully a

play01:48

really good return on your time so let's

play01:51

talk about redis starting with what it

play01:53

looks like from a enduser or developer

play01:56

perspective then we'll talk about how it

play01:59

actually operates underneath the covers

play02:01

and finally we'll merge the two to tell

play02:03

you about how you can use it in various

play02:05

system design scenarios and some of the

play02:08

pitfalls that you might encounter uh in

play02:10

those uh various problems so the first

play02:13

thing that you should know about Redis

play02:15

is it is a single-threaded in-memory data

play02:18

structure server all of those words

play02:20

actually mean something single-threaded

play02:23

is unusual to hear in distributed

play02:25

systems mostly because it fails to

play02:27

utilize multicore systems but it

play02:30

actually simplifies things quite a lot

play02:32

in many databases the order of

play02:35

operations is hard to uh grok and in

play02:39

Redis it's quite simple the first

play02:41

request that comes in is the first one

play02:43

that actually runs and everybody else

play02:46

Waits this has some dramatic

play02:48

implications for how you actually use

play02:50

Redis but it's a great simplifier for

play02:53

how it's working under the covers the

play02:55

second thing to know about Redis is it

play02:57

is in memory that means it is lightning

play02:59

fast and and can operate in sub

play03:01

millisecond times especially for simple

play03:04

operations like sets and gets it also

play03:07

has some implications for how you use it

play03:09

in your system you can't necessarily

play03:11

guarantee the durability of data uh but

play03:14

it means you can use Redis in places

play03:16

that you might not others with a SQL

play03:18

database you need to make sure that you

play03:20

batch all of your requests together or

play03:23

you run risks of that N+1 problem

play03:26

with Redis you can fire off a thousand

play03:29

requests and the server will happily

play03:31

return its results to you it it really

play03:34

changes how you think about operating

play03:36

with a database the last piece is that

play03:39

reddis is a data structure server what

play03:41

does that mean well the core of redus is

play03:44

this kind of key value dictionary but Rd

play03:47

values don't necessarily need to be

play03:49

strings or numbers or binary blobs they

play03:52

can also be sorted sets or hashes or

play03:56

geospatial indexes or Bloom filters the

play04:00

way that Redis operates is trying to

play04:02

mirror what you might think of from a

play04:04

data structures and algorithms course

play04:06

that you learn about a bunch of really

play04:08

primitive data structures and Redis

play04:10

just gives you a way to use them in a

play04:13

distributed fashion which can be super

play04:15

convenient because it means you can take

play04:17

all of the knowledge that you've

play04:18

accumulated in building up data

play04:20

structures and algorithms expertise and

play04:23

lift it into a system design context

play04:26

which is quite nice now how does Redis

play04:29

actually operate it's really simple you

play04:31

can brew install Redis on your laptop

play04:34

and actually get a CLI that you can use

play04:38

and the wire protocol is very basic it

play04:40

accepts commands like what you see here

play04:41

so I can call set and I set it I call

play04:43

get and I get it I call incor and it

play04:46

increments it

play04:47

atomically each of these operates on a

play04:50

key remember that the core of redus is

play04:53

this dictionary of keys to these values

play04:56

and the commands are basic basically

play05:00

ways to manipulate the various data

play05:02

structures if you look in the Rus

play05:03

documentation they are organized by the

play05:06

type of the data so it wouldn't make

play05:07

sense for instance to call incor on a

play05:10

hash but there's another command that

play05:13

will allow you to increment the values

play05:15

of a hash now these commands can get

play05:18

pretty sophisticated so here's an

play05:19

example of a way to add an item to a

play05:22

stream that's not really important just

play05:24

yet the important thing is that you have

play05:26

a command type in this case the ad

play05:29

operates on a stream that's what the X

play05:31

prefix is and then there's the key

play05:33

that's where it's operating now why do

play05:36

the keys matter well the keys are really

play05:39

how Redis handles uh the multi-node

play05:43

environments so you can just run Redis

play05:46

on a single node it runs in a single

play05:48

thread and it can basically write the

play05:51

commands that successfully execute out

play05:53

to dis you can configure the interval

play05:56

the default I think is every 1 second

play05:58

which basically means redis can lose

play06:00

some data but in the ideal scenario what

play06:03

will happen if the process fails is

play06:06

Redis will go down and when it comes

play06:08

back up it will read from that file and

play06:10

recover somewhat gracefully in practice

play06:12

this is horrible so for most people what

play06:15

they're going to do is actually have a

play06:17

replica and the way that this works is

play06:19

you have some main or master and then

play06:22

you have a secondary that's reading from

play06:24

that append only file or basically from

play06:27

the log this works much like change data

play06:30

capture when a command runs successfully

play06:32

here it gets it gets run on the

play06:34

secondary there is some special behavior

play06:37

in Redis where if there's been 5 minutes

play06:39

or an hour and I haven't gotten an

play06:41

update then the secondary will basically

play06:44

fully rebuild from the contents of uh

play06:48

the master or main but in most instances

play06:51

they're going to and ideally they're

play06:53

going to be caught up to one another but

play06:55

that's not very interesting that really

play06:57

constrains us to the throughput of of

play06:59

this single instance we can I guess

play07:01

technically scale our read throughput

play07:04

indefinitely if we want to put a bunch

play07:06

of replicas but how do we scale out

play07:07

writes and this is where the keyspace

play07:11

starts to come into the picture so Redis

play07:13

has an internal concept they call a slot

play07:16

which is basically a hash of the key I

play07:19

think it's usually a CRC modulo some

play07:23

number 16,384 I think and that is the

play07:27

slot that that key occupies

play07:30

theoretically when the cluster isn't

play07:32

resizing a single master or main will

play07:36

own that slot and clients should be

play07:40

aware of all of the nodes in the cluster

play07:43

so if I have a request for Foo I'm going

play07:47

to take the hash of Foo the key I'm

play07:50

going to look up the slot that it

play07:52

occupies and then decide which node in

play07:54

the cluster I need to route my request

play07:57

to each of the nodes in the cluster

play07:59

communicates via a gossip protocol so

play08:02

they all know about the existence of

play08:04

each other as well as which keys they

play08:07

have which slots they have and if you

play08:09

make a request to the wrong host it will

play08:11

tell you that key doesn't exist here

play08:13

it's been moved but for performance sake

play08:17

it's way more beneficial if the client

play08:19

knows exactly which host to go to and so

play08:22

that's why when you start up a client in

play08:24

Redis you make it aware of all of the

play08:26

hosts that are available you theoretically

play08:29

aren't making requests that bounce

play08:31

across uh many different nodes and so

play08:34

this is where the keyspace becomes

play08:36

incredibly important the only way to

play08:38

shard Redis is through choosing your

play08:42

keys and then when you are choosing how

play08:44

to Shard you're functionally choosing

play08:47

how to spread your data across the

play08:48

keyspace so let's take a canonical

play08:51

example for a moment if you have a cache

play08:55

one of the major difficulties that you

play08:57

usually have is what's called a hot key

play08:59

problem where many of your requests are

play09:02

going to the same key well how does this

play09:05

break Redis if one of your keys is

play09:08

located on this node and the aggregate

play09:12

traffic to this node exceeds what this

play09:14

node can handle then it doesn't matter

play09:17

that these other hosts are out there the

play09:20

uneven distribution of traffic is going

play09:22

to kill you so what can you do well with

play09:25

redis one of

play09:27

the very common pattern is to Simply

play09:30

append a random number to the key so

play09:33

that way you write the key to multiple

play09:35

hosts and you can read it from multiple

play09:37

hosts so this provides a really crude

play09:41

way to distribute the load across the

play09:43

cluster but if you're thinking about how

play09:45

you scale Redis you should be thinking

play09:47

about your key space this is really

play09:50

essential to how you reason about how uh

play09:53

Redis

play09:54

scales okay so let's walk through some

play09:57

common instances of using Redis

play09:59

in actual designs and talk about the

play10:01

types of things that might come up in

play10:03

your deep Dives and conversations with

play10:05

your interviewer so the first use of

play10:07

Redis which is the most common is to use

play10:10

it like a cache and the usage here is

play10:12

pretty simple but let me just explain it

play10:14

quickly you have a database where you

play10:16

need to make heavy queries maybe they're

play10:18

analytic queries maybe this is a deep

play10:20

search index whatever and you want to be

play10:23

able to scale this so what do you do

play10:25

well we're going to create a redis

play10:28

cachee off to the side and our service

play10:30

is first going to check the cache it's

play10:31

going to be very very fast see if

play10:33

there's an entry in there for whatever

play10:35

query we're going to make if there isn't

play10:37

we're going to go to the database grab

play10:39

it and then store the result in the

play10:41

cache if there is we're just going to go

play10:43

and take the result this is appropriate

play10:45

in basically any case where you can

play10:48

tolerate some staleness or inconsistency

play10:51

this is great when you have an

play10:52

eventually consistent system and in a

play10:54

lot of cases that's appropriate there

play10:56

are some places where this obviously

play10:57

doesn't matter now you set up a cache

play10:59

like this you've really got two concerns

play11:01

that you need to address the first is

play11:03

the hotkey issue which I just talked

play11:05

about earlier and so we need to make

play11:07

sure that our cash is spreading out the

play11:09

load amongst all of the instances as

play11:12

much as possible the way we do this in

play11:14

Redis is by assigning keys and so we

play11:17

might append values to our keys such

play11:21

that we are evenly Distributing the

play11:24

requests across all of our Redis cache

play11:28

the next thing that you need to be

play11:29

concerned with is expiration

play11:31

and a common question in interviews is

play11:33

what's the expiration policy of your

play11:35

cache there's a bunch of different

play11:37

options here and Redis fundamentally

play11:39

supports them all the most common way to

play11:41

use Redis is to use the EXPIRE command

play11:45

or to basically attach arguments to your

play11:47

sets and gets and what that does is

play11:50

after a certain time that item won't be

play11:53

readable and so basically you can say

play11:56

expire after you know 60 seconds if

play11:59

that's your cache time to live and then

play12:01

you can guarantee that anyone reading

play12:03

out of your cache won't read that stale

play12:05

data another way to configure Redis is

play12:08

in its uh least recently used setup and

play12:11

this works a little bit differently the

play12:13

way that the least recently used setup

play12:16

Works inside redis is basically you'll

play12:19

continue to be able to append Keys into

play12:21

your cache and functionally indefinitely

play12:24

until you run out of memory and then at

play12:27

that point redis will start to evict

play12:30

those least recently used Keys uh from

play12:33

your setup works very much like memcache

play12:36

and in many cases there a drop in

play12:38

replacement but a reasonable

play12:41

choice another use of Redis which is

play12:44

quite simple and quite popular is to use

play12:47

it as a rate limiter the way this works

play12:50

is we want to guard this expensive

play12:52

service from getting lots of requests

play12:55

maybe our Downstream has decided that

play12:58

they can only accept five requests every

play13:01

60 seconds so what we need to do is make

play13:04

sure that if we have multiple instances

play13:06

of this service that they aren't all

play13:09

making in aggregate more than five

play13:11

requests over 60 seconds so how do we do

play13:13

this well I talked about the atomic

play13:16

increment command earlier on in the

play13:19

session and what it does is it

play13:20

increments a key if it exists if it

play13:22

doesn't exists it'll set it to one and

play13:25

it will return the final value think of

play13:27

it like plus plus variable if you're

play13:29

familiar with C++ or or other languages

play13:32

that support that syntax the idea is if

play13:36

this value is over the limit in my case

play13:39

five then I don't want to make that

play13:41

request I need to wait on the other hand

play13:44

if it's under that that means that I've

play13:46

got space and I can proceed with the

play13:48

request and then the next thing I'm

play13:50

going to do if I actually get an

play13:52

opportunity to make that request is I'm

play13:53

going to make sure that this key gets

play13:55

expired so I'm going to say that I need

play13:58

to expire this key

play13:59

after 60 seconds and what this is going

play14:01

to do in practice is it's going to allow

play14:05

up to end requests to proceed through

play14:08

and then after however many seconds in

play14:10

this case I've set it to a minute has

play14:13

finished then that key gets

play14:14

automatically zeroed out and subsequent

play14:17

Services can make a request again now

play14:20

there is some mechanics that you'll have

play14:21

to insert here the service basically

play14:23

needs to make redundant calls um after

play14:27

it gets rate limited because needs to

play14:29

check in again when the rate limit has

play14:32

expired and this also doesn't behave

play14:34

particularly well when the rate limits

play14:37

are under extreme stress so if I had

play14:40

maybe 60 requests that I needed to make

play14:42

it means that all of my services are

play14:44

going to be hitting redit at the same

play14:46

time and I don't have any specific

play14:48

ordering that is going to be enforced so

play14:50

I might starve one service so there's a

play14:52

lot to talk about here in a system

play14:55

design interview between how you assert

play14:58

fairness of your services how

play15:01

you set the limits what's most

play15:03

appropriate and this is the most basic

play15:06

implementation of a rate limiter there's

play15:08

a lot of other structures that you could

play15:10

potentially use that include windows and

play15:12

give clients some idea about when they

play15:14

might be next in line so on and so forth

play15:17

so design your rate limiter as you see

play15:19

fit but keep in mind there's a lot of

play15:21

logistics that can go into this

play15:22

depending upon the needs of your system

play15:24

sometimes something simple is great so

play15:27

far we've talked mostly about use

play15:29

cases of Redis that rely on its key

play15:31

value nature maybe with some expiration

play15:33

mechanics but the value of Redis and

play15:36

its data structure server is primarily

play15:39

that these data structures can become

play15:41

increasingly powerful and I want to talk

play15:43

about a few instances of that the most

play15:46

powerful and honestly the most

play15:47

complicated primitive that I think Redis

play15:49

offers is its stream and the idea behind

play15:52

a stream is pretty simple there was

play15:53

actually a Kafka paper that was written

play15:55

I think by LinkedIn many years ago that

play15:58

talks about the

play15:59

of distributed append only logs in

play16:02

system design so you can imagine Redis

play16:05

streams as basically ordered lists of

play16:08

items they're given an ID which is

play16:09

usually a timestamp as they're inserted

play16:12

and each of these items can have their

play16:13

own keys and values think of them like

play16:15

Json objects or

play16:17

hashes and the nice thing about this

play16:19

stream is if we've got something that we

play16:22

need to make sure all of the items of

play16:25

the stream are processed then Redis gives

play16:28

us a bunch of of Primitives to work with

play16:30

this so let's talk about if I needed to

play16:33

build an async job queue so what I want

play16:36

to be able to do is insert items onto

play16:39

that queue and have them processed in

play16:41

order and reliably I want to make sure

play16:44

that if an item is uh inserted onto the

play16:47

queue that it will eventually get

play16:49

processed what I'm going to do for

play16:51

storing these items is I'm going to

play16:53

create a stream I'll put each of those

play16:55

items on the stream as they're created

play16:57

and then next I'm going to create this

play16:59

thing called a consumer group you can

play17:01

think of a consumer group like a pointer

play17:03

into the stream and that pointer

play17:06

basically defines where we're at and so

play17:09

a consumer group will keep track of

play17:11

where in the Stream it has to keep

play17:14

processing the workers can each query

play17:17

the consumer group for unallocated items

play17:20

so if the consumer group happens to be

play17:22

pointing at item two and no workers have

play17:25

picked that one up then that item is

play17:27

going to be allocated to that worker

play17:30

there's a final notion within streams

play17:33

and consumer groups around what Redis

play17:36

calls claiming and the idea is at any

play17:39

given moment only one of these workers

play17:41

can have claimed an item on the consumer

play17:45

group and if for instance the worker

play17:48

fails which it's apt to do then that

play17:52

item can be reclaimed by the consumer

play17:55

group and allocated out to an additional

play17:57

worker so the idea behind a Redis

play18:00

stream is it gives you a way to

play18:01

distribute work amongst many workers in

play18:04

a way that's fault tolerant partially

play18:07

because you you have all that normal

play18:08

caveats about

play18:10

redis and that is very um very fast and

play18:14

so you don't have to have a bunch of

play18:16

latency that you insert into the process

play18:19

now what does this mean in the system

play18:21

design setting well there are a bunch of

play18:24

considerations that you should figure

play18:25

out before you go and implement this the

play18:28

first is is you need to be able to

play18:29

handle failures of Redis and there are

play18:33

options like using a fork of Redis like

play18:35

memory DB that will give you more

play18:37

reliability you also can build some

play18:39

redundancy in by having replications or

play18:43

additional shards basically by

play18:45

additional Keys you'll also want to

play18:48

figure out how you can keep these

play18:50

workers allocated the right work the

play18:53

typical way this is accomplished is that

play18:55

the worker while it's processing an item

play18:58

will continue to heartbeat back to the

play19:00

consumer group letting it know hey I'm

play19:02

still working on this and that way the

play19:05

consumer group isn't snatching back the

play19:08

work item uh before it's had a chance to

play19:10

finish it but there are definitely

play19:12

scenarios in distributed system where

play19:14

this doesn't work as an example if

play19:16

worker 2 loses network connectivity to

play19:19

the consumer group but maybe still has

play19:21

it to a database or a downstream system

play19:24

it might continue to process that item

play19:26

while the consumer group reclaims it and

play19:29

hands it off to another worker so the

play19:32

behavior here is that your consumer

play19:34

group is going to offer at least once

play19:36

guarantees but it's not going to

play19:38

guarantee its process exactly once and

play19:41

in some cases that may be exactly what

play19:43

you need let's talk about two more data

play19:45

structures that are really helpful in

play19:48

system design

play19:49

contexts for this application let's

play19:52

pretend we want to keep a leaderboard of

play19:54

the top five most liked tweets that

play19:57

contain the word tiger

play19:59

if you look at our tweet search answer

play20:02

key you'll see a bit about where this

play20:03

might come up so the sorted set commands

play20:07

all start with z and their syntax is

play20:11

quite simple I give a key the ranking

play20:14

value in this case the number of likes

play20:17

and then a string identifier in this

play20:19

case I'm going to have the Tweet ID

play20:21

we'll call it Su ID 1 this first tweet

play20:23

let's assume it's a tweet about Tiger

play20:25

Woods in the next tweet we're going to

play20:28

do the same thing but for some ID 2 this

play20:30

might be a tweet about Zoo tigers and

play20:33

we're going to add the ID 1 remember I

play20:37

called these sorted sets which means for

play20:39

any given ID it can only have one

play20:42

ranking value so if for instance the

play20:45

Tiger Woods tweet got an additional like

play20:47

I would run the same command with 501 to

play20:50

update it now I can run this command Z

play20:54

range by rank to remove items that are

play20:59

in a specific rank or range of ranks and

play21:03

in this case what I'm doing is I'm

play21:05

removing all but the top five so what I

play21:09

effectively managed here is a heap I've

play21:12

got the top five most like tweets and

play21:15

every time I add a new tweet I can

play21:18

remove the ones that are not in there

play21:20

and I will only actually get an

play21:22

incremental uh example in this list when

play21:26

one of those new tweets Rises to a

play21:28

number of likes that is greater than my

play21:31

top five now it's important to remember

play21:34

that these run in typical sorted list

play21:36

times so as an example when I'm adding

play21:40

things to the list I basically am eating

play21:42

a log M where m is the number of items

play21:46

in the list uh complexity and so being

play21:49

able to run this command keeps my list

play21:52

small and manageable I wouldn't want to

play21:54

maintain an indefinitely growing list of

play21:57

the tweets that contain the word tiger

play22:00

because then every incremental addition

play22:03

needs to be inserted into that sorted

play22:05

list and while it's logarithmic it still

play22:07

is growing as the number of examples

play22:10

grows now this sounds like a pretty good

play22:12

idea but remember I'm doing this in a

play22:15

single key so in in so far as I need to

play22:19

manage uh the top like tweets by

play22:22

specific keywords I can probably

play22:25

distribute this across a cluster by

play22:27

having unique

play22:29

keys but if I was using this to keep

play22:31

track of all of the tweets and try to

play22:34

find the most like tweets across my

play22:36

entire uh setup then I might have a

play22:39

problem because they're all going to sit

play22:41

on a single node if I want them to sit

play22:44

on multiple nodes I'm going to need to

play22:46

keep multiple sorted sets and then

play22:49

combine them at the last minute so what

play22:51

I would basically do is take some hash

play22:54

of the Tweet ID and then I would keep a

play22:57

sorted set

play22:59

of the items that have that hash and

play23:02

then when I wanted to query this I would

play23:04

query all of those sorted sets across my

play23:07

cluster I would need to issue however

play23:10

many queries for the number of shards

play23:12

maybe 16 queries uh in practice this is

play23:15

usually not a big deal especially

play23:17

because Redis happens to be so fast but

play23:20

it certainly does have limitations on on

play23:22

scaling and and how you build your

play23:24

systems once we have the sorted set

play23:27

primitive there's a lot of things that

play23:28

we can build on top and Redis goes out

play23:31

of their way to implement what I think

play23:32

is one of the most useful uh geospatial

play23:36

index the use cases for this vary but

play23:40

essentially if you have a big list of

play23:42

items that have locations and you

play23:44

wouldn't be able to search them by

play23:46

location this is a great way to do it

play23:49

and so the API looks pretty basic when I

play23:53

want to add an item to my geospatial

play23:55

index I'm going to pass in a longitude

play23:58

and in latitude and an identifier for my

play24:01

item and that's going to add it to the

play24:03

index at this key when I want to search

play24:06

that index I run the Geo Search Command

play24:09

give it the key I'm going to give it a

play24:12

an anchor point and then a radius and I

play24:15

can optionally return the distances that

play24:18

are associated with it and this works

play24:20

exactly like you'd expect if I have a

play24:23

bike rental service and I want to find

play24:24

all the nearby stations I add the

play24:27

stations to my geospatial index and

play24:29

query the way that this works under the

play24:31

covers is that each of these longitudes

play24:34

and latitudes are geohashed to give them

play24:37

basically a numeric identifier that

play24:39

numeric identifier is the ranking in the

play24:42

sorted set and then Redis under the

play24:44

covers is calculating the bounding boxes

play24:47

given your radius and then finding all

play24:50

of those entries that are basically

play24:52

within that range in your sorted set Redis

play24:57

does also in this case ensure that

play24:59

you're not returning items outside that

play25:01

radius so you theoretically wouldn't see

play25:03

an entry with a radius of six but the

play25:06

important thing here probably isn't the

play25:08

internals the important thing is that

play25:10

this API is super convenient and works

play25:13

in a wide variety of different

play25:14

situations now from a system design

play25:17

perspective this isn't without its

play25:19

perils there's a number of cases where

play25:22

you might not want to do this the first

play25:24

is if your items aren't changing

play25:27

location very very often if you're not

play25:29

updating your data it may actually be

play25:32

better to just keep a static list of

play25:35

longitudes and latitudes in the service

play25:38

that's actually making these queries and

play25:41

then just calculate the haversine

play25:43

distance for all of them if for instance

play25:45

I have a thousand stores across the globe

play25:48

that's actually not that much to do some

play25:51

simple uh floating Point arithmetic to

play25:54

go and find the closest place it's

play25:56

certainly faster than making a network

play25:57

call out to Redis the other instance where

play26:00

this can have some trouble is that

play26:02

remember that the index is on a single

play26:05

key which means it's on a single node if

play26:08

I need to go and separate this out and I

play26:10

need to go and Shard this then I've got

play26:12

to decide on some way to do that there's

play26:15

a lot of natural ways to do this I can

play26:17

calculate the geohash on my side I can

play26:19

take some of the most significant bits

play26:21

and use those as part of the key I can

play26:23

break this out by continent where

play26:25

frequently I'm not going to need a query

play26:27

across continents or or if I'm near the

play26:29

border maybe I'll query to maybe uh

play26:32

North America and South America but

play26:34

generally speaking I need to be

play26:36

concerned with how this scales if I've

play26:38

got say billions of examples the last

play26:41

capability of Redis that I'm going to

play26:42

talk about today is pubsub and pubsub is

play26:45

really trying to solve for this unique

play26:48

instance where your servers need to be

play26:50

able to address one another in a

play26:52

reasonable fashion canonical example of

play26:55

this would be a chat room where uh

play26:58

user one is connected to server a maybe

play27:00

via a websocket or some persistent TLS

play27:03

connection and they need to message user

play27:06

3 who unfortunately is connected to

play27:09

server C how does server a know user 3

play27:13

is on server C there's a bunch of

play27:15

different potential solutions to this

play27:17

problem probably the most famous is to

play27:19

use a consistent hash ring so that way

play27:21

user 3 is always allocated to server C

play27:24

and server a knows that and so can send

play27:27

messages directly there but there's a

play27:29

bunch of incremental problems that

play27:31

happen with these consistent hash Rings

play27:33

notably it's harder to to manage the

play27:35

balance between servers and scaling them

play27:38

up and down requires a bunch of very

play27:40

careful orchestration usually with a

play27:42

service something like zookeeper but

play27:44

Redis has this uh capability called

play27:47

pubsub and the idea with pubsub is that

play27:49

the servers can connect to redus and

play27:52

announce a a publication uh that they're

play27:56

going to be making and then on that

play27:59

topic other servers can subscribe so if

play28:03

for instance user one connects to server

play28:07

a server a is going to tell Redis that

play28:11

I have user one any messages sent to the

play28:13

topic for user one should come back to

play28:16

me and server C would do the same thing

play28:20

now when server a wants to send a

play28:22

message to server C so that user 3 gets

play28:26

it what it's it's going to do is publish

play28:29

to the topic of user 3 redis pubsub is

play28:34

at most once delivery so the idea here

play28:39

is that the message might get to a user

play28:41

and it might not which is surprisingly

play28:45

useful in spite of its reliability

play28:49

issues what it typically means from a

play28:51

system design perspective is if you need

play28:53

to guarantee that messages eventually

play28:55

arrive you're going to have to try

play28:57

something else

play28:59

but Redis pubsub is going to be very very

play29:01

fast it's operating on a single box and

play29:03

all it's doing is bouncing these

play29:05

requests between different services and

play29:08

so it can scale quite well and what this

play29:11

allows us to do is if we wanted to we

play29:13

could swap user one and user 3 they

play29:15

could connect to different hosts and

play29:18

redis pubsub would be the registry that

play29:20

knew which host each user was connected

play29:23

to this is something that's a little bit

play29:25

harder to pull off with for instance a

play29:28

consistent hash ring and it also means

play29:31

if server a were to go down we can

play29:34

migrate user 3 to server B quite quickly

play29:38

because server B will now register its

play29:41

publication to that topic and for a

play29:43

while those messages will go both to

play29:45

server a and server B but they'll

play29:48

eventually get their way to user 3 like

play29:52

I said there's a bunch of considerations

play29:54

with this I think the WhatsApp key that

play29:56

we have on the site talks about some of

play29:58

them in depth but be happy to answer

play30:01

questions in the comments uh if you've

play30:04

got them overall thank you so much for

play30:07

listening in on this deep dive on Redis

play30:10

Redis is useful across the board in a

play30:12

number of different instances from

play30:14

geospatial indexes leaderboards

play30:16

distributed locks and so it probably has

play30:19

a place in your system designs and I

play30:22

hope this presentation this video has

play30:24

really shown you that the internals of

play30:27

Redis aren't that complicated and if you

play30:30

understand them then you can reason

play30:31

through some of the issues that you

play30:33

might expect in

play30:35

production so that's a wrap on our

play30:37

Redis deep dive would love to get any

play30:39

feedback that you all have in the

play30:41

comments any questions or followups if

play30:44

you'd like to read more our Redis deep

play30:46

dive write up is available in the

play30:49

description below and we're looking

play30:51

forward to making more of these for you

play30:53

all if you find them interesting so let

play30:54

us know what you'd like to see next

play30:56

thanks I'm Stefan have a great day
