Google SWE teaches systems design | EP20: Coordination Services
Summary
TLDR: In this video, the presenter dives into the concept of coordination services in distributed systems, emphasizing their role in maintaining shared state and cluster configuration. Key examples like ZooKeeper and etcd are mentioned, highlighting their use of consensus algorithms for high availability and eventual consistency. The video also explores methods to achieve strong consistency, such as reading from the leader, using the sync command, and the emerging concept of quorum reads. The presenter wraps up by stressing the importance of coordination services in modern data storage systems and the trade-offs between consistency and performance.
Takeaways
- 🏋️ The video is about systems design, focusing on coordination services in distributed systems.
- 📚 The speaker plans to discuss HBase, a type of database, but first needs to cover coordination services and related technologies.
- 🌐 Coordination services are essential for maintaining shared state or information in a cluster of nodes in a distributed system.
- 🔑 They include details like IP addresses, node partitions, and the status of nodes (alive or down), as well as distributed locks.
- 🤹 Examples of coordination services mentioned are ZooKeeper and etcd, which provide centralized systems for metadata about the cluster.
- 📈 These services are designed for read-heavy workloads, with a focus on eventual consistency rather than strong consistency for efficiency.
- 🔒 Coordination services ensure data is highly available through replication and consensus algorithms like Paxos, Raft, or Zab.
- 🔄 Monotonic reads are supported to prevent time from appearing to move backward, ensuring reads progress forward.
- 👀 Watches can be attached to keys or files to receive notifications if the data changes, similar to serializable snapshot isolation in databases.
- 🔑 Three methods for achieving strong consistency in coordination services are discussed: reading from the leader, using the 'sync' operation, and exploring quorum reads.
- 🚀 The importance of coordination services in modern data storage systems is highlighted, noting their role in maintaining system integrity and performance.
Q & A
What is the main purpose of a coordination service in a distributed system?
-The main purpose of a coordination service in a distributed system is to maintain shared state or information about the configuration of the cluster, including IP addresses of nodes, partitions, and the status of nodes, as well as facilitating distributed locks and consensus for certain operations.
Can you name two examples of coordination services mentioned in the script?
-Two examples of coordination services mentioned in the script are ZooKeeper and etcd.
What does the term 'highly available' imply for coordination services?
-For coordination services, 'highly available' means that they are replicated key-value stores built on a consensus layer, ensuring that the system remains operational and accessible even if some of the nodes fail.
Why are coordination services generally designed for read-heavy workloads?
-Coordination services are designed for read-heavy workloads because writes require consensus among nodes, which can be slow, whereas reads can be scaled linearly with the number of nodes, making them more efficient for such workloads.
What is the significance of monotonic reads in coordination services?
-Monotonic reads ensure that when a client reads from one replica and then from another, the second read is not more outdated than the first, thus preventing the illusion of time moving backward and maintaining a consistent view of the system state.
What is a 'watch' in the context of coordination services?
-A 'watch' in coordination services is a mechanism that allows a client to attach to a key or file, receiving notifications if the watched item changes before the client's transaction is complete, enabling the client to retry operations or make informed decisions.
How does the sync operation help achieve stronger consistency in coordination services?
-The sync operation writes a command into the replicated log, allowing clients to read only from replicas that have the sync included in their log, ensuring that all subsequent reads are from a point in time after the sync was committed.
What is the potential issue with using quorum reads to achieve strong consistency?
-Quorum reads can face a race condition where the leader has committed a value locally but other replicas have not yet received the commit message, leading to a situation where a quorum read might return an outdated value.
Why might reading from the leader be problematic for coordination services?
-Reading from the leader can be problematic because the leader is already handling all write operations and communicating with other nodes, so additional read requests could overload the leader and slow down the system.
What is the trade-off when choosing to achieve strong consistency in coordination services?
-Achieving strong consistency in coordination services comes at the cost of reduced performance, as it requires more communication and coordination among nodes, which can slow down the system.
How do coordination services differ from gossip protocols in terms of data storage and propagation?
-Coordination services store data in a centralized, replicated location and use a consensus algorithm for data updates, ensuring atomicity and total ordering. In contrast, gossip protocols involve nodes randomly passing information to each other until the information is propagated throughout the system.
Outlines
😀 Introduction to Coordination Services in Distributed Systems
The speaker begins by greeting new subscribers and briefly explains the purpose of the video, which is to discuss coordination services in distributed systems. They mention their plan to cover HBase, a type of database, but first, they need to explain the prerequisite technologies. The speaker then defines a coordination service as a centralized system that maintains shared state and metadata about a cluster, including node IP addresses, partitions, and distributed locks. Examples of coordination services are ZooKeeper and etcd. The video aims to provide a general understanding of how these services work, rather than focusing on specific details of any one service.
🔒 The Role of Coordination Services in Ensuring High Availability
This paragraph delves into the specifics of coordination services, emphasizing their role in maintaining high availability through replication and consensus mechanisms. The speaker explains that these services are key-value stores built on top of a consensus layer, which ensures a replicated log across all nodes. They discuss the trade-off between write performance and consistency, noting that coordination services are optimized for read-heavy workloads and may not provide strong consistency for reads. The concept of monotonic reads is introduced to prevent time-reversal issues in data reading, and the use of watches to ensure predicate validity is also highlighted, which allows clients to be notified of changes to data they are interested in.
🔑 Achieving Strong Consistency in Coordination Services
The speaker explores the methods to achieve strong consistency in coordination services, which is not inherently provided due to performance considerations. Three approaches are discussed: reading from the leader, which is reliable but can overload the leader; using the 'sync' operation, which records a position in the log to ensure all subsequent reads are from up-to-date replicas; and quorum reads, which involve reading from a majority of nodes to get the most current value. The paragraph also addresses the challenges and potential race conditions associated with quorum reads, and how they might be improved with further research. The importance of coordination services in modern data storage systems is reiterated, and the video concludes with a reminder of the trade-offs between consistency and performance.
Keywords
💡Coordination Service
💡Distributed System
💡ZooKeeper
💡Consensus Algorithm
💡High Availability
💡Read Heavy Workloads
💡Eventual Consistency
💡Monotonic Reads
💡Predicate Validity
💡Strong Consistency
💡Quorum Reads
Highlights
Introduction to the concept of coordination services in distributed systems.
Explanation of the need for shared state or information in a cluster of nodes.
Mention of technologies like ZooKeeper and etcd as examples of coordination services.
Discussion on the importance of coordination services for read-heavy workloads.
Clarification on the difference between strong consistency and eventual consistency in coordination services.
Introduction of the term 'monotonic reads' and its significance in coordination services.
Explanation of 'predicate validity' and how coordination services ensure it.
Description of how to attach a 'watch' to a key in coordination services for change notifications.
Three methods to achieve strong consistency in coordination services: reading from the leader, using sync, and quorum reads.
Analysis of the limitations and potential issues with reading from the leader for strong consistency.
Explanation of the 'sync' operation and its role in ensuring strong consistency.
Introduction to the concept of quorum reads and their potential for improving read performance.
Discussion on the challenges and race conditions associated with quorum reads.
Conclusion on the importance of coordination services in modern data storage systems.
Highlight of the trade-off between strong consistency and performance in coordination services.
Emphasis on the role of coordination services in various database technologies and their practical applications.
Transcripts
hey everyone uh wow welcome to all the
new subscribers yet again uh hope you
guys are looking forward to learning
some more crap about systems design uh
today i'm gonna kind of rush this video
through since uh i'm in a little bit of
a rush to get some stuff done but i just
got back from the gym feeling kind of
smelly so it's a perfect time in my
perfect computer science mindset and
mentality to make another video um i
know i've kind of been working up to
talking about more technologies and
stuff like that i think what i'm going
to do is probably hbase next which is
another type of database but in order to
talk about that i have to talk about one
or two different types of technology and
then we'll probably go and get to that
hbase video so enjoy that let's get into
this
okay so a coordination service what is
it
basically in any type of distributed
system that has a cluster of nodes
there's some need to have some sort of
shared state or information about the
configuration of the cluster
that includes things like the ip
addresses of nodes in this cluster which
partitions are on each node the nodes
that are actually still members so which
are alive and which might be down things
like distributed locks that require some
consensus or coordination for one node
to be grabbing them and others can't
so you know obviously there's this kind
of need for a centralized system where
you're keeping information or like
metadata about the cluster so to solve
this problem um things called
coordination services have come up and
if you've ever heard of things like
zookeeper or etcd these are examples of
them and so basically i'm going to talk
about those this video i'm not going to
do like super specific like zookeeper
specific or etcd specific details i might
do that in the future but for now i'm
just going to try and give a sense of
how coordination services in general
work and then we'll
i guess kind of see in the future how
those work inside of other systems
generally speaking by the way these are
meant for read heavy workloads so we'll
see why that's relevant in a bit
okay so what is a coordination service
well i kind of touched upon this in my
raft video but the point is they're just
highly available and by highly available
all that really means is that they're
replicated key value stores or in the
case of zookeeper it's kind of
like almost like a file system but files
can have child files
built on top of some sort of consensus
layer which really just means that they
share a replicated log
by virtue of having to use a consensus
mechanism like i don't know paxos raft
or zookeeper uses something called zab
obviously writes are going to be pretty
slow because they all have to go through
the leader and they all have to touch at
least a quorum of nodes in order to
actually successfully write
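The write path described above can be sketched roughly as follows. This is a toy model, not how ZooKeeper or etcd implement it: the `Replica` class, its `up` flag, and `leader_write` are hypothetical names, and real protocols also deal with retries, terms, and cleaning up entries that never reach a quorum.

```python
class Replica:
    """Toy replica holding an in-memory copy of the replicated log (hypothetical)."""

    def __init__(self, up=True):
        self.log = []   # list of (key, value) entries
        self.up = up    # whether this replica currently responds

    def append(self, entry):
        # A replica that is down never acknowledges the entry.
        if not self.up:
            return False
        self.log.append(entry)
        return True


def leader_write(leader, followers, key, value):
    """Send a write through the leader; succeed only if a quorum acknowledges.

    Returns the 0-based log position of the write, or None without a quorum.
    """
    cluster_size = 1 + len(followers)
    quorum = cluster_size // 2 + 1          # strict majority
    entry = (key, value)
    acks = 1 if leader.append(entry) else 0
    acks += sum(1 for f in followers if f.append(entry))
    if acks >= quorum:
        return len(leader.log) - 1
    return None
```

With three nodes the quorum is two, so the write still succeeds with one follower down, which is where the fault tolerance comes from. Every write costs a round of messages from the leader, which is why writes are the slow path.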
in terms of reads and coordination
services well you might think that hey
oh the fact that we're using a consensus
algorithm might mean that we're going to
have strong consistency actually that's
not the case because like i said these
are for read heavy workloads like
probably ten to one is what it says in
the zookeeper documentation so what that
means is that if you're reading ten to
one generally strong consistency isn't
going to be efficient enough you
probably need to deal with eventual
consistency and then you know if you
need to go and make that even more
consistent such that you get strong
consistency zookeeper will allow you to
do something like that but generally
speaking
what do coordination services actually
tell you about their reads because you
can read from any replica and there's no
guarantee that the data is going to be
up to date
well first of all there are monotonic reads monotonic reads is a term that i
introduced probably like 15 videos ago
but if you recall a violation of
monotonic reads is
basically when a client makes a read
from one replica then reads from another
replica and the second one is more
outdated than the first and as a result
it looks like time is going backwards so
obviously we can't be having any of that
we want a service to see reads actually
going forwards in time and so you can
avoid this by using the replicated log
basically the fact that each server has
a replicated log where the ids are the
same for each slot in the log means that
if a client makes a read or a write to
or from some replica it's going to get
the last id that it's seen so you know
it knows the length of the log on that
replica and then from then on it's only
going to accept reads as valid if the
log kind of supporting that read if the
log on the replica where it gets the
read from is more up to date than that
last id that it saw
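The last-seen-id idea can be sketched like this. It is a minimal illustration under assumed names (`Replica`, `Client`, `StaleReplicaError` are all hypothetical), and it treats a replica as acceptable when its log is at least as far along as the last id the client has seen.

```python
class StaleReplicaError(Exception):
    """Raised when a replica is behind what the client has already seen."""


class Replica:
    def __init__(self):
        self.data = {}
        self.last_id = 0   # id of the last log entry this replica has applied

    def apply(self, log_id, key, value):
        self.data[key] = value
        self.last_id = log_id


class Client:
    def __init__(self):
        self.last_seen_id = 0   # highest log id observed on any replica so far

    def read(self, replica, key):
        # Reject replicas whose log is behind what we have already seen,
        # so reads never appear to go backwards in time.
        if replica.last_id < self.last_seen_id:
            raise StaleReplicaError(
                f"replica at id {replica.last_id}, client has seen {self.last_seen_id}"
            )
        self.last_seen_id = replica.last_id
        return replica.data.get(key)
```

In this sketch the client simply raises on a stale replica. A real client would more likely retry against another replica or wait for the stale one to catch up.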
okay additionally
these coordination services can ensure
predicate validity
so basically what this means is uh
predicate's also another term i've used
in the past so if you want to recall a
predicate is basically just information
that you might read from a database
before making some other read or write
so you say you know for example i'm a
doctor and i want to go off my shift but
i can only do so if there's another
doctor currently in the room
i'm going to read to see if there are
other doctors in the room from my
database table and if there are i can
leave but what if that predicate's no
longer valid it's important to be able
to make sure that you know things are
actually still the case that you thought
they were before making some sort of
change to a system so in coordination
services in particular you can attach
something called a watch to any key that
you read or i guess any file or file
name and what that's going to do is the
replica is going to keep track of that
watch and say okay if this file has
actually been changed before that
transaction is finished up
the client is going to receive some sort
of notification probably through some
stream of events that the file was
changed and that way it can either retry
that like read write operation or it can
just keep that in mind and that's
actually kind of a similar approach to
serializable snapshot isolation which if
you remember is just like the database
will keep track of almost the
dependencies of every single read and
write and if one of those dependencies
becomes outdated it'll have to cancel
the write and do it again and that's like
an optimistic concurrency control type
of thing
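A watch can be pictured as a callback registered at read time. The sketch below is an in-memory toy, not a real client API: `WatchedStore` and the `doctors_on_call` key are made up for illustration, and the watch is modelled as one-shot (it fires once and is then removed), which is an assumption of this sketch.

```python
class WatchedStore:
    """Toy key-value store where a read can leave a watch on the key."""

    def __init__(self):
        self.data = {}
        self.watchers = {}   # key -> list of callbacks waiting for a change

    def read(self, key, watch=None):
        # Optionally register a callback to be told if this key changes later.
        if watch is not None:
            self.watchers.setdefault(key, []).append(watch)
        return self.data.get(key)

    def write(self, key, value):
        self.data[key] = value
        # Fire and clear every watch on this key (one-shot in this sketch).
        for callback in self.watchers.pop(key, []):
            callback(key, value)
```

In the doctor example, the client would read `doctors_on_call` with a watch, and if the callback fires before it finishes going off shift, it knows the predicate it relied on may no longer hold and can retry.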
okay so what if we do need strong
consistency because a lot of the times
you know you really don't want to be for
example
say we're talking about the ip addresses
of the nodes maybe it's unacceptable to
you know be sending write requests to
the wrong ip address and just getting no
response perhaps we want strong
consistency and to ensure that one
client can read another client's writes
pretty much instantly well how can we do
this there are three ways in these
coordination services and i will talk
about all three of them the first and
the most simple is just reading from the
leader so obviously that's always doable
but it's got a couple of issues with it
for starters the leader is already
pretty bogged down by the fact that all
the write operations are going through
it and it has to handle the fact that
it's
communicating with all of the other
nodes in the cluster remember that for
every single write you need to hit a
majority of nodes so the leader is
sending out tons of network data
and the fact that now you're going to be
hitting it with reads too is a lot to
handle it's really going to slow things
down
another thing is to use an operation
called sync which i'll touch upon in the
next slide and then this is kind of
more of an upcoming research area but
we'll talk about it a little bit are the
concept of quorum reads
okay so what is sync a client node will
basically go ahead and write the command
sync into the replicated log so sync is
not any sort of key value but it does
take a position in the replicated log
well what does that do as i said every
single time you write and the leader
responds back to you saying hey you just
had a successful write it'll also
respond with the id of what you just
wrote in the log so you know the
position of your write in the log so
once i have the position of that sync
any replica that i'm going to read from
recall because there's
only monotonic reads allowed
it means that
basically you're only going to be
getting reads from replicas that have
that sync included in their log so it's
basically saying
every single time you write a sync i
will now only be reading from this point
onwards
so uh there's that that's one way of
doing things
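The sync trick can be sketched as below. All names here (`Leader`, `Replica`, `catch_up`, the `"set"` and `"sync"` operation tags) are hypothetical, and replication is reduced to a replica copying the leader's log on demand. The point is only that the sync entry's log id becomes the minimum position a replica must have reached before the client accepts its read.

```python
class Leader:
    def __init__(self):
        self.log = []   # list of (op, key, value) entries

    def append(self, op, key=None, value=None):
        """Append an entry and return its 1-based id in the log."""
        self.log.append((op, key, value))
        return len(self.log)


class Replica:
    def __init__(self):
        self.applied = 0   # number of log entries applied so far
        self.data = {}

    def catch_up(self, leader):
        # Apply every entry not yet seen; "sync" entries carry no data.
        for op, key, value in leader.log[self.applied:]:
            if op == "set":
                self.data[key] = value
        self.applied = len(leader.log)

    def read(self, key, min_id):
        """Return (value, ok); ok is False if this replica is behind min_id."""
        if self.applied < min_id:
            return None, False
        return self.data.get(key), True
```

A client would call `leader.append("sync")`, keep the returned id, and pass it as `min_id` on later reads. Any replica that answers with `ok` has everything that was committed before the sync.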
and then the other way is quorum reads
so in both of these scenarios keep in
mind we're putting a ton of load on the
leader if you're reading from the leader
well obviously you're putting a load on
the leader and if you're using sync it
means you're doing another write which
means that you're putting even more load
on the leader because the leader has to
communicate with all these other
replicas but what if we just wanted to
be able to get up-to-date reads from the
replicas alone well this might actually
be possible so the first thing to recall
is that all of these replicated
consensus algorithms require a quorum of
nodes in order to accept and eventually
commit a write
so what that means is that if i am to
read from a quorum of nodes i should be
able to get an
up-to-date value basically for any
key however that's not necessarily true
so let's look at this following race
condition
imagine i have the three nodes right
here and as you can see the leader is
the first one and then i'm going to be
reading from the other two and the other
two
comprise a quorum because it's two out
of the three nodes so the way that these
consensus algorithms work basically
is let's look at the operation x equals
four
as you can see all of the replicas have
accepted x equals four so the leader
knows it can go ahead and commit x
equals four so what it first does is it
commits x equals four locally and then
it's going to go ahead and say okay guys
i'm now going to tell you to commit them
however this creates a race condition it
means there's a point of time where the
leader has committed x equals four but
the other two replicas have yet to
receive that network call and as a
result they have not committed x equals
four so now if i read from those other
two nodes which do technically comprise
a majority even though eventually they
will have x equals four at the moment it
still looks like x equals three is the
most up-to-date value even though if we
read the leader we would see that's not
the case the one way we can kind of
rectify this is by looking at those
write-ahead logs and saying oh i see
that there's another value here in the
write-ahead log and you know x equals
three is probably going to be
overwritten so we can retry our quorum
read but there's no guarantee that
retrying the quorum read is going to get
us the most up-to-date value simply by
virtue of the fact that hey maybe
the leader crashed after it committed
locally and wasn't able to send the
commits out to everyone else maybe
there's a network partition between them
so the leader can no longer reach these
other nodes so there's just a variety of
issues
in terms of uh you know quorum reads not
being exactly perfect but
um studies have shown or some research
has shown that they can take a
significant amount of load off the
leader and in certain situations greatly
increase read performance
so it is something to i guess consider
in the back of your head maybe research
here will improve a little bit and these
quorum reads will get even better
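The x equals four race can be reproduced in a few lines. This is a toy model with assumed names (`Node`, `quorum_read`): each node keeps a committed value per key tagged with a log id, plus a `pending` map standing in for entries that sit in its write-ahead log but have not been committed yet.

```python
class Node:
    def __init__(self, committed, pending=None):
        self.committed = dict(committed)       # key -> (log_id, value)
        self.pending = dict(pending or {})     # accepted but not yet committed


def quorum_read(nodes, key):
    """Read `key` from a quorum of nodes.

    Returns (value, maybe_stale): the newest committed value among the nodes,
    and a flag that is True if some node has a newer uncommitted entry.
    """
    log_id, value = max(
        (n.committed[key] for n in nodes), key=lambda entry: entry[0]
    )
    maybe_stale = any(
        key in n.pending and n.pending[key][0] > log_id for n in nodes
    )
    return value, maybe_stale
```

If the leader has committed x equals four at log id 2 while both followers still have x equals three committed and x equals four pending, a quorum read of the two followers returns three with `maybe_stale` set. The flag only says a retry might help. As noted above, a retry is not guaranteed to return the newest value if the leader crashed or is partitioned away.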
okay in conclusion coordination services
are a really important sub-component of
a ton of modern day data storage systems
unlike gossip protocols which i
introduced in a previous video which
basically have nodes randomly pass
information from one to the other until
it's propagated through the system
coordination services are a way of
storing the data in a centralized
location which is obviously replicated
for fault tolerance and uses a consensus
algorithm to ensure things like
atomicity and a total ordering of the
writes so that things never get
completely out of whack
even though coordination services don't
handle strong consistency right out of
the box because that would be too big of
a performance hit for things like reads
you want to be able to scale reads
linearly with the number of nodes in the
system
there are ways of achieving strongly
consistent reads that are built into the
system still like that sync command you
can always just read from the leader and
again keep in mind that quorum reads
might eventually be a viable thing that
a lot of people end up using however
another thing to note about strong
consistency and this is always the case
is that to achieve strong consistency it
is at the cost of significantly reduced
performance
okay so i hope this makes sense for
coordination services coordination
services are something that we see used
in literally a ton of other database
technologies and so i think it's
important that we introduce them now so
you can see where they come up as useful
later all right have a good one guys