WHATSAPP System Design: Chat Messaging Systems for Interviews

Gaurav Sen
22 Jan 2019, 25:14

Summary

TLDR: This video script offers an in-depth guide on designing a chat-based application like WhatsApp, focusing on core features such as group messaging and read receipts. It delves into system design considerations, including user-to-user messaging, image sharing, and real-time delivery, while emphasizing scalability and the importance of decoupling services for efficiency. The script also touches on advanced topics like consistent hashing and message queues, providing a comprehensive overview for developers interested in building robust chat systems.

Takeaways

  • The video is a tutorial on designing chat-based applications, with a focus on features like WhatsApp's group messaging and read receipts.
  • Starting with simple and familiar features is recommended when discussing system design with an interviewer, as they tend to approve the first feature suggested.
  • Group messaging is a core feature, with WhatsApp allowing up to 200 people in a group, highlighting the importance of understanding this feature for system design.
  • Image and video sharing are common features in chat applications, and their implementation can be referenced from other resources like the Tinder video.
  • Read receipts, indicated by tick marks, are crucial for showing the status of message delivery in chat applications.
  • The 'last seen' feature and determining if a user is online or when they were last active is an important aspect of user presence in chat apps.
  • Temporary chats can provide privacy and save storage space, but permanent storage may be necessary for compliance or official communication.
  • The architecture of chat applications should avoid a single point of failure, using techniques like load balancing and service discovery.
  • The gateway acts as a bridge between the client application and internal services, translating external protocols to internal communication formats.
  • WebSockets are preferred for real-time communication in chat apps, allowing peer-to-peer messaging without the limitations of client-server protocols.
  • Security and authentication are critical but are often handled at the gateway level, simplifying internal communication protocols.

Q & A

  • What is the main focus of the video on designing WhatsApp?

    -The main focus of the video is to discuss the design of chat-based applications, with a specific emphasis on features like group messaging and read receipts, which are key in system design interviews.

  • Why is it important to start with simple features in a system design interview?

    -Starting with simple features is important because the first feature you propose is often accepted by the interviewer. It's better to start with something you understand well to set a strong foundation for the rest of the interview.

  • What is the maximum number of people allowed in a WhatsApp group?

    -WhatsApp allows a maximum of 200 people in a group chat.

  • What are the tick marks in WhatsApp messages, and what do they signify?

    -The tick marks in WhatsApp indicate the status of a message: a single tick means the message has been sent, a double tick means it has been delivered, and a double tick that turns blue means it has been read.

  • Why is it beneficial to consider chat messages as temporary rather than permanent?

    -Considering chat messages as temporary provides more privacy to users and saves storage space, as the messages are stored only in the user's application. However, for compliance or official communication, messages may need to be stored permanently.

  • What is the role of a gateway in the WhatsApp architecture?

    -A gateway in the WhatsApp architecture serves as the connection point between the user's phone application and the WhatsApp cloud services. It translates external protocols used by the user to internal protocols used by WhatsApp services.

  • What is the purpose of the sessions microservice in the WhatsApp system?

    -The sessions microservice stores information about which user is connected to which box or server in the system. It acts as a router, directing messages to the correct gateway based on the user's connection.

  • Why are web sockets preferred over HTTP for real-time chat applications?

    -Web sockets are preferred for real-time chat applications because they allow peer-to-peer communication without the client-server semantics of HTTP. This enables the server to send messages directly to the client, facilitating real-time communication.

  • How does the system handle the 'last seen' timestamp for users in WhatsApp?

    -The 'last seen' timestamp is updated whenever a user performs an activity, such as sending or reading a message. A separate microservice, the last seen service, tracks user activities and updates the timestamp accordingly.

  • What is the significance of consistent hashing in managing group messaging?

    -Consistent hashing helps distribute group information across multiple servers efficiently, reducing memory footprint and ensuring that the system can handle a large number of groups and users without overloading any single server.

  • Why is message queuing important in the system design of chat applications?

    -Message queuing is important as it ensures that messages are delivered even if there are temporary failures in the system. It allows for configurable retries and delays, enhancing the reliability and robustness of the chat application.

  • What are some strategies to handle high loads during peak messaging times, like New Year's Eve?

    -Strategies include deprioritizing less critical features like read receipts and last seen timestamps, focusing on delivering the message itself. In some cases, systems may even drop or ignore certain non-critical messages to maintain overall performance.

Outlines

00:00

Introduction to WhatsApp Design

The video script introduces the concept of designing a chat-based application like WhatsApp, emphasizing the importance of understanding its core features such as group messaging and read receipts. The speaker suggests starting with simple features and avoiding complex ones during interviews. Key features discussed include group messaging for up to 200 people, image and video sharing, sent, delivered, and read receipts, user online status, and the permanence of chat messages. The script also mentions the importance of considering privacy and storage when designing such applications.

05:04

Understanding the Basics of Chat Communication

This paragraph delves into the technical aspects of one-to-one chat communication in WhatsApp. It explains the role of a gateway in connecting users to the cloud service and the need for a user-to-box mapping for maintaining active connections. The script discusses the inefficiency of storing connection information on gateway boxes and the advantages of using a sessions microservice to handle user connection data. It also introduces the concept of using web sockets for real-time peer-to-peer communication and the importance of delivery receipts in chat applications.
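The routing idea described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not WhatsApp's implementation: the names `Gateway`, `SessionService`, `connect`, and `send_message` are hypothetical, and a list stands in for the real socket writes.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """A 'dumb' gateway box: it only holds connections and pushes data out."""
    name: str
    delivered: list = field(default_factory=list)  # stands in for socket writes

    def push(self, user_id: str, payload: str) -> None:
        self.delivered.append((user_id, payload))


class SessionService:
    """Owns the user -> gateway mapping so the gateways do not have to."""

    def __init__(self) -> None:
        self._user_to_gateway: dict[str, Gateway] = {}

    def connect(self, user_id: str, gateway: Gateway) -> None:
        self._user_to_gateway[user_id] = gateway

    def disconnect(self, user_id: str) -> None:
        self._user_to_gateway.pop(user_id, None)

    def send_message(self, sender: str, recipient: str, text: str) -> bool:
        """Route to the recipient's gateway; return False if they are offline."""
        gateway = self._user_to_gateway.get(recipient)
        if gateway is None:
            return False  # assumption: a real system would persist and retry
        gateway.push(recipient, f"{sender}: {text}")
        return True


gateway_1, gateway_2 = Gateway("gateway-1"), Gateway("gateway-2")
sessions = SessionService()
sessions.connect("A", gateway_1)
sessions.connect("B", gateway_2)
sessions.send_message("A", "B", "hello")
print(gateway_2.delivered)  # [('B', 'A: hello')]
```

Keeping the mapping in one service means a gateway box spends its memory on connections only, which is the decoupling argument made in the video.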

10:04

Last Seen and Online Status Management

The script addresses the challenges of managing the 'last seen' and online status features at scale. It explains the need for a dedicated table to store last seen timestamps and how user activities should update these timestamps. The importance of distinguishing between user activities and system-generated requests is highlighted, with the latter not affecting the last seen status. The paragraph also discusses the implementation of a 'last seen' microservice to track and update user activity timestamps in real-time.
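A rough sketch of the last seen logic, under stated assumptions: `LastSeenService`, `record_request`, and `status` are hypothetical names, the table is a plain dictionary, and the 15-second threshold is just one of the values floated in the video.

```python
import time


class LastSeenService:
    """Tracks the last user-initiated activity per user."""

    def __init__(self, online_threshold_seconds: float = 15.0) -> None:
        self._last_seen: dict[str, float] = {}  # user_id -> timestamp
        self._threshold = online_threshold_seconds

    def record_request(self, user_id: str, is_user_activity: bool, now=None) -> None:
        # App-generated requests (polling, delivery receipts) carry a flag
        # and must not move the last seen timestamp.
        if not is_user_activity:
            return
        self._last_seen[user_id] = time.time() if now is None else now

    def status(self, user_id: str, now=None) -> str:
        now = time.time() if now is None else now
        seen_at = self._last_seen.get(user_id)
        if seen_at is None:
            return "unknown"
        if now - seen_at <= self._threshold:
            return "online"  # recent activity is shown as online, not "3s ago"
        return f"last seen {int(now - seen_at)}s ago"


service = LastSeenService()
service.record_request("A", is_user_activity=True, now=1000.0)   # A sends a message
service.record_request("A", is_user_activity=False, now=1100.0)  # app sends a receipt
print(service.status("A", now=1003.0))  # online
print(service.status("A", now=1120.0))  # last seen 120s ago
```

The `now` parameter exists only to make the example deterministic; in practice the service would read the clock itself.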

15:05

Service Decoupling and Group Messaging

This section discusses the architectural decisions made to handle group messaging efficiently. It talks about decoupling the group membership information from the session service to a separate group service. The script explains how the session service interacts with the group service to identify all members of a group and then routes messages to them via the appropriate gateways. The importance of limiting group size to manage resource consumption and the use of consistent hashing for efficient routing are also covered.
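The group fan-out can be illustrated as follows. This is a sketch only: `GroupService`, `members_of`, and `fan_out` are hypothetical names, gateways are represented by plain strings, and the 200-member cap is the figure quoted in the video.

```python
from collections import defaultdict

MAX_GROUP_SIZE = 200  # the cap quoted in the video


class GroupService:
    """Stores only group membership, decoupled from connection data."""

    def __init__(self) -> None:
        self._members = defaultdict(set)

    def add_member(self, group_id: str, user_id: str) -> None:
        members = self._members[group_id]
        if user_id not in members and len(members) >= MAX_GROUP_SIZE:
            raise ValueError("group is full")
        members.add(user_id)

    def members_of(self, group_id: str) -> set:
        return set(self._members.get(group_id, ()))


def fan_out(group_id, sender, groups, user_to_gateway):
    """Return {gateway: [recipients]} for every connected member except the sender."""
    plan = defaultdict(list)
    for user_id in sorted(groups.members_of(group_id)):
        if user_id == sender:
            continue
        gateway = user_to_gateway.get(user_id)
        if gateway is not None:  # offline members are skipped in this sketch
            plan[gateway].append(user_id)
    return dict(plan)


groups = GroupService()
for user in ("r1", "r2", "r3"):
    groups.add_member("red", user)

# What the session service would know: who is connected to which box.
user_to_gateway = {"r1": "gateway-1", "r2": "gateway-2", "r3": "gateway-2"}
print(fan_out("red", "r1", groups, user_to_gateway))
# {'gateway-2': ['r2', 'r3']}
```

Grouping recipients by gateway lets the session service send one request per box rather than one per member.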

20:11

Message Passing and System Resilience

The final paragraph focuses on the importance of efficient message passing and system resilience in chat applications. It introduces the concept of a parser and serializer microservice to convert messages into a format that can be easily processed by internal systems. The script also touches on the use of consistent hashing to reduce memory footprint and the implementation of message queues for reliable message delivery. The importance of idempotency in messaging systems and strategies for handling high-load events, such as prioritizing certain messages during peak times, is also discussed.

Keywords

WhatsApp

WhatsApp is a widely used messaging application that allows users to send text messages, images, videos, and make voice and video calls. In the video, WhatsApp serves as a case study for designing a chat-based application, emphasizing its features like group messaging and read receipts, which are central to the discussion on system design.

Group Messaging

Group messaging refers to the ability for multiple users to participate in a single chat conversation. In the context of the video, the script discusses the design considerations for implementing group messaging in a chat application, such as WhatsApp, which supports groups of up to 200 people.

Read Receipts

Read receipts are a feature in messaging apps that notify the sender when their message has been read by the recipient. The video script mentions read receipts as a key feature of WhatsApp, illustrating the importance of this functionality in the design of chat applications to provide feedback to users on the status of their messages.

Image Sharing

Image sharing is the ability to send and receive images within a chat application. The script raises the question of whether images should be shared in messages, highlighting it as a common feature in chat applications and a consideration in the design process.

Gateway

In the context of the video, a gateway refers to the server that clients connect to in order to send and receive messages. The script explains that the gateway acts as an intermediary between the client application and the internal services of the chat system, handling external protocols and security.

Session Service

The session service is a microservice in the chat application's architecture responsible for tracking which users are connected to which server. The script discusses how this service decouples the user-to-server mapping from the gateway, improving system scalability and efficiency.

WebSockets

WebSockets are a protocol that enables two-way interactive communication between a user's browser and a server. The video script highlights WebSockets as a critical technology for real-time communication in chat applications, allowing servers to push messages to clients instantly.

Last Seen

The 'last seen' feature in chat applications shows when a user was last active or online. The script describes the architectural considerations for implementing this feature, including tracking user activity and updating timestamps to indicate online status.

Consistent Hashing

Consistent hashing is a technique used in distributed systems to distribute data (e.g., chat groups) across multiple servers in a way that minimizes reorganization when servers are added or removed. The script mentions consistent hashing as a method to efficiently route messages to the correct server based on group ID.
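A small hash ring makes the "minimal reorganization" property concrete. This is a generic textbook sketch, not the scheme used in the video or by WhatsApp: `HashRing`, `stable_hash`, the server names, and the choice of 100 virtual nodes per server are all assumptions made for illustration.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash that is stable across runs (Python's built-in hash() is salted)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, virtual_nodes: int = 100) -> None:
        self._virtual_nodes = virtual_nodes
        self._ring: list[tuple[int, str]] = []  # kept sorted by hash
        for server in servers:
            self.add_server(server)

    def add_server(self, server: str) -> None:
        # Each server owns many points on the ring to even out the load.
        for i in range(self._virtual_nodes):
            bisect.insort(self._ring, (stable_hash(f"{server}#{i}"), server))

    def remove_server(self, server: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != server]

    def server_for(self, group_id: str) -> str:
        if not self._ring:
            raise LookupError("no servers on the ring")
        # First ring entry clockwise from the key's hash, wrapping around.
        index = bisect.bisect_left(self._ring, (stable_hash(group_id),))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["server-1", "server-2", "server-3"])
group_ids = [f"group-{n}" for n in range(1000)]
before = {g: ring.server_for(g) for g in group_ids}
ring.remove_server("server-2")
after = {g: ring.server_for(g) for g in group_ids}
moved = [g for g in group_ids if before[g] != after[g]]
# Only groups that lived on the removed server should have moved.
print(all(before[g] == "server-2" for g in moved))  # True
```

With a plain `hash(group_id) % number_of_servers` scheme, removing one server would remap most groups; on the ring, groups owned by the surviving servers stay where they are.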

Message Queue

A message queue is a tool used in software architecture to handle messaging and communication between different parts of a system. The script discusses the use of message queues for ensuring messages are delivered reliably, even in the event of service failures, by providing retries and failure notifications.
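The retry behaviour can be sketched with an in-memory queue. This is an illustration under assumptions, not a real broker: `RetryQueue`, `publish`, `drain`, and the dead-letter list are hypothetical, and the delay between retries that a production queue would apply is left out.

```python
from collections import deque


class RetryQueue:
    """In-memory stand-in for a message queue with a configurable retry limit."""

    def __init__(self, max_attempts: int = 3) -> None:
        self._pending = deque()
        self._max_attempts = max_attempts
        self.dead_letters = []  # messages that exhausted their attempts

    def publish(self, message: str) -> None:
        self._pending.append((message, 0))

    def drain(self, handler) -> None:
        """Deliver every pending message, re-queuing failures until attempts run out."""
        while self._pending:
            message, attempts = self._pending.popleft()
            try:
                handler(message)
            except Exception:
                attempts += 1
                if attempts < self._max_attempts:
                    self._pending.append((message, attempts))
                else:
                    self.dead_letters.append(message)  # report the failure upstream


calls = {"count": 0}
received = []


def flaky_gateway(message: str) -> None:
    """Fails twice, then succeeds, to imitate a temporarily unavailable gateway."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("gateway unavailable")
    received.append(message)


retry_queue = RetryQueue(max_attempts=3)
retry_queue.publish("hello group")
retry_queue.drain(flaky_gateway)
print(received, retry_queue.dead_letters)  # ['hello group'] []
```

Because a retried message may reach a recipient more than once, this pattern is usually paired with idempotent handling on the receiving side.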

Idempotency

Idempotency in the context of chat applications refers to the property of a message or operation where it can be processed multiple times without changing the result beyond the initial application. The script touches on the importance of idempotency in maintaining the integrity of the system, especially under high load or failure conditions.
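One common way to get this property is to give every message a unique ID and ignore IDs that were already applied. The sketch below assumes that approach; `Inbox`, `receive`, and the `message_id` values are hypothetical, and the video does not prescribe a specific mechanism.

```python
class Inbox:
    """Applies each message at most once, keyed by a unique message ID."""

    def __init__(self) -> None:
        self._seen_ids = set()
        self.messages = []

    def receive(self, message_id: str, text: str) -> bool:
        """Store the message once; return False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False
        self._seen_ids.add(message_id)
        self.messages.append(text)
        return True


inbox = Inbox()
print(inbox.receive("m1", "hi"))  # True
print(inbox.receive("m1", "hi"))  # False, a retry of the same message
print(inbox.messages)             # ['hi']
```

The sender can then retry as often as it needs to without the recipient seeing the same chat message twice.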

Highlights

Introduction to designing a chat-based application like WhatsApp and the importance of understanding its core features for system design interviews.

Key features of WhatsApp discussed include group messaging and read receipts, which are essential for a chat application.

The strategy for handling interview questions about chat application features, emphasizing starting with familiar features like group messaging.

Limiting group sizes to 200 members to manage complexity and maintain efficient messaging.

The role of image and video sharing in chat applications and their impact on system design.

Explanation of sent, delivered, and read receipts, and how they provide message status updates in chat applications.

The importance of considering user privacy and storage efficiency in chat application design.

Discussion on the temporary nature of chats in applications like Snapchat and WhatsApp versus permanent storage in office applications.

Technical explanation of how messages are sent from one user to another in a one-to-one chat setup.

The concept of a gateway in chat application architecture and its role in connecting users to the cloud service.

Introduction to the sessions microservice, which manages user connections and routing in chat applications.

The use of web sockets for real-time peer-to-peer communication in chat applications.

Mechanism for generating delivery receipts to notify senders when their messages have been received.

Design considerations for tracking user online status and last seen timestamps at scale.

The role of consistent hashing in reducing memory footprint and managing group messaging efficiently.

Challenges and solutions for sending messages to large groups, including the use of message queues and load management.

Strategies for maintaining system performance during high-load events by deprioritizing certain features.

Conclusion summarizing the key components and strategies discussed for designing a robust chat application.

Transcripts

play00:00

Hi everyone. This is GKCS. This is a video on designing WhatsApp.

play00:05

It's a chat based application, so once you know how to design WhatsApp,

play00:08

you will be able to design any chat based application to a large extent.

play00:12

The special things about WhatsApp are that they have group messaging and they

play00:17

have these read receipts.

play00:19

So those are the two key features that people look for in a normal system design

play00:22

interview.

play00:23

But there's also other features that we'll be talking about and we'll be talking

play00:26

about the features that we should probably not take up during an interview and

play00:31

basically choose the kind of things that we are doing so that we can actually

play00:34

finish in the hour that we have. Now,

play00:37

amongst all the features that you can ask your interviewer as to,

play00:40

would you like this? Would you like that?

play00:42

Probably you should start simple and you should start with things that you

play00:45

already know,

play00:46

because I've noticed that the first feature that you ask for the interviewer

play00:49

usually says yes. So one of the things I'm comfortable with is group messaging.

play00:54

So WhatsApp has groups; at most 200 people can enter these groups,

play00:59

and so group messaging is something that I understand to a good extent.

play01:02

Image sharing is another good question to ask as to are images going

play01:07

to be shared in these messages? And almost obvious answer is yes,

play01:11

we will allow image sharing or video sharing. Also

play01:15

a good question, but I mean this is something that if you have used WhatsApp,

play01:20

you'll know about is sent, delivered, and read receipts.

play01:23

So you have those tick marks coming in based on what stage is the message on.

play01:27

The final two things are not critical to an application in terms of features,

play01:31

but it's nice to think of in an engineering way.

play01:33

The first one being that is the person online. And if they're not,

play01:37

then when was the last time that they were seen on the chat? And the second

play01:41

thing is, are the chats temporary or are they permanent?

play01:44

So if you have a look at Snapchat, or even if you have a look at WhatsApp,

play01:48

in a way,

play01:48

they're much more temporary than a lot of the office messaging applications.

play01:53

The reason for this is because you want a lot of privacy.

play01:56

You want to give the user a lot of power. Also,

play01:58

it actually saves a lot of storage space if you think about the chats being

play02:02

stored in the user's applications only.

play02:04

But if there is any sort of compliance that you need or if there is any official

play02:08

communication, then you want that message to be stored somewhere forever.

play02:13

So that's another thing that we'll be asking. Although WhatsApp gives you,

play02:17

so to speak, only temporary chats,

play02:19

if you delete the app and if your friend also deletes the app,

play02:23

those chat messages are lost forever. So one thing I'd like to say is

play02:28

that image sharing has already been taken up on this channel.

play02:31

If you want to have a look at how this is done, have a look at the Tinder video.

play02:37

It explains how images can be stored, retrieved, et cetera, et cetera,

play02:40

in a sensible engineering way.

play02:43

So you're left with four features for this video,

play02:46

and the first one we'll be picking up is group messaging.

play02:49

Before we get to group messaging,

play02:50

we need to first talk about how does one person send a message to another

play02:54

person? So that is one-to-one chat, and that is our requirement,

play02:58

which is one-to-one chat. Alright,

play03:03

this is what we are coming to, okay, let's take this step by step.

play03:08

A lot of the things that I'll be discussing in this are there in the system

play03:11

design playlist.

play03:12

So have a look at that. When you're looking for things like load balancing,

play03:15

when you're looking for things like messaging queues,

play03:17

I'll be using those things as abstractions,

play03:19

as structures to meet all the features that we have talked about.

play03:23

If you want any detail, then you can always go there.

play03:25

Single point of failure is also something pretty important in the WhatsApp

play03:28

architecture. So have a look at those. Now let's start.

play03:31

You have the application installed in your cell phone.

play03:35

You connect to WhatsApp on the cloud.

play03:37

The place that you're connecting to is called a gateway.

play03:40

The reason for this is because you'll be using an external protocol when you're

play03:44

talking to WhatsApp,

play03:46

but WhatsApp might be talking in a different language with its internal

play03:48

services. Main reason being that you don't need that much security.

play03:52

You don't need those big headers that HTTP provides you when you're talking

play03:56

internally because a lot of the security mechanisms are taken care of on the

play03:59

gateway itself. So once you do connect to the gateway,

play04:04

let's assume that you're actually sending a message to person B.

play04:08

So you are person A and you're sending to person B,

play04:12

person A connects to the gateway.

play04:13

The gateway actually needs to send it to person B somehow.

play04:18

You could store this information as to which users are connected

play04:23

to which box in the gateway itself. In that case,

play04:27

you would need some sort of a user

play04:32

to box mapping. Okay? For the gateway service, which is a microservice itself,

play04:37

it needs to store the information as to this user ID is currently connected to

play04:42

box number two. So if this is box number 1, 2, 3,

play04:46

then there needs to be information saying that B is connected to two and A is

play04:49

connected to one.

play04:51

When you have this kind of information being stored on the boxes itself,

play04:55

it's going to be an expensive thing. Why is it expensive?

play04:58

Because maintaining a connection, a TCP connection itself takes some memory.

play05:03

What you want to do is you want to actually increase the maximum number of

play05:07

connections that you can store in a single box and you don't want that memory to

play05:10

be wasted by keeping information for who is connected to which box.

play05:16

Second thing is this information is being duplicated on all three servers.

play05:21

Either it's being duplicated, there's some caching mechanism,

play05:24

or there is some database which is actually handling this.

play05:27

This is transient information,

play05:29

so there's going to be a lot of updates going on over here and this is not nice.

play05:34

There's a lot of coupling that I can see in this system.

play05:37

So what you want to do is you want to keep a dumb connection.

play05:41

This TCP connection should be dumb in the sense that it just takes information,

play05:44

gives information, it doesn't know what it's doing. Apart from that,

play05:50

the person you want to be asking for when it comes to information on who is

play05:54

connected to which box is a microservice in itself,

play06:00

and this microservice can be the sessions microservice.

play06:04

What does a sessions microservice store? Well, who's connected to which box?

play06:08

Just that information that we were storing over here and was being handled by

play06:12

the gateway has been decoupled from the system and being sent to the sessions

play06:15

microservice.

play06:16

You can see that there are multiple servers for single point of failure

play06:20

avoidance. Okay?

play06:22

So when a user is sending user A is sending some message,

play06:26

it actually asks for send message with

play06:31

the user ID for B, when the gateway gets this message,

play06:36

it's pretty dumb. It doesn't know what to do,

play06:37

it just sends it to the session service, okay?

play06:40

This session service is indirectly a router. When it gets this message,

play06:44

when it gets this request either of send message to user B,

play06:49

what it does is it figures out where does user B exist,

play06:52

which box is user B connected to?

play06:55

And then routes this message basically sending this message to gateway two to

play06:59

send it back to user B.

play07:03

Now what's happened is A has sent a message to B.

play07:08

Interesting.

play07:09

How can A send a message to B if the server is sending this final bit where the

play07:14

gateway two is sending a message to B? This can't be done using HTTP.

play07:18

It's a server to client protocol.

play07:20

I mean rather it's a client to server protocol. So the client sends requests,

play07:24

the server gives responses.

play07:26

So you cannot send a message from the server to the client.

play07:29

You can only send requests from client to server. There's many ways to

play07:35

get over this using HTTP itself. One of them is long polling. In which case,

play07:40

what happens is every minute or so B can ask for, Hey,

play07:43

are there any new messages for me?

play07:45

And then the gateway or the sessions management service,

play07:48

whichever one you would like, can send it the message.

play07:51

Of course this is not real time. And if you want something real time,

play07:54

especially for chat applications,

play07:56

which it's very important to have the real time thing.

play07:58

So HTTP is not something that we can use and we need another protocol

play08:05

over TCP.

play08:08

And the thing that we are looking for really are web sockets.

play08:12

So web sockets are super nice when it comes to chat applications.

play08:17

The main reason being that they can allow you peer to peer communication.

play08:22

So A send to B, B send to A, there's no client or server semantics over here.

play08:26

So with that,

play08:27

what happens is literally the server can send a message to the client, B.

play08:32

Okay, so we are happy, B got the message,

play08:37

what now? Well, B got the message. So that means it has been delivered.

play08:43

At this point, user A should be notified that the message has been delivered.

play08:48

There's one place that I missed out on.

play08:51

When the message actually gets to the gateway and gets to the session service,

play08:53

what it can do is you can send a parallel response to gateway one saying

play08:58

that, okay,

play08:59

I got the message now it's going to be sent to user B when it's possible,

play09:04

let's say a different database for the chat.

play09:07

And because it's stored in the database, it's safe, it's persistent,

play09:11

it'll keep retrying the message till user B gets it.

play09:14

So A is guaranteed that B is going to get the message.

play09:17

So it should get the sent receipt.

play09:21

So just give a response saying that, okay,

play09:23

I got the message. Gateway one is now going to send the message to user A.

play09:28

So sent is taken care of when this entire flow is completed. When B gets the

play09:33

message for the first time, how do we give a delivery receipt?

play09:38

Once you send the message to B and B actually got the message,

play09:41

it should respond.

play09:42

I mean it should again go to gateway two and say that got the message.

play09:46

That's an acknowledgement,

play09:47

a TCP acknowledgement. When gateway two gets this message,

play09:51

it sends it again to the session service saying that, Hey,

play09:54

this message was received. So this message was received.

play09:58

The message is going to be containing a 'to' and a 'from' field.

play10:03

Yeah. So the session service, what it can do is, okay,

play10:07

the message has been received by the person who was tagged over here too,

play10:10

which is B.

play10:12

So the person who sent the message, the 'from', A, should

play10:17

get a delivery receipt.

play10:19

And so sessions finds out again where A exists. That is box number one,

play10:25

send a delivery receipt. A gets a delivery receipt.

play10:30

Okay?
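The sent, delivered, and read flow just described can be summarised as a small status tracker. This is a sketch under assumptions: `Receipt`, `ReceiptTracker`, and `apply` are hypothetical names, and the rule that a status only ever moves forward (so a late delivery receipt cannot undo a read receipt) is an added assumption, not something stated in the video.

```python
from enum import IntEnum


class Receipt(IntEnum):
    PENDING = 0    # not yet acknowledged by the server
    SENT = 1       # server has stored the message and will keep retrying
    DELIVERED = 2  # recipient's device acknowledged it
    READ = 3       # recipient opened the chat


class ReceiptTracker:
    """Keeps the highest receipt seen so far for each message."""

    def __init__(self) -> None:
        self._status: dict[str, Receipt] = {}

    def apply(self, message_id: str, receipt: Receipt) -> Receipt:
        current = self._status.get(message_id, Receipt.PENDING)
        # Assumption: receipts may arrive out of order, so never move backwards.
        self._status[message_id] = max(current, receipt)
        return self._status[message_id]


tracker = ReceiptTracker()
print(tracker.apply("m1", Receipt.SENT).name)       # SENT
print(tracker.apply("m1", Receipt.READ).name)       # READ
print(tracker.apply("m1", Receipt.DELIVERED).name)  # READ
```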

play10:31

And of course you can think about how read is going to work.

play10:36

The moment a person opens the application, comes and opens this chat tab,

play10:40

they send a message saying read, and the exact same flow takes care of read

play10:45

also. Alright,

play10:48

so that's a lot to digest if you like.

play10:51

Then you can go through this a little more.

play10:53

This is the very first feature of sending and basically

play10:58

delivering receipts to the sender. Okay?

play11:02

The second feature we are talking about is quite simple.

play11:04

It's about the last seen, or is the person online right now? At a

play11:08

scale, I mean at huge scale, when there's millions of users,

play11:11

everything gets complicated.

play11:12

But one of the principal architectural things that we can do over here is this.

play11:18

Simply put, B just wants to know when A was online the last time. This information

play11:23

has to be stored somewhere. And what the server can do is they can ask A,

play11:27

but that would be stupid. So instead, A is not even in the picture now,

play11:32

and the only messages which will be sent and received are from B and the server.

play11:37

So B asks the server, when was A online last?

play11:40

There needs to be some information in some table saying that this

play11:45

user was last online at this time.

play11:48

So some timestamp and it will have some

play11:53

entry over here with a particular timestamp.

play11:57

The only question which remains is how is this row maintained the last seen

play12:01

timestamp for a particular user? This key value pair,

play12:05

whenever a user A does an activity,

play12:08

basically sending a message or reading a message or any kind of request to the

play12:13

server should be logged as an activity and that current timestamp should be

play12:18

persisted in this table. In that way,

play12:20

we can say that whenever A did anything, definitely they were online,

play12:25

which means that the last seen timestamp needs to be updated based on this.

play12:30

B can be told whether A is online or not.

play12:35

One of the key features over here is that if A was online three seconds ago,

play12:39

then B shouldn't be told that they were online three seconds ago. Instead,

play12:44

the showing tag should be online.

play12:48

Probably they haven't done any activity in the last three seconds.

play12:51

You can keep this threshold to anything that you like, maybe 10 seconds,

play12:54

maybe 15 seconds.

play12:56

But the important thing is they're either online or they were last seen at least

play13:00

let's say 20 seconds ago.

play13:02

The last seen tag is a little tricky to update even after taking in all

play13:06

activities.

play13:07

So what I'll be doing is whenever a user sends a request to

play13:12

the gateway, I'll be having a microservice, which is the last seen microservice.

play13:18

And what this will be doing is it's doing user activity tracking.

play13:22

Anytime there's an activity, they definitely send a message to the gateway.

play13:26

When they send a message to the gateway,

play13:28

I'm going to say that they're last seen at this point. Now interestingly,

play13:31

there might be some requests which are not being sent by the user,

play13:34

but by the application itself. For example, when you poll for certain messages,

play13:38

maybe you're offline, you're not using the app,

play13:40

but you want your application to notify you whenever there's a message.

play13:45

So for example, delivery receipt, that's not an activity by me.

play13:49

So the request should be smart in the sense that the client should be smart

play13:52

saying that this is a user activity and this is something that the application

play13:56

itself is doing. So two types of messages being sent by the client.

play14:02

One type is user activities,

play14:04

and the other one is let's say system generated or app messages.

play14:10

App requests. This can be a flag in the request itself.

play14:14

If it's an app request, don't send it to the last seen service.

play14:17

If it's a user activity, send it to the last seen service,

play14:20

it'll go and update the last seen timestamp for this

play14:25

user. And in that way,

play14:27

what can happen is user B can say whether the user is online,

play14:30

or at least they were last seen at this timestamp by querying this service.

play14:36

So feature three is also done.

play14:45

Alright, so we are

play14:46

very close to actually completing

play14:47

this chat messaging application. As you can see,

play14:49

it's a pretty complicated architecture, but we get to everything one by one.

play14:54

Certain things that I like to skip over so to speak,

play14:57

is load balancer because we have already talked about this,

play15:00

so I won't be talking about how the load balancer balances the load

play15:04

across the system.

play15:05

There's one interesting thing which we have not talked about in the series,

play15:08

which is service discovery or heartbeat maintenance.

play15:12

And that will be taken in a separate video, but it's pretty interesting.

play15:15

You can have a look at some blogs,

play15:17

I'll probably post them in the description below.

play15:20

The authentication service is another thing that I'll be talking about later.

play15:25

Main reason being that it's quite simple,

play15:28

but it's something worth talking about as a basic principle.

play15:31

So that'll also be taken later. As you can see,

play15:34

these four services are things which are not really relevant to WhatsApp,

play15:38

so to speak. The profile service is a very generic service, as are the image service,

play15:42

sending emails and sending SMSs. Okay,

play15:45

then what is core to a chat application? Sending messages.

play15:48

Now you can see that there are five users that are drawn over here.

play15:51

The red guys are in one group, the green guys are in the other group.

play15:54

So whenever a user from the red box sends a message,

play15:58

it should go to all other red boxes. And this is the feature of group messaging.

play16:04

So this red user is connected to gateway one,

play16:08

while we have the other red users connected to gateway two.

play16:11

So let us assume that we send a group message through this user.

play16:17

The problem here is that if the session service stores all the information for

play16:20

all groups,

play16:21

let us say the red group has these three users and they're connected to these

play16:25

three boxes. It's too complicated for the session service to handle. I mean,

play16:30

it's something that you can decouple. So that's what we have done.

play16:34

We have decoupled the information about who is in which group into a group

play16:38

service. Now, the session service,

play16:40

when it gets a message from a red user, is going to be asking the group

play16:44

service, who are the other group members in this group?

play16:48

The group service can then respond saying 10 members with

play16:53

these user IDs exist in this group.

play16:56

Now the session service runs through its own database.

play16:59

Usually this information is going to be cached as much as possible,

play17:02

but it can figure out where these users are connected to

play17:07

through its database. I mean those 10 users.

play17:09

It has a mapping from user ID to connection.

play17:14

And that connection tells you which box, which gateway it exists in.

play17:17

So with this information,

play17:19

it can then route the messages to each of these users one by one.
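
The lookup chain just described (session service asks the group service for members, then maps each member to a gateway) could look roughly like this. It is only a sketch under assumptions: the class and method names are made up here, and both stores are plain in-memory dictionaries standing in for a real database and cache.

```python
from collections import defaultdict
from typing import Dict, List, Set


class GroupService:
    """Owns the group ID -> member user IDs mapping."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = defaultdict(set)

    def add_member(self, group_id: str, user_id: str) -> None:
        self._members[group_id].add(user_id)

    def members(self, group_id: str) -> Set[str]:
        # Return a copy so callers cannot mutate the stored set.
        return set(self._members.get(group_id, set()))


class SessionService:
    """Owns the user ID -> gateway mapping and does the fan-out."""

    def __init__(self, group_service: GroupService) -> None:
        self._group_service = group_service
        self._user_to_gateway: Dict[str, str] = {}

    def connect(self, user_id: str, gateway_id: str) -> None:
        self._user_to_gateway[user_id] = gateway_id

    def route_group_message(self, sender_id: str, group_id: str) -> Dict[str, List[str]]:
        # Ask the group service who is in the group, then group the
        # recipients by the gateway holding their connection.
        deliveries: Dict[str, List[str]] = defaultdict(list)
        for user_id in sorted(self._group_service.members(group_id)):
            if user_id == sender_id:
                continue  # the sender does not receive their own message
            gateway_id = self._user_to_gateway.get(user_id)
            if gateway_id is not None:  # skip users with no live connection
                deliveries[gateway_id].append(user_id)
        return dict(deliveries)
```

The returned dictionary says which gateway should push the message to which users; users with no connection are simply skipped in this sketch.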

play17:24

What if the group has too many members? Too bad.

play17:27

WhatsApp actually gives you a maximum limit of 200.

play17:30

There are a lot of chat applications which try to cap that at 500 or 600.

play17:34

Main reason being that you'll be otherwise fanning out the request too much.

play17:38

If you've seen the Instagram design video,

play17:40

what happens in that is also when a celebrity actually posts something,

play17:43

it's effectively sending messages to sometimes millions of people,

play17:47

and that's not practical.

play17:47

So you have to either batch process them or you have to wait for these guys to

play17:51

pull them. But in a chat application you want the messages to be real

play17:56

time as much as possible. You can't really have too much of a pull mechanism.

play18:01

Instead, what you do is you limit the number of people in a group.

play18:04

200 is a slightly reasonable number compared to millions. Yeah,

play18:07

it's a very reasonable number. So what we are going to be doing is we are going to

play18:11

be limiting the number of users we have to some number X,

play18:15

and we are going to be assuming that the sessions can handle web sockets

play18:21

sending these messages to the relevant users. Okay?

play18:25

Now let's get into the details of this mechanism. I mean,

play18:27

we have the bare bones thing, how it's going to work,

play18:30

but the details are important.

play18:33

The first thing that I would do in this architecture is because a lot of users

play18:37

are going to be connecting to my gateways.

play18:39

These gateways are going to be starved for memory.

play18:42

That's the reason why we have separated out the session service.

play18:44

That's one good way to reduce memory footprint.

play18:48

The second thing to look at is parsing the message, right?

play18:51

Maybe the message is sent over HTTP, it's a JSON message,

play18:54

so on and so forth. You don't really want to parse the message, convert it into an

play18:59

object, do some smart things on it,

play19:01

find out whether it has been authenticated or not on the gateway itself,

play19:06

all those responsibilities, as many responsibilities as you can,

play19:08

you want to push away from the gateways because those are web sockets.

play19:11

Those are expensive. Those are actual users connected to your box.

play19:15

So I would send an unparsed message to the session service or to anyone I am

play19:20

sending it to.

play19:22

One smart way to actually send an unparsed message to any

play19:26

service that you want to is to have this unparsed message

play19:34

go through a parser microservice.

play19:37

You don't really need too many servers; just two are enough.

play19:41

So I'll just call it the parser/unparser

play19:46

microservice.

play19:48

What it's going to be doing is it's going to take the unparsed message

play19:53

and going to be converting it to a sensible message.

play19:55

So if your internal protocol, instead of HTTP

play20:00

or something written over TCP, is something like Thrift,

play20:04

which is used by Facebook internally. So I would say Thrift.

play20:10

Then you can parse the message over here itself, right?

play20:15

What is the advantage? Let me just, again, reiterate:

play20:18

you get an unparsed message over here.

play20:20

You send the unparsed message forward,

play20:21

there's no work that you're doing on the gateway itself.

play20:25

This unparsed message will be converted to a sensible programming language

play20:30

object by this parser/unparser, alright?

play20:34

And that will then route it to the right place. Okay?
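
To make the split of responsibilities concrete, here is a minimal sketch in which the gateway hands the raw bytes on untouched and a separate parser step turns them into a typed object. The JSON field names (`sender_id`, `group_id`, `text`) and the `ChatMessage` type are assumptions for illustration; a real system might emit a Thrift struct here instead of a Python dataclass.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatMessage:
    sender_id: str
    group_id: str
    text: str


def gateway_forward(raw: bytes) -> bytes:
    # The gateway does no parsing or validation: it only passes the
    # bytes it received over the socket to the next service.
    return raw


def parse_message(raw: bytes) -> ChatMessage:
    # The parser service pays the cost of decoding and validating.
    payload = json.loads(raw.decode("utf-8"))
    return ChatMessage(
        sender_id=str(payload["sender_id"]),
        group_id=str(payload["group_id"]),
        text=str(payload["text"]),
    )
```

A malformed payload raises inside `parse_message` (a `ValueError` from `json.loads` or a `KeyError` for a missing field), so the failure happens away from the boxes that hold the user connections.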

play20:39

So that's one way to reduce the memory footprint ratio.

play20:42

What are the other concerns or key areas that we should focus on? The group

play20:46

ID to user ID mapping. This is a one-to-many mapping, right?

play20:51

One group can have many user IDs and to reduce a lot of the duplication in

play20:56

information that you have, we go for something called consistent hashing.

play21:00

We should have a look at that.

play21:01

Consistent hashing helps you reduce the memory footprint across servers by

play21:06

delegating only some information to some boxes. Okay?

play21:10

Have a look at the video in case you're not sure what this is.

play21:13

Consistent hashing is going to allow you to actually route the request to the

play21:16

right box. What should it be routed on?

play21:20

The group ID. If you have the request routed on the group ID,

play21:24

then it can tell you that for this group,

play21:26

who are the users belonging to this group? Alright?
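
Routing on the group ID with consistent hashing can be sketched as a hash ring with virtual nodes. This is an illustrative version, not the exact scheme from the consistent hashing video: the SHA-256 based hash, the 100 virtual points per server, and the server names are all assumptions.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() changes between runs.
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, nodes: Iterable[str], replicas: int = 100) -> None:
        # Each server is placed on the ring at `replicas` virtual points.
        ring: List[Tuple[int, str]] = []
        for node in nodes:
            for i in range(replicas):
                ring.append((_hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    def node_for(self, group_id: str) -> str:
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First virtual point clockwise from the key, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(group_id)) % len(self._ring)
        return self._ring[idx][1]
```

The same group ID always lands on the same group service box, and removing a box only moves the groups that lived on that box.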

play21:31

That takes care of the routing mechanism we have.

play21:36

Now, in case the group service fails at any time,

play21:40

you send the message to the box and it fails. What do you do?

play21:44

You can retry,

play21:46

but you can only retry if you know what request you needed to send

play21:50

next. So one of the mechanisms for this is message queues. Yeah,

play21:55

we have discussed this in the playlist so I won't be getting into too much

play21:58

detail,

play21:59

but message queues are nice in the sense that once you give a message to the

play22:03

message queue,

play22:03

it ensures that the message will be sent maybe now maybe 10 seconds later,

play22:07

maybe 15 seconds later. Those are configurable options.

play22:10

And also how many times you're going to retry.

play22:11

All of this is configurable in the message queue.

play22:15

If the message queue fails to send the message, even after five retries,

play22:18

it can tell you that it's failed.

play22:19

You propagate the failure all the way to the client saying that, no,

play22:23

I couldn't send this group message. Okay, that's also fine.
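
The retry behaviour described here (a configurable delay, a configurable number of retries, and a failure reported back when they run out) can be sketched without a real message queue. This is a simplified in-process stand-in, assuming a hypothetical `send` callable and a `DeliveryFailed` exception invented for the example; a real queue would also persist the message.

```python
import time
from typing import Callable


class DeliveryFailed(Exception):
    """Raised when every attempt failed; the caller tells the client."""


def send_with_retries(send: Callable[[], None],
                      max_retries: int = 5,
                      delay_seconds: float = 10.0,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it succeeds; return the number of attempts used.

    One initial attempt plus up to max_retries retries are made.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send()
            return attempts
        except Exception as exc:
            if attempts > max_retries:
                # Out of retries: propagate so the failure can be
                # reported all the way back to the client.
                raise DeliveryFailed(f"gave up after {attempts} attempts") from exc
            sleep(delay_seconds)  # wait before the next retry
```

The `sleep` parameter is injectable only so the delay can be replaced in tests; by default it really waits `delay_seconds` between attempts.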

play22:25

But the client needs to be told that it's failed or it's cleared. Interestingly,

play22:29

when the group service gets this message, it can send a response that, yes,

play22:33

I got the message. The session service then sends a response to the gateway,

play22:37

and the user who sent the original message gets a sent tick mark.

play22:42

Group receipts when it comes to delivered or seen

play22:47

are pretty expensive. Main reason being that everyone needs to say, yeah,

play22:52

I got the message, I got the message,

play22:53

and then finally it has to come back to this guy.

play22:55

So we won't be getting into that. Many chat applications

play22:57

actually don't even have that. So it's fine.

play23:01

The final few interesting things when it comes to chat messaging or group

play23:03

messaging especially, is that you need idempotency.

play23:07

There's an entire video I made on retries and idempotency. Again,

play23:11

taking the Tinder messaging example,

play23:13

so you can have a look at that for the technical details.
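
For a feel of why retries need idempotency, here is a minimal sketch where the receiver remembers message IDs it has already processed, so a retried message is not delivered twice. The `IdempotentInbox` name and the idea of a client-generated `message_id` are assumptions for illustration; the set of seen IDs is in memory only and never expires here.

```python
from typing import List, Set


class IdempotentInbox:
    def __init__(self) -> None:
        self._seen_ids: Set[str] = set()
        self.delivered: List[str] = []

    def deliver(self, message_id: str, text: str) -> bool:
        """Deliver the message once; return False for a duplicate."""
        if message_id in self._seen_ids:
            # A retry of a message we already handled: acknowledge it
            # again, but do not show it to the user a second time.
            return False
        self._seen_ids.add(message_id)
        self.delivered.append(text)
        return True
```

Sending the same `message_id` any number of times leaves the inbox in the same state as sending it once, which is what makes retrying safe.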

play23:16

This architecture is actually very resilient and as a chat system,

play23:19

it's going to do pretty well.

play23:20

There's some tips and tricks over here that you can get to know only if you have

play23:25

worked on messaging systems. So I'll give you a few examples. For example,

play23:30

I mean,

play23:30

I was just reading this blog about what Facebook Messenger does: it deprioritizes or

play23:35

prioritizes messages in case there's a huge event like, let's say, New Year's or

play23:40

let's say some festival like Diwali in India,

play23:42

there's going to be a lot of messages.

play23:43

Everyone's going to be wishing each other Happy Diwali, Happy New Year,

play23:46

and that's going to be putting a lot of load on the system.

play23:50

So all the principles of rate limiting come in here where you don't take

play23:55

messages which are not very important.

play23:57

Or sometimes you just drop messages. Instead of dropping, I mean,

play24:00

the best thing to do is to deprioritize messages.

play24:05

Things like last seen can be ignored. The entire feature can be ignored.

play24:10

Has this message been delivered? Has it been received?

play24:12

Those are not as important as actually sending the message to the user.

play24:15

The first thing, the server

play24:17

getting the message and the acknowledgement. That's all the user needs to know.

play24:21

Okay?

play24:22

That's more important than seeing whether the person has read the message or

play24:25

not.
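
One simple way to picture deprioritization is a priority queue where chat messages always leave before receipts and last seen updates, and under load only a limited number of items are processed per cycle. The three priority levels, their ordering, and the `drain` budget are assumptions made for this sketch, not details from the Facebook Messenger blog.

```python
import heapq
from enum import IntEnum
from itertools import count
from typing import List, Tuple


class Priority(IntEnum):
    CHAT_MESSAGE = 0      # most important: actually sending the message
    DELIVERY_RECEIPT = 1  # delivered / read tick marks
    LAST_SEEN = 2         # least important: presence updates


class PrioritizedOutbox:
    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._order = count()  # tie-breaker keeps FIFO order per priority

    def push(self, priority: Priority, payload: str) -> None:
        heapq.heappush(self._heap, (int(priority), next(self._order), payload))

    def drain(self, budget: int) -> List[str]:
        # Process at most `budget` items; low priority items wait.
        out: List[str] = []
        while self._heap and len(out) < budget:
            out.append(heapq.heappop(self._heap)[2])
        return out

    def __len__(self) -> int:
        return len(self._heap)
```

With a small budget during a spike, chat messages keep flowing while receipts and last seen updates pile up behind them and are handled once the load drops.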

play24:26

So by deprioritizing unimportant messages,

play24:30

you're actually keeping the system health good and you are performing okay

play24:34

instead of not performing at all. So do check out the course.

play24:38

It's really useful when you are designing systems like these. Of course,

play24:42

this takes care of the last requirement that we had,

play24:45

which was to send group messages. Yeah,

play24:47

that is requirement number one taken care of in the end. Alright,

play24:51

thank you so much for listening.

play24:53

Thank you so much for going through this system design.

play24:56

If you have any doubts or suggestions, you can leave them in the comments below.

play24:59

If you liked the video,

play25:00

then hit the like button and if you want further notifications,

play25:03

then hit the subscribe button and I'll see you next time. Oh,

play25:06

and I'll be posting a poll. So vote for what you want to see next time.

play25:11

See ya.


Related Tags
WhatsApp Design, Chat App, System Design, Real-Time, Messaging, Group Chat, Read Receipts, Online Status, Image Sharing, Video Sharing, Engineering Principles