Design Youtube - System Design Interview
Summary
TLDR: The video walks through a high-level architecture for a YouTube-like application, focusing on the two core functions of uploading and watching videos. It highlights how hard these features are to implement at YouTube's scale and stresses reliability, availability, and low latency. The speaker describes an infrastructure built from load balancers, application servers, object storage, and a NoSQL database for metadata. The video also covers asynchronous video encoding, CDN-based delivery, and the trade-offs between database systems, including YouTube's historical approach of scaling MySQL, which led to Vitess.
Takeaways
- 🎯 The core functionalities of YouTube include video uploading and viewing, which require a scalable and reliable architecture.
- 🔄 The design assumes YouTube-scale traffic of roughly 50 million uploads and five billion video views per day, which necessitates a robust infrastructure.
- 🛡️ Reliability is crucial; videos must be stored without risk of corruption or deletion, leveraging object storage solutions like AWS S3 or Google Cloud Storage.
- 🌐 Availability is favored over consistency, meaning it's better to serve slightly stale data than to risk unavailable service.
- 🚀 Video encoding is an asynchronous task that requires a large number of workers to handle the daily upload volume efficiently.
- 💡 Using a CDN (Content Delivery Network) ensures videos are streamed quickly and geographically close to viewers, improving latency.
- 📚 Metadata and user information are stored in a NoSQL database, such as MongoDB, to allow for fast reads and flexible data storage.
- 🔄 Denormalization in NoSQL can improve performance by avoiding joins, but updates to user information may require propagating changes across multiple documents.
- 🚦 Rate limiting may be necessary to prevent abuse of the system, such as uploading an excessive number of videos.
- 🔍 Additional services for recommendations and search would likely be built on top of the core metadata storage, incorporating user history and preferences.
- 🛠️ YouTube's initial use of MySQL and the development of Vitess show that with the right engineering solutions, even relational databases can scale to meet massive demands.
Q & A
What are the core functionalities of YouTube that the design proposal focuses on?
-The design proposal focuses on two main functionalities: uploading videos from a user's perspective and watching videos from a user's perspective.
What is the estimated scale of daily uploads for YouTube?
-The estimated scale of daily uploads for YouTube is 50 million videos per day.
How does YouTube handle the reliability of video storage?
-The proposed design stores videos in object storage, such as AWS S3 or Google Cloud Storage, which handles replication so that videos are reliably stored and not lost to deletion or corruption.
What is the read-to-write ratio for YouTube users?
-For every one user uploading a video, there are a hundred users watching videos. Assuming a billion daily active users who each watch five videos per day, that is five billion video views per day and, at a 100:1 ratio, 50 million uploads per day.
What does YouTube prioritize in terms of data management: availability or consistency?
-YouTube prioritizes availability over consistency. It is more important for the platform to respond correctly and quickly to user requests, even if it means occasionally serving slightly outdated data.
How does YouTube address the latency issue for video playback?
-YouTube addresses latency by using a Content Delivery Network (CDN) to distribute video content geographically close to end users and by streaming videos in small chunks to start playback quickly, even before the entire video is loaded.
What type of database did YouTube initially use for storing video metadata and user information?
-YouTube initially used a relational database, specifically MySQL, for storing video metadata and user information.
How did YouTube scale their MySQL database to handle a large amount of read traffic?
-YouTube scaled their MySQL database by adding read-only replicas and implementing sharding. They also developed an engine called Vitess to decouple the application layer from the database layer, handling sharding and request routing logic.
What is the role of a message queue in the video uploading process on YouTube?
-The message queue is used to manage the video encoding process, which is an asynchronous task. Videos are added to the queue and then sent to encoding services, which can handle the encoding in parallel.
Why is denormalization acceptable in the context of YouTube's NoSQL database design?
-Denormalization is acceptable because it improves performance by eliminating the need for joins. It allows for duplicate information to be stored, which speeds up read operations, as seen with user profile pictures being stored with each video document.
What protocol does YouTube use for video streaming and why?
-YouTube uses HTTP requests built on top of TCP for video streaming. TCP is favored for its reliability, ensuring that the entire video is received without any missing gaps, which is important for delivering a smooth viewing experience.
Outlines
🎨 High-Level Architecture of YouTube-Style Application
This paragraph introduces the concept of designing a YouTube-style application, highlighting the differences between YouTube and other video platforms like Netflix. It emphasizes the core functionalities of YouTube, such as video uploading and viewing, and acknowledges the complexity behind these features. The speaker also mentions additional features like video search, recommendations, commenting, and analytics, but notes that these cannot be fully explored in a short interview. The focus is on the functional requirements of uploading and watching videos, with a brief mention of non-functional requirements like reliability, scalability, and minimizing latency.
🔧 Reliability and Availability in Video Storage
This paragraph delves into the non-functional requirements of the YouTube-style application, particularly focusing on reliability and availability. The speaker discusses the importance of ensuring videos are not corrupted or deleted, and the challenge of handling thousands of concurrent viewers. The design must account for a large scale, with assumptions of a billion daily active users and a high volume of video uploads and views. The speaker also introduces the concept of favoring availability over consistency, using the example of video recommendations and the potential for temporary stale data.
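The scale assumptions in this section reduce to a few lines of arithmetic. The sketch below is a rough back-of-the-envelope check using the speaker's figures (one billion daily active users, five videos watched per user per day, a 100:1 read-to-write ratio, and a day rounded to 100,000 seconds). The function and constant names are illustrative and do not come from the video.

```python
# The speaker rounds a day to 100,000 seconds (the exact value is 86,400).
SECONDS_PER_DAY_ROUNDED = 100_000


def estimate_scale(daily_active_users, videos_watched_per_user, reads_per_write):
    """Return (views per day, uploads per day, uploads per second)."""
    views_per_day = daily_active_users * videos_watched_per_user
    # A 100:1 read-to-write ratio means uploads are 1% of views.
    uploads_per_day = views_per_day // reads_per_write
    uploads_per_second = uploads_per_day // SECONDS_PER_DAY_ROUNDED
    return views_per_day, uploads_per_day, uploads_per_second


views, uploads, per_second = estimate_scale(1_000_000_000, 5, 100)
print(f"{views:,} views/day, {uploads:,} uploads/day, ~{per_second} uploads/s")
```

With these inputs the estimate is 5 billion views per day, 50 million uploads per day, and about 500 uploads per second. Using the exact 86,400 seconds per day would give roughly 579 per second, which does not change the conclusion.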
🚀 High-Level Design and Uploading Videos
The speaker begins to outline a high-level design for the YouTube-style application, starting with the user journey of uploading a video. The paragraph discusses the infrastructure needed to handle the massive scale of video uploads, suggesting the use of a load balancer and application servers. It also covers the storage of raw video files in object storage, like AWS S3 or Google Cloud Storage, and the importance of metadata associated with each video. The speaker chooses a NoSQL database, such as MongoDB, for its flexibility and ability to denormalize data, which is beneficial for read-heavy systems like YouTube.
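The denormalization trade-off described here can be sketched with plain dictionaries standing in for the two collections. This is a minimal illustration, not MongoDB code: the class name, field names, and file references are all assumptions made for the example. Each video document carries a copy of the creator's username and profile-picture reference, so reading a video needs no join, and a profile-picture change is propagated to the video documents in a separate, later step.

```python
class MetadataStore:
    """In-memory stand-in for the 'users' and 'videos' collections (hypothetical)."""

    def __init__(self):
        self.users = {}
        self.videos = {}

    def add_user(self, user_id, username, avatar_ref):
        self.users[user_id] = {"username": username, "avatar_ref": avatar_ref}

    def upload_video(self, video_id, user_id, title, description, file_ref):
        creator = self.users[user_id]
        self.videos[video_id] = {
            "title": title,
            "description": description,
            "file_ref": file_ref,  # reference to the file in object storage
            # Denormalized copy of the creator fields shown under every video.
            "creator": {
                "user_id": user_id,
                "username": creator["username"],
                "avatar_ref": creator["avatar_ref"],
            },
        }

    def update_avatar(self, user_id, new_avatar_ref):
        """Update the user document now; return video ids that are now stale."""
        self.users[user_id]["avatar_ref"] = new_avatar_ref
        return [vid for vid, doc in self.videos.items()
                if doc["creator"]["user_id"] == user_id]

    def propagate_avatar(self, user_id, stale_video_ids):
        """Run later (for example from a background job) to refresh the copies."""
        current = self.users[user_id]["avatar_ref"]
        for vid in stale_video_ids:
            self.videos[vid]["creator"]["avatar_ref"] = current
```

Between `update_avatar` and `propagate_avatar`, viewers may briefly see the old picture. That is the stale-read cost the speaker accepts in exchange for join-free reads.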
🎥 Video Encoding and its Asynchronous Nature
This paragraph focuses on the video encoding process, which is an essential and time-consuming part of handling user-uploaded videos. The speaker explains that video encoding is an asynchronous task that requires adding videos to a queue for processing by multiple servers. The paragraph also touches on the need for horizontal scaling of encoding workers to handle the high volume of daily uploads. The speaker uses a hypothetical calculation to illustrate the number of workers needed and emphasizes the importance of having more workers than the number of videos uploaded per second.
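The worker estimate can be checked with a small simulation. The sketch below uses the speaker's assumptions (about 500 uploads per second, one minute to encode a video, one video per worker at a time) and steps through time one second at a time. The function names are illustrative, and the model ignores bursts and variable encode times.

```python
from collections import deque


def required_workers(arrivals_per_second, encode_seconds):
    """Workers needed to keep up: arrival rate times time per job."""
    return arrivals_per_second * encode_seconds


def simulate_backlog(workers, arrivals_per_second, encode_seconds, duration_seconds):
    """Return how many videos are still waiting in the queue at the end."""
    # in_flight holds, per second, how many jobs started in that second.
    in_flight = deque([0] * encode_seconds)
    free = workers
    queued = 0
    for _ in range(duration_seconds):
        free += in_flight.popleft()   # jobs started encode_seconds ago finish
        queued += arrivals_per_second
        started = min(queued, free)
        queued -= started
        free -= started
        in_flight.append(started)
    return queued


print(required_workers(500, 60))              # 30000
print(simulate_backlog(500, 500, 60, 600))    # backlog keeps growing
print(simulate_backlog(30_000, 500, 60, 600)) # queue stays empty
```

With only 500 workers, ten minutes of uploads leave 295,000 videos waiting, and the backlog keeps growing. With 30,000 workers the queue stays empty, which matches the speaker's 500 × 60 estimate.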
🍿 Optimizing Video Viewing Experience
The speaker discusses the optimization of the video viewing experience, starting with an example of how YouTube loads and buffers video chunks for smooth playback. The paragraph explains the technique of loading video segments separately rather than the entire video, which reduces latency and allows for immediate playback. It also covers the use of HTTP requests for video streaming and the separation of audio and video content. The speaker highlights the importance of client-side code for managing memory usage during video playback and touches on the choice between UDP and TCP protocols for video streaming.
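One standard way to fetch part of a file over HTTP is the `Range` request header. The video shows that chunks are loaded with separate HTTP requests but does not say how the byte range is expressed, so treat the sketch below as one possible approach and not as a description of YouTube's player. The function names and chunk size are assumptions.

```python
def chunk_ranges(total_bytes, chunk_bytes):
    """Split a file of total_bytes into inclusive 'bytes=start-end' ranges."""
    ranges = []
    start = 0
    while start < total_bytes:
        end = min(start + chunk_bytes, total_bytes) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


def range_header_for_chunk(index, total_bytes, chunk_bytes):
    """Header for one chunk, e.g. when the viewer skips ahead."""
    return {"Range": chunk_ranges(total_bytes, chunk_bytes)[index]}


print(chunk_ranges(2500, 1000))
print(range_header_for_chunk(2, 2500, 1000))
```

A client that skips ahead only needs to request the chunk covering the new position, which is why playback can resume without the whole file being downloaded.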
🛠️ YouTube's Database Evolution and Use of Vitess
In this paragraph, the speaker provides insights into YouTube's database evolution, noting that YouTube initially used MySQL, a relational database management system, and not NoSQL as might be expected for a system of its scale. The speaker explains how YouTube implemented read-only replicas and sharding to scale MySQL, which led to complex application server code. Eventually, YouTube developed Vitess, an engine to decouple the application layer from the database layer, allowing for easier scaling. Vitess has since been open-sourced and is used by other companies for scaling MySQL. The speaker suggests that while NoSQL might seem like an obvious choice, YouTube's ingenuity in scaling MySQL demonstrates that limitations can be overcome with resourcefulness.
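To make the idea of a routing layer concrete, here is a toy sketch of what "decoupling the application from the database layer" can look like. It is not Vitess's API or implementation. The class, the shard layout, and the hash-modulo scheme are all assumptions chosen for brevity. Writes go to the primary of the shard that owns a key, and reads rotate across that shard's read-only replicas.

```python
import hashlib
import itertools


class ShardRouter:
    """Hypothetical router: the application asks it where to send a query."""

    def __init__(self, shards):
        # shards: list of {"primary": name, "replicas": [name, ...]}
        self._shards = shards
        self._replica_cycles = [itertools.cycle(s["replicas"]) for s in shards]

    def shard_index(self, key):
        # A stable hash so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def route_write(self, key):
        return self._shards[self.shard_index(key)]["primary"]

    def route_read(self, key):
        # Round-robin over the read-only replicas of the owning shard.
        return next(self._replica_cycles[self.shard_index(key)])


router = ShardRouter([
    {"primary": "db0-primary", "replicas": ["db0-r1", "db0-r2"]},
    {"primary": "db1-primary", "replicas": ["db1-r1", "db1-r2"]},
])
print(router.route_write("video:123"), router.route_read("video:123"))
```

Simple modulo hashing moves most keys when the shard count changes, so a production system would need a more careful resharding strategy. The point here is only that the application code no longer contains the sharding logic.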
Keywords
💡System Design
💡Video Uploading
💡Video Encoding
💡Load Balancer
💡Object Storage
💡Metadata
💡Message Queue
💡Content Delivery Network (CDN)
💡Cache
💡Asynchronous Task
💡Scalability
Highlights
Designing a YouTube-like application involves addressing the unique challenges of handling user uploads and video streaming at a massive scale.
YouTube differs from other video platforms like Netflix in that it allows users to upload videos and offers free access to a vast library of content.
The core functionalities of YouTube include video uploading and viewing, but also encompass complex systems for recommendations, comments, analytics, and advertising.
Reliability is a critical non-functional requirement, ensuring videos are not corrupted or lost, even when handling a large number of daily uploads.
Availability is prioritized over consistency, meaning users should always be able to access YouTube, even if it means occasionally viewing slightly outdated information.
Latency minimization is essential for a smooth user experience, with videos starting to play as soon as possible after a user clicks on them.
A load balancer is necessary to distribute the massive traffic and video uploads across multiple application servers, ensuring no single point of failure.
Object storage, like AWS S3 or Google Cloud Storage, is used for storing the raw video files due to its efficiency with large files and built-in replication features.
NoSQL databases, such as MongoDB, are chosen for storing video metadata and user information due to their flexibility and performance advantages for read-heavy systems.
Denormalization in NoSQL databases allows for duplicate information to improve read performance, even though it may require additional writes when data changes.
Video encoding is an asynchronous process that requires a message queue and multiple workers to handle the large volume of uploads efficiently.
A CDN (Content Delivery Network) is crucial for distributing encoded videos geographically to reduce latency and improve the viewing experience for users worldwide.
Video streaming, as opposed to downloading, involves sending small chunks of video to the user, allowing for playback to begin without the entire file needing to be loaded.
YouTube initially used MySQL, a relational database, and later developed Vitess to handle sharding and scaling, showcasing innovation in overcoming the limitations of traditional database systems.
The design of YouTube's system involves a balance between read and write operations, with a focus on optimizing for the more frequent and critical read operations.
For video streaming, the HTTP protocol is used, leveraging the reliability of TCP to ensure video chunks are delivered without loss or corruption.
The evolution of YouTube's backend architecture, from MySQL to Vitess, demonstrates the resourcefulness and adaptability required in addressing scaling challenges.
Transcripts
let's design the high-level architecture
of a YouTube style application by the
way this video is taken from my system
design interview course which you can
check out on neetcode.io now let's
design YouTube first let's go over the
background even though I'm sure you're
familiar with how YouTube and other
types of video sites like Netflix and
others work compared to Netflix YouTube
is a bit different in that users can
actually upload videos and it's actually
free as well pretty much anyone can
upload videos and of course if we can
upload videos we can also choose to
watch videos as users and when it comes
to YouTube This is actually the core
functionality though that doesn't mean
it's simple to implement there's a ton
of complexity with reaching the scale
that YouTube does with even just these
two features but this is actually not
all that YouTube is capable of doing
there's a lot of data you can obviously
search for videos you can obviously have
videos recommended and you know
designing that recommendation system
could be its own design problem but even
then it would not be able to be
fully described and designed in a 45
minute interview of course and users can
of course you know comment and interact
with videos by liking or disliking them
and there's a ton of probably analytics
that goes on with reporting views and
I'm sure there's like bot prevention
with like comments even though there's a
ton of bots in the comments lately
and advertising and you know the list
could go on and on the point I'm trying
to make is that when it comes to an
ambiguous kind of design proposal like
this there's many different directions
we could go in and of course we can't
explore all of them so then moving on to
the functional requirements let's say
that the main features that we want to
focus on are going to be uploading
videos from a user's perspective and
then watching videos from a user's
perspective so these are the two main
functional requirements we want to focus
on if we have time at the end maybe we
can kind of explore how we can extend
our design to handle additional
functionality but you know these are the
main things that we want to be able to
focus on when it comes to non-functional
requirements a first one that comes to
mind is reliability when it comes to
videos you would never want to run into
an issue where somebody uploads a video
and then that video is somehow corrupted
or deleted we definitely don't expect
that when we're storing something on
YouTube even though it's free we
wouldn't want a video to just disappear
so we really need the videos to be
extremely reliable at least in terms of
storage and talking about the scale that
we're going to be handling even a single
video can have potentially thousands of
concurrent viewers so that's what we
have to kind of keep in mind of course
we're going to have a ton of users let's
assume that we're designing YouTube to
handle a billion daily active users
which is about accurate I think now when
it comes to these users let's say that
each user is watching five videos per
day but the upload ratio is going to be
a hundred users are watching videos for
every one user uploading a video or you
know this is the ratio of reads versus
writes for videos so if each of a billion
users watches five videos per day we have
five billion videos being watched per
day if the ratio is a hundred to one
that means one percent of five billion
is going to be the number of videos
uploaded per day so that is going to be
50 million videos uploaded per day so a
massive amount of throughput now the
good thing is among these 50 million
videos most of them probably aren't
going to be getting a ton of views if I
had to guess I bet you know the top five
percent of videos account for like 90 percent of
the views but this is just off the top
of my head but I think we can kind of
design this in a way that we assume that
you know most videos will not be getting
views though they do still have to be
you know stored and we can't let them
get deleted now in most cases doing a
bunch of complex math isn't super
important it's about coming to the right
conclusions which we kind of are we also
have to keep in mind that when it comes
to availability we definitely want to
favor availability over consistency what
do I mean by that well every time you go
on YouTube and you refresh the YouTube
home page and you want to see like a
bunch of videos on your home page every
time you make that request every time
you refresh you should get a correct
response you should get an HTTP 200
response and things should load and it's
okay if we have to sacrifice consistency
to achieve that what do I mean by that
well what if you're in your subscription
feed and somebody just uploaded a new
video somebody you're subscribed to just
uploaded a new video one second ago and
you just refreshed your home page you
see a bunch of videos in your
subscription feed but none of them
appear to be the one that was just
uploaded a second ago hypothetically
this could happen if we have multiple
storage systems and one of the storage
systems that you happen to be reading
from when you refreshed the page was
this one but this one did not have the
most up-to-date data this one did not
have the new video uploaded but this one
did but eventually that video will be
replicated to the other storage but it
just takes a few seconds so you're
getting stale data our data storage is
not favoring consistency it's favoring
availability and the worst thing that
would happen in this case is most likely
you'd have to wait a little bit longer
maybe when a new video is uploaded you
have to wait five seconds before you can
actually see it or maybe in the worst
case something like 10 seconds but is
that really that big of a deal I think
it would be a lot worse if you refresh
the page and it didn't return anything
to you at all and lastly we want to
obviously minimize the latency as much
as possible when you click to watch a
video ideally it should start playing
immediately even if the entire video
isn't loaded and if we have a good
internet connection we shouldn't have to
experience any buffering or waiting for
the video to load now let's start with
the high level design and I'm going to
start with the user journey of uploading
a video because uploading is probably
going to be more complicated than
actually watching a video and this will
probably give us a better sense of the
infrastructure that's going to be
involved with our design now since we're
dealing with such a massive scale 50
million uploads per day we probably
can't handle that with a single server
so we would most likely have a load
balancer sitting in between a bunch of
application servers so that we can kind
of scale this horizontally now this is a
pretty generic thing for now let's just
assume that how we kind of do this
doesn't really matter whether you know
the user hits this application server or
this one so for now I'm going to
simplify our design and just kind of
draw it like the user is making an
upload request to the application server
even though under the hood we know it's
of course going to need to be load
balanced now even the act of uploading a
video is not as simple as it might sound
what happens if there's a short like
internet connection breakage like even
for just a second and we're uploading a
file that's like over a gigabyte we were
already halfway through but now would we
have to restart over or could we start
where we left off let's assume that this
is not the direction that we want to go
in and we can just kind of say that once
the video is uploaded it's going to be
stored in some object storage and let's
say that this is where we're going to
store the raw files that the user
uploads but firstly the reason we're
using an object store is because
that's a lot better for storing media
and large files like videos we probably
don't need to store that in like a
relational database for example and also
object storage typically you know
something like AWS S3 or Google Cloud
Storage they kind of handle that
replication for us so we can kind of
safely assume that if we store something
in an object store we don't have to
worry about it being deleted and that's
generally how Cloud file storage works
like things like Google Drive are
actually built on top of object storage
so at a high level we can safely assume
that we have our reliability covered now
storing the videos here is fine but what
about the actual metadata associated
with every video going over what the API
for uploading would look like it would
obviously have like a title and a
description and like the actual like
video content itself maybe something
like an mp4 that's what's going to
actually be stored in object storage and
you know there could be a bunch of other
things that we store you know things
like tags and stuff like that but this
isn't really the important part knowing
like every single field that you would
want to store with a video but most
importantly we also want to associate
every single video with a user because
remember every time you you know go on
YouTube and you watch a video underneath
there is usually like the profile
picture and the username of the person
who actually uploaded it this isn't like
Netflix where you just have you know
shows on YouTube people are actually
creating the videos the content creators
so you know every time we want to
actually show a video to a user we're
gonna have to join that video with the
user information and the video metadata
of the video itself and like the person
who created it of course so long story
short every time we upload a video we're
also going to be storing metadata
associated with that video and we're
also going to be storing user
information in this database and I'm
choosing to do a NoSQL database because
we're going to have so many videos
uploaded and we're probably going to need to
read this metadata very frequently in
this database itself we can store a
reference to the video file in the
object store and that should be fine now
let's say for the NoSQL we're using
something like MongoDB which if you
don't recall it doesn't store things
in tables and rows like a SQL database
it stores things kind of like in a JSON
format the terms are a collection we can
have a collection of documents and a
document is pretty similar to a JSON
object it's very flexible so let's say
you know one collection is videos every
video document will have all the
information about a single video that we
need and we also have another collection
for a user and you know all their
information you might be thinking if a
user you know wants to watch a certain
video don't we have to then perform a
join with a user well not necessarily
with NoSQL databases like MongoDB we can
have our data denormalized is the
correct term normalized in SQL is
basically like you don't store duplicate
data you have Separate Tables and then
if you want to aggregate or combine
information you can join those tables
but in MongoDB you don't have to do that
we can actually store duplicate
information so in every video we
actually would store the relevant user
information like we know when a user
goes on YouTube and wants to watch
a YouTube video they kind of see that
profile picture of the user like that's
one example that I'm going to be talking
about right now well that profile
picture is probably also going to be
stored in object storage somewhere so
that profile picture will probably have
a reference to it in the user document
but we'll also actually have it stored
in every video document of you know the
creator of that video so we'll have
duplicate references to it but that's
okay in NoSQL because it at the very
least does improve performance we
don't have to perform joins now the
question is what happens if a user
actually updates their profile picture
yes we'd have to update you know the
user document but then we have to update
every single video document where that
person created a video and maybe they
have 100 videos or maybe they have a
thousand videos we'd have to update all
of those documents and in this case
that's okay because first of all they're
probably not going to be updating their
profile picture very frequently you know
uploading a video is probably more
frequent and watching a video is more
frequent so that's kind of what we're
favoring here reads over writes but also
if they update their profile picture you
know we can kind of update all of those
video documents asynchronously we don't
have to do it immediately is it going to
be the end of the world if somebody sees
an old profile picture from this user
for a few minutes or maybe even an hour
probably not so these are some details
that we could kind of discuss this is
probably not you know high level so
let's kind of continue with the rest of
our design now when it comes to videos
encoding is actually a big part of it as
users upload videos like raw video files
to YouTube YouTube does a lot of video
encoding and compression to get the size
of those videos down and encoding a
video is not something that can happen
like in one second this is definitely an
asynchronous task so it can take on the
order of minutes to typically uh encode
files and if they're you know really
large files I think YouTube will allow
you to even upload like a 24 hour video
file it can probably take hours to do
the encoding for that which is the
reason why we are using a message queue
for that now there's a lot of domain
knowledge that would be needed to
understand video encoding and that's
not what we really want to dive into so
let's just keep it high level as raw
video files are uploaded we're going to
be storing them but we're also going to
be adding them to a queue so that they
can be sent to another service which is
going to be handling the encoding and
it's probably not going to be you know a
single server that's going to be doing
that we're probably going to have a ton
of servers to do that after the videos
are encoded they are going to be stored
in object storage because you know
they're still videos we probably still
want to store them in object storage to
make sure that they are reliably
stored and replicated and videos are
immutable so we don't really need like a
Hadoop file system or something like
that object storage is probably good
enough we're not going to be you know
updating a video we'll be updating like
metadata associated with it but you know
with a video we're either gonna upload
it we might delete it but that's pretty
much it you're not going to be editing
the video now this is how a video can be
uploaded but what about actually
watching a video well we want the reads
to be as fast as possible we want the
latency to be as low as possible so
anywhere we can kind of add caching is
going to be really really helpful we
know users aren't going to be reading
you know raw video files they're going
to be reading encoded video files and we
probably want to have these distributed
around the world but also to have the
videos stored as close as possible to
end users we can have a CDN service
which you know does exactly that it
distributes static files geographically
and so when a user wants to watch a
video the video file itself is going to
be loaded via the CDN which is going to
be pulling from the object storage but
the user can fetch like the actual
metadata associated with a video from a
database but to actually speed that up
because probably we know that a small
amount of videos are going to be getting
the most amount of views we can probably
add a cache in front of our database and
that cache of course is going to be an
in-memory cache that's the whole point
of speeding it up because disk is of
course slower than memory but this can
probably not store every video that we
need so we'll have to have you know
some way to kick videos off most likely
newer videos are going to be getting
more views so we can probably have like
an LRU cache implemented here so now
finally let's actually start digging
into some of the details and the first
thing I actually want to talk about is
this encoding part over here more
specifically we talked about we could
have 50 million videos uploaded per day
so my question is how many workers here
assuming that they can actually encode
the videos in parallel which you know
this is a pretty easy service to scale
horizontally at least at the high level
I'm not saying you know video encoding
is an easy topic to understand but
assuming that at a high level you know
one worker can encode one video at a
time so if one person uploads multiple
videos or you know 10 people are
uploading videos at the same time
they'll be added to the queue and then
they'll reach the encoding service
before they're actually encoded and
written to storage but the point is that
multiple videos can be encoded at the
same time there's no like dependency or
anything like that so if we have 50
million uploads per day and assuming
that every video takes one minute to
encode which is probably too small it
would probably take longer than that on
average but let's say you know these
workers are really really good they have
really good resources and maybe most
videos that we upload are going to be
pretty short so in terms of capacity
planning how many workers would we need
here well 50 million uploads per day
that's assuming roughly 100,000 seconds in a day we
can divide that by a hundred thousand I
think we get to roughly 500 videos per
second being uploaded
so you know the first thing on your mind
would be well can we just have 500
workers no that's pretty naive because
remember we said that it takes one
minute to upload or to encode every
video on average let's say so if we only
had 500 workers and in the first second
we have 500 videos uploaded okay each of
those workers is encoding a single video
now one more second goes by and we have
500 more videos uploaded but every
worker is busy so we add those 500 to
the queue and then another second goes
by and we add 500 more and you know that
this keeps happening until one minute
has passed and then finally these 500
are done and we can store them and now
the workers can get 500 more videos but
by this point our queue would be
backlogged pretty hard at this rate we
would never get through the backlog so
we need more than 500 workers if you do
the math there's 60 seconds in a minute
so multiply 500 by 60 you'd get to 30k
workers and this is roughly the answer I
personally would be looking for now with
video encoding it's probably pretty hard
to get an accurate estimate and I'm not
sure if you know one worker can actually
handle multiple videos at once maybe
that's the case but the important thing
I would be looking for if I ask you this
is that you know we definitely need more
than 500 workers we need more than how
many videos are going to be uploaded in
a second that's for sure now another
interesting thing about this problem is
actually watching a video let's talk
about some details on how this can be
optimized and the best way to do so is
by looking at an example so right now
I'm on YouTube on my channel
specifically I'm going to go ahead and
open up our Dev tools and we're going to
be focusing on the network tab I'm also
going to filter this on xhrs and I'm
going to click one of these shorter
videos You'll see why in just a second
so first thing you see here is this is
how much of the video has buffered you
can see this portion of the video has
buffered when we watch a YouTube video
we don't need to wait for the entire
video to download before we watch it
we're going to be starting at the
beginning presumably we only need the
beginning to be loaded but watch what
happens when I click over here if I skip
to this part of the video well
it just kind of loaded a little bit so
now I'm going to skip over here watch
what happens see it kind of immediately
buffers so that's what we want to do we
don't need the entire video to be loaded
but it's true that some people might be
skipping around they might skip around
to this part which seems to be popular
and that you know this part of the video
has not loaded only this part has so
what's gonna happen when I click here
well that part got loaded and what's
actually happening here if we scroll
down in the request the most recent
request is a request to actually load
that portion of the video we are not
using a dedicated streaming protocol to do this we're
actually making HTTP requests to load
chunks of the video I'm going to kind of
expand this here you can see a request
was made here and what the response was
looks like gibberish to us because this
is actually you know that portion of the
video and going back to the headers when
you scroll down to the response headers
you can see that okay actually this was
not the video this content type is audio
so I'm going to hit the second one over
here actually and scrolling down to the
header and then looking at the content
type we see that this one was the video
so actually it looks like the audio is
being fetched separately from the video
so this when we look at the response is
probably the video before we were
probably looking at the audio not that
it looks any different to us and I'm
going to go ahead and refresh this and
do it one more time so we can see
pausing this a portion of the video has
loaded here so we're going to scroll
down to the requests we can see the
video playback requests are the ones
that are actually loading the video
itself and as I click here a new chunk of
video was loaded so let's scroll all the
way down to see that one over here there are
multiple requests and you can see
that some of them are larger than others
but the point is that one megabyte of
data is easier to transfer than the
entire video which might be I don't know
like 20 or 30 megabytes and this is the
technique to lower latency loading a
video via smaller chunks now while
rendering and loading videos is also a
domain knowledge heavy topic I still
think it's worth mentioning because that
technique that we kind of just went over
small chunks of video it's a pretty
simple concept to understand at the high
level we don't need to send the entire
video to the user before they can start
watching it we can just send them small
chunks of the portion that they're
actually watching now another relevant
question would be what protocol should
we use for sending videos and by the way
what we just talked about is called
video streaming not necessarily live
streaming because we know that video is
already stored it's not like a live feed
but the video is being streamed meaning
it's being sent in small chunks versus
like when you actually download a video
that's not streaming that's like taking
the entire file that's stored and then
sending it to your computer and then
storing it whereas video streaming I
believe those small chunks are actually
stored in your computer's memory which
is also why you would not want the
entire video to be taking up all of your
memory so most likely there is some
client-side code that is handling that
and freeing memory because it's pretty
easy to write client-side JavaScript
that will crash your browser and take up
all your memory so that's kind of
something as a front-end developer you
might want to keep in mind because if we
were watching like a 10 hour long video
which definitely exists on YouTube we
would not need the entire video to be
buffered in our memory we could just
skip around the video but going back to
what protocol we might want to use for
this since we want latency to be as low
as possible you might favor at a high
level there's you know two protocols UDP
and TCP you might favor UDP for video
streaming now that's probably a better
choice for video live streaming because
you know as like a sports game is going
on if you miss one second of it you
don't want to go back to that one second
you want to keep up with the most
up-to-date information so you want to
see you know what's happening in real
time that's what you would want if
you're live streaming something or
watching a live stream that's what UDP
favors but with an actual video we know
that video is stored somewhere and we
want to watch the entire video if you're
watching like a movie or something on
YouTube you don't want like to miss you
know two seconds of it because that
might be like the actual plot point so
TCP is favored for reliability that will
ensure that we get the entire video
there's not going to be any missing gaps
in the video and sure it might take
longer to do that but as long as we send
it in small chunks it should be okay and
that's exactly what we saw was happening
with YouTube it was sending HTTP
requests which are built on top of
TCP so I think that is kind of also
another important question in the
context of YouTube compared to a lot of
like other system design problems and
also there's a lot of other things we
could explore with this especially when
it comes to uploading videos keeping
things at a high level we'd probably
want to rate limit this we don't want
somebody to just be able to upload like
an infinite number of videos or you know
that could be implemented in the load
balancer itself which we kind of like
omitted from this design but we know it
does exist also when it comes to
recommendations for YouTube videos or
even searching we'd probably want to
have you know other auxiliary services
which read from our metadata and we
probably want to store like a history of
what types of videos does this person
watch what types of videos do they like
so we can kind of build some
recommendation for them and for you know
searching videos that could be its own
topic like that's kind of like designing
Google search because there's a lot of
like indexing you can do you probably do
want to incorporate recommendations with
searching as well but also you know you
want to have like the metadata like the
description the title how many views
does a video have which videos are most
relevant when searching which types of
strings we could have some autocomplete
with that and those most likely would be
built on top of this or those would be
built separately from this kind of core
functionality now one last thing I
wanted to cover because I think it's
always interesting to understand how
this type of service was actually built
what YouTube actually did was not use a
NoSQL database they actually used
MySQL which is a relational database
management system now you might be
wondering why didn't they use NoSQL and
I definitely don't know the details one
guess I have is that YouTube was
actually first created in 2005
and MongoDB did not exist at that point and
they probably didn't need to handle the
same scale that they do right now but as
time went on they found that they did
need to scale their database I think
what they first did was add read-only
replicas because of course this is a
read-heavy system so reading is going to
be more common than actually you know
uploading new videos but even then they
ran into issues and next they tried to
add sharding and so they sharded their
MySQL database and they ended up having
a lot of complex code in their
application server which properly routed
user requests to the correct shard
I'm not exactly sure which shard key
that they used but that's what they did
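To make "routing logic in the application server" concrete, here is a minimal sketch of hash-based shard routing. Everything specific in it is an assumption for illustration: the video says the actual shard key is unknown, so `user_id` is a stand-in, and the host names are hypothetical. It is not a description of YouTube's code.

```python
import hashlib

# Hypothetical shard hosts; a real deployment would load these from configuration.
SHARD_HOSTS = [
    "mysql-shard-0.internal",
    "mysql-shard-1.internal",
    "mysql-shard-2.internal",
    "mysql-shard-3.internal",
]


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a shard key to a shard index using a hash that is stable across processes.

    hashlib is used instead of the built-in hash(), which is randomized per
    process for strings and would route the same key differently on each server.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be > 0")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def host_for(user_id: str) -> str:
    """Return the database host that owns this user's rows (assumed shard key)."""
    return SHARD_HOSTS[shard_for(user_id, len(SHARD_HOSTS))]


if __name__ == "__main__":
    print(host_for("user-123"))
```

With this approach every query path in the application has to call something like `host_for` first, and changing the number of shards remaps most keys, which hints at why this kind of code gets complex when it lives in the application layer.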
and then eventually the long-term
solution they found was building a
new engine it's called Vitess I'm
actually not sure how it's pronounced
but this was something
that was created at YouTube and this is
basically to decouple the application
layer from the database layer the
application layer should not have to
know about how the database is sharded
so Vitess was added like as a middle
layer in between the application servers
and the database at least at a high
level and that is kind of where all the
logic for sharding and routing the
requests correctly lives and this is
kind of how they were able to take even
a relational database like MySQL and
scale it up now maybe if they could go
back in time they would have started
with a NoSQL database in the first place
or maybe you know some other type of
database but they did find a way to get
MySQL to work and actually Vitess
was later open sourced and it's actually
a very popular project that's still
being used it's very modern it's very
very powerful it's being used by new
companies like PlanetScale which are
taking you know MySQL and then
adding Vitess to it and just you know
selling that as a product and of course
adding more functionality but this kind
of shows you when you reach problems in
distributed systems that can kind of
breed a lot of ingenuity and
resourcefulness and you can kind of
overcome a lot of limitations that you
know we would look at and say oh MySQL
if we're dealing with a lot of read
scale and eventual
consistency is fine we can just use a
NoSQL database but they found MySQL to
work and if you found that interesting
you can kind of read a brief history of
YouTube and MySQL and Vitess
in the Vitess docs here and
probably other places on the internet
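Going back to the encoding-worker estimate from earlier in the video (about 500 uploads per second and roughly one minute per encode), the arithmetic can be checked with a small sketch. It assumes each worker encodes exactly one video at a time and that every encode takes the same time, which the video itself flags as rough. The first function is Little's law (items in progress = arrival rate × time per item); the second is a hypothetical second-by-second simulation of the queue.

```python
import math
from collections import deque


def workers_needed(uploads_per_second: float, encode_seconds: float) -> int:
    """Little's law: videos being encoded at once = arrival rate * time per encode."""
    return math.ceil(uploads_per_second * encode_seconds)


def backlog_after(seconds: int, uploads_per_second: int,
                  encode_seconds: int, workers: int) -> int:
    """Simulate the queue one second at a time and return how many videos are still waiting.

    Assumption: one video per worker at a time, and a fixed encode time.
    """
    waiting = 0
    free = workers
    in_progress = deque()  # (finish_time, worker_count) batches, oldest first

    for now in range(seconds):
        # Release workers whose encodes have finished by this second.
        while in_progress and in_progress[0][0] <= now:
            free += in_progress.popleft()[1]
        # New uploads arrive and join the queue.
        waiting += uploads_per_second
        # Hand as many queued videos as possible to free workers.
        started = min(free, waiting)
        if started:
            free -= started
            waiting -= started
            in_progress.append((now + encode_seconds, started))
    return waiting


if __name__ == "__main__":
    print(workers_needed(500, 60))             # 30000
    print(backlog_after(600, 500, 60, 500))    # 295000 videos still queued after 10 minutes
    print(backlog_after(600, 500, 60, 30000))  # 0
```

With only 500 workers the queue grows by 29,500 videos every minute, while 30,000 workers keep it empty, which matches the 500 × 60 estimate.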