Back-Of-The-Envelope Estimation / Capacity Planning
Summary
TLDR: This video script introduces back-of-the-envelope math as a valuable tool for system design, emphasizing its utility in quickly estimating and sanity-checking designs without the need for extreme precision. It explains how to estimate requests per second and queries per second using metrics like DAU and scaling factors. The script provides an example of calculating Twitter's peak tweet creation rate and offers tips for simplifying calculations with scientific notation. It also demonstrates how to estimate storage requirements for multimedia files in tweets, illustrating the process with made-up numbers. The video encourages a practical approach to system design, focusing on order-of-magnitude accuracy rather than exact figures.
Takeaways
- 📝 Back-of-the-envelope math is a valuable tool for system design, used to quickly sanity-check a design without needing absolute accuracy.
- 🔢 It's sufficient to be within an order of magnitude or two of the actual numbers when using this method for preliminary calculations.
- 💡 This math helps in understanding the scale of infrastructure needed, like the number of web servers or the necessity of database sharding or caching.
- ⚙️ The most common metric to estimate is requests per second at the service level or queries per second at the database level.
- 👥 Daily Active Users (DAU) is a key input, often estimated as a percentage of Monthly Active Users (MAU) if only MAU is available.
- 📈 Usage per DAU must be estimated, taking into account that not all active users will engage in the same way with the service.
- 📊 A scaling factor is necessary to account for peak usage times, which can be significantly higher than average traffic.
- ✂️ Simplifying calculations involves converting large numbers into scientific notation to minimize errors and facilitate quick mental math.
- 🧠 Memorizing certain large number conversions, like 10^12 representing a trillion or a terabyte, can expedite the back-of-the-envelope math process.
- 🗃️ Storage requirements can be estimated by calculating the volume of data generated, the proportion containing multimedia, average file sizes, replication factors, and retention periods.
- 📚 The script emphasizes that precision is less critical than being within an order of magnitude, which is typically adequate for design validation.
Q & A
What is back-of-the-envelope math used for in system design?
-Back-of-the-envelope math is used for quickly sanity-checking a design in system design, where absolute accuracy is not as important as getting within an order of magnitude or two of the actual numbers.
Why is absolute accuracy not crucial when using back-of-the-envelope math?
-Absolute accuracy is not crucial because it's usually sufficient to be within an order of magnitude or two of the actual numbers to make informed decisions about system design.
What are the two key insights gained from the example of a web service needing to handle 1M requests per second?
-The two key insights are that a cluster of web servers with a load balancer is needed, and approximately 100 web servers would be required to handle the load.
Why might a single database server be sufficient to handle the load for a while without sharding or caching?
-A single database server might be sufficient if the math shows that it only needs to handle a few queries per second at peak, indicating that it can manage the load without additional optimizations for a while.
What is the most useful metric to estimate when using back-of-the-envelope math for system design?
-The most useful metric to estimate is requests per second at the service level or queries per second at the database level.
How is Daily Active Users (DAU) typically obtained and related to Monthly Active Users (MAU)?
-DAU should be easy to obtain, but if only MAU is available, DAU is estimated as a percentage of MAU.
What is the significance of the usage per DAU estimate in system design calculations?
-The usage per DAU estimate helps determine the percentage of active users who will interact with the service, which is crucial for calculating the load on the system.
Can you explain the concept of a scaling factor in the context of back-of-the-envelope math?
-A scaling factor is used to estimate how much higher the traffic would peak compared to the average, reflecting the potential requests-per-second peak where the design could break.
How is the example of estimating Tweets created per second on Twitter calculated?
-The calculation involves multiplying the number of DAU by the average tweets per DAU, applying a scaling factor for peak times, and then dividing by the number of seconds in a day.
What technique is suggested for simplifying calculations in back-of-the-envelope math?
-Converting all big numbers to scientific notation is suggested, as it simplifies multiplication to addition and division to subtraction.
Why is it important to memorize certain conversions when performing back-of-the-envelope math?
-Memorizing certain conversions, like 10^12 being a trillion or a terabyte, helps in quickly converting and simplifying large numbers during calculations.
How does the example calculate the required storage for multimedia files in tweets?
-The calculation involves estimating the percentage of tweets containing multimedia, the average size of those files, the replication factor, the duration of storage, and then applying the appropriate mathematical operations.
What is the conclusion about back-of-the-envelope math in system design?
-Back-of-the-envelope math is a very useful tool in system design; it's important not to over-index on precision, as being within an order of magnitude is usually enough to inform and validate design decisions.
Outlines
📊 Back-of-the-Envelope Math for System Design
This paragraph introduces the concept of back-of-the-envelope math, a tool used by developers for quick sanity checks in system design. It emphasizes that the goal is not absolute accuracy but rather to be within an order of magnitude. The paragraph provides examples, such as determining the number of web servers needed based on request rates, and the decision-making process regarding database load. It also outlines the importance of estimating requests per second and queries per second, and discusses the common inputs for these calculations, including Daily Active Users (DAU), usage per DAU, and scaling factors for peak traffic. An example calculation for the number of tweets created per second on Twitter is given, illustrating the process of estimation using made-up numbers.
🔢 Simplifying Calculations with Scientific Notation
The second paragraph focuses on the technique of using scientific notation to simplify large number calculations in back-of-the-envelope math. It provides a method for quickly converting large numbers and emphasizes the importance of memorizing certain conversions, such as 10^12 representing a trillion or a terabyte. The paragraph also touches on the practical approach of ignoring the exact byte count in kilobytes for the sake of simplicity in this context. It concludes with an example calculation estimating the storage required for multimedia files in tweets, considering the number of tweets per day, the percentage containing multimedia, average file sizes, replication, and retention period. The summary includes the mathematical process and the resulting storage estimates in petabytes for pictures and videos.
Keywords
💡Back-of-the-envelope math
💡System design
💡Requests per second (RPS)
💡Queries per second
💡Daily Active Users (DAU)
💡Scaling factor
💡Scientific notation
💡Estimation
💡Sharding
💡Caching
💡Storage estimation
Highlights
Back-of-the-envelope math is a useful tool for system design, not requiring absolute accuracy but rather a rough estimate within an order of magnitude.
Experienced developers use this method to quickly sanity-check a design, understanding that precise numbers are less important than the order of magnitude.
An example provided shows how to estimate the number of web servers needed based on request rates, suggesting a need for clustering and load balancing.
Another example explains how to estimate database load and the potential need for sharding or caching based on queries per second.
Requests per second at the service level and queries per second at the database level are identified as the most useful metrics for estimation.
Daily Active Users (DAU) and Monthly Active Users (MAU) are key inputs for estimating service usage, with DAU often estimated as a percentage of MAU.
The usage rate per DAU is essential, with a suggested 10%-25% usage rate for services like Twitter.
A scaling factor is needed to account for peak traffic times, such as during commute hours for services like Google Maps.
An example calculation is provided for estimating the number of Tweets created per second on Twitter, using made-up numbers for illustration.
Techniques for simplifying calculations include converting large numbers into scientific notation to reduce errors and simplify the process.
Grouping powers of ten together and performing simple addition and subtraction instead of complex multiplication and division is recommended.
Memorizing handy conversions, such as 10^12 representing a trillion or a terabyte, is suggested for quick calculations.
An example is given for estimating storage requirements for multimedia files in tweets, considering replication and storage duration.
The example shows calculations for both pictures and videos in tweets, highlighting the difference in storage needs based on file size and popularity.
The conclusion emphasizes the importance of back-of-the-envelope math in system design, advocating for an order-of-magnitude approach over precision.
The video encourages viewers to learn more about system design through books and a weekly newsletter, offering resources for further education.
Transcripts
Back-of-the-envelope math is a very useful tool in our system design toolbox.
In this video, we will go over how and when to use it, and share some tips on using it effectively.
Let’s dive right in.
Experienced developers use back-of-the-envelope math to quickly sanity-check a design.
In these cases, absolute accuracy is not that important.
Usually, it is good enough to get within an order of magnitude or two
of the actual numbers we are looking for.
For example, if the math says at our scale our web service needs to handle 1M requests per second,
and each web server could only handle about 10K requests per second, we learn two things quickly:
One, we learn that we will need a cluster of web servers, with a load balancer in front of them.
Two, we will need about 100 web servers.
Another example is if the math shows that the database needs to handle about 10 queries per
second at peak, it means that a single database server could handle the load
for a while, and there is no need to consider sharding or caching for a while.
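The server-count reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not something shown in the video: the function name `servers_needed` is made up here, and the only inputs are the 1M and 10K requests-per-second figures from the example.

```python
import math

def servers_needed(peak_rps: float, per_server_rps: float) -> int:
    """Round up, since a fraction of a server still means one more machine."""
    return math.ceil(peak_rps / per_server_rps)

# Figures from the example: 1M requests/s at peak, 10K requests/s per server.
web_servers = servers_needed(1_000_000, 10_000)
print(web_servers)  # 100

# Anything above 1 implies a cluster behind a load balancer.
print("needs a cluster:", web_servers > 1)
```

The same helper applied to 10 queries per second returns 1, which matches the point that a single database server is enough for a while.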
Now let’s go over some of the most popular numbers to estimate.
The most useful by far is requests per second
at the service level or queries per second at the database level.
Let’s go over the common inputs in a requests-per-second calculation.
The first input is DAU, or Daily Active Users.
This number should be easy to obtain.
Sometimes, the only available number would be Monthly Active Users.
In that case, estimate the DAU as a percentage of MAU.
The second input is the estimate of the usage per DAU of the service we are designing for.
For example, not everyone active on Twitter makes a post.
Only a percentage does that.
10%-25% seems to be reasonable.
Again, it doesn’t have to be exact.
Getting within an order of magnitude is usually fine.
The third input is a scaling factor.
The usage rate for a service usually has peaks and valleys throughout the day.
We need to estimate how much higher the traffic would peak compared to the average.
This would reflect the estimated requests-per-second peak where
the design could potentially break.
For example, for a service like Google Maps,
the usage rate during commute hours could be 5 times higher than average.
Another example is a ride-sharing service like Uber, where weekend
nights could have twice as many rides as average.
Now, let’s go over an example.
We will estimate the number of Tweets created per second on Twitter.
Note that these numbers are made up, and they are not official numbers from Twitter.
Let’s assume Twitter has 300 million MAU, and 50% of the MAU use Twitter daily.
So that’s 150 million DAU.
Next, we estimate that about 25% of Twitter DAU make tweets.
And each one on average makes 2 tweets.
That is 25% * 2 = 0.5 tweets per DAU.
For the scaling factor, we estimate that most people tweet in the morning
when they get up and can’t wait to share what they dreamed about the night before.
And that spikes the tweet creation traffic to twice the average when the US east coast wakes up.
Now we have enough to calculate the peak tweets created per second.
We have:
150 million DAU times 0.5 tweet per DAU, times 2x scaling factor divided by 86,400 seconds in a day.
That is roughly 1,500 tweets created per second.
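As a rough sketch, the three inputs (DAU, usage per DAU, scaling factor) combine as follows. The function and parameter names are illustrative assumptions; the numbers are the made-up ones from the example.

```python
SECONDS_PER_DAY = 86_400

def peak_per_second(mau: float, dau_ratio: float,
                    actions_per_dau: float, peak_factor: float) -> float:
    """Peak rate = DAU x usage per DAU x scaling factor / seconds in a day."""
    dau = mau * dau_ratio                  # 300M MAU x 50% = 150M DAU
    daily_actions = dau * actions_per_dau  # 150M x 0.5 tweets per DAU
    return daily_actions * peak_factor / SECONDS_PER_DAY

# 25% of DAU tweet, 2 tweets each, so 0.5 tweets per DAU; 2x peak factor.
peak = peak_per_second(300e6, 0.5, 0.25 * 2, 2)
print(round(peak))  # 1736
```

The unrounded arithmetic gives about 1,736 per second. That is comfortably within the same order of magnitude as the 1,500 figure, which comes from rounding 86,400 seconds up to 100,000.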
Let’s go over the techniques to simplify the calculations.
First, we convert all big numbers to scientific notation.
Doing the math on really big numbers is very error-prone.
By converting big numbers to scientific notation,
part of the multiplication becomes simple addition, and division becomes subtraction.
In the example above, 150 million DAU becomes 150 times 10 to the sixth
or 1.5 times 10 to the eighth.
There are 86,400 seconds in a day,
we round it up to 100,000 seconds, and that becomes 10 to the fifth seconds.
And since it’s a division, 10 to the fifth becomes 10 to the minus fifth.
Next, we group all the powers of ten together, and then all the other numbers together.
So the math becomes:
1.5 times 0.5 times 2
And
10^8 * 10 ^(-5) = 10^(8-5) = 10^3
Putting it all together, it’s 1.5x10^3, or 1,500.
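The grouping trick can be written down as a small sketch: each number is a (mantissa, exponent) pair, mantissas are multiplied, and exponents are added. The pair representation and the name `combine` are assumptions made for illustration only.

```python
from math import prod

def combine(factors):
    """Multiply numbers given as (mantissa, power-of-ten) pairs.

    A division is expressed as a factor with a negative exponent.
    """
    mantissa = prod(m for m, _ in factors)
    exponent = sum(e for _, e in factors)
    return mantissa, exponent

factors = [
    (1.5, 8),   # 150 million DAU = 1.5 x 10^8
    (0.5, 0),   # tweets per DAU
    (2, 0),     # peak scaling factor
    (1, -5),    # divide by ~10^5 seconds in a day
]
m, e = combine(factors)
print(f"{m} x 10^{e}")  # 1.5 x 10^3
```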
Now with practice, we should be able to convert a large number to scientific notation in seconds.
And here are some handy conversions we should memorize:
As an example, we should know by heart that 10^12 is a trillion or a TB, and when we see a
number like 50TB, we should be able to convert it quickly to 5x10^1x10^12, which is 5x10^13.
We are going to ignore the fact that 1KB is actually 2^10 bytes,
or 1,024 bytes, and not a thousand bytes.
We don’t need that degree of accuracy for back-of-the-envelope math.
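A small lookup like the one below captures the kind of conversions worth memorizing. It is a sketch under the decimal-prefix simplification described above (1 KB treated as 10^3 bytes); the `POWERS` table and `to_scientific` helper are hypothetical names, not taken from the video.

```python
import math

# Decimal prefixes: thousand/KB, million/MB, billion/GB, trillion/TB, quadrillion/PB.
POWERS = {"K": 3, "M": 6, "G": 9, "T": 12, "P": 15}

def to_scientific(value: float, prefix: str):
    """Return (mantissa, exponent) for a value such as 50 with prefix 'T'."""
    shift = math.floor(math.log10(value))  # powers of ten inside the value itself
    return value / 10**shift, POWERS[prefix] + shift

print(to_scientific(50, "T"))   # (5.0, 13), i.e. 50 TB = 5 x 10^13 bytes
print(to_scientific(150, "M"))  # (1.5, 8), i.e. 150 million = 1.5 x 10^8
```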
Let’s wrap up by going through one last example.
We’ll estimate how much storage is required for storing multimedia files for tweets.
We know from the previous example that there are about 150M tweets per day.
Now we need an estimate on a percentage of tweets that could contain multimedia content,
and how large those files are on average.
With our meticulous research, we estimate that 10% of tweets contain pictures, and they are about
100KB each, and 1% of all the tweets contain videos, and they are 100MB each.
We further assume that the files are replicated, with 3 copies each,
and that Twitter would keep the media for 5 years.
Now here is the math.
For storing pictures, we have the following:
150M tweets x 0.1 in pictures x 100KB per picture x 400 days in a year x 5 years x 3 copies
So, that turns into:
1.5x10^8 x 10^(-1) x 10^5 x 4x10^2 x 5 x 3
Again, we group the powers of tens together.
This becomes:
1.5 times 4 times 5 times 3,
which is 90
and 10 to the (8-1+5+2), which is 10^14
And that becomes 9x10^15, which is, from the table, 9 PB.
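The picture-storage estimate can be checked with a short sketch. All inputs are the made-up figures from the transcript, including the 150M tweets per day it states (the earlier example's 150 million DAU x 0.5 tweets per DAU would give 75 million on average, which does not change the order of magnitude) and the year rounded to 400 days. The function name `storage_bytes` is an assumption.

```python
def storage_bytes(items_per_day: float, share_with_media: float,
                  bytes_per_file: float, days_per_year: float,
                  years: float, copies: int) -> float:
    """Total bytes retained over the period, across all replicas."""
    return (items_per_day * share_with_media * bytes_per_file
            * days_per_year * years * copies)

PB = 1e15  # decimal petabyte, matching the 10^15 convention used here

pictures = storage_bytes(150e6, 0.1, 100e3, 400, 5, 3)
print(f"{pictures / PB:.1f} PB")  # 9.0 PB

# With 365 days instead of the rounded 400, the answer stays in the same ballpark.
print(f"{storage_bytes(150e6, 0.1, 100e3, 365, 5, 3) / PB:.1f} PB")  # 8.2 PB
```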
For storing videos, we take yet another shortcut.
Since videos on average are 100MB each while pictures are 100KB,
a video is 1000 times bigger than a picture on average.
Second, only 1% of tweets contain a video, while pictures appear in 10% of all the tweets.
So videos are one-tenth as popular.
Putting the math together, the total video storage is 1000 x 1/10 of picture
storage, which is 100 x 9PB, or 900 PB.
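The ratio shortcut for videos can be sketched the same way: scale the picture total by how much bigger and how much less popular videos are. The variable names are illustrative, and the 9 PB starting point is the picture estimate from the transcript.

```python
picture_storage_pb = 9  # picture estimate from the transcript, in PB

size_ratio = 100e6 / 100e3      # 100 MB video vs 100 KB picture = 1000x bigger
popularity_ratio = 0.01 / 0.10  # 1% of tweets vs 10% of tweets = one-tenth as common

video_storage_pb = picture_storage_pb * size_ratio * popularity_ratio
print(round(video_storage_pb))  # 900
```

Reusing an earlier result through ratios avoids redoing the full multiplication, which is the point of the shortcut.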
In conclusion, back-of-the-envelope math is a very useful tool in our system design toolbox.
Don’t over-index on precision.
Getting within an order of magnitude is usually enough to inform and validate our design.
If you would like to learn more about system design, check out our books and weekly newsletter.
Please subscribe if you learned something new.
Thank you so much, and we’ll see you next time.