Latency Numbers Programmer Should Know: Crash Course System Design #1

ByteByteGo

4 Oct 2022 · 06:22

Summary

TL;DR: This video gives an overview of latency numbers in system design, focusing on relative orders of magnitude rather than exact figures. It walks through latency ranges from sub-nanosecond (e.g., CPU register access) to several seconds (e.g., large network transfers). Key examples include cache access, system calls, disk seek times, and network round trips, which show how widely system components and operations differ in speed. With these latencies in mind, engineers can design systems that avoid bottlenecks and perform better.

Takeaways

😀 Latency numbers are essential in system design to understand the relative speed differences between various operations.
😀 Absolute accuracy isn't the goal; it's more important to develop an intuition for the orders of magnitude of latency.
😀 Latency can be categorized into various ranges, from sub-nanoseconds (CPU registers) to seconds (network transfer within the cloud).
😀 Sub-nanosecond range: Accessing a CPU register takes less than a nanosecond, as does a single clock cycle on a modern CPU. Registers are extremely fast but few in number.
😀 1 to 10 ns: L1 and L2 cache accesses are fast, and some costlier CPU operations, such as a branch mispredict penalty, also fall in this range.
😀 10 to 100 ns: L3 cache access and main memory access fall into this range, with memory access being several hundred times slower than CPU registers.
😀 100 ns to 1 µs: A system call on Linux takes several hundred nanoseconds, and an MD5 hash of a small input takes around 200 ns.
😀 1 to 10 µs: Context switching and copying small chunks of data within memory takes a few microseconds.
😀 10 to 100 µs: Operations like network proxy processing and SSD reads fall into this range. An SSD read of an 8 KB page takes about 100 µs, at the top of the range.
😀 1 to 10 ms: Inter-zone network round trips in cloud providers, as well as hard disk seek times, are typically in this range.
😀 100 ms to 1 s: Slower operations like bcrypt password hashing and TLS handshakes fall in this range. A TLS handshake typically takes 250 ms to 500 ms, depending on the distance between the machines.
😀 Operations taking over 1 second are relatively rare, but an example is transferring 1GB over the network, which can take around 10 seconds within the same cloud region.
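
The ranges above can be sketched as a small lookup that places a latency, given in nanoseconds, into its order-of-magnitude bucket. This is only an illustration: the names RANGES, latency_range, and SAMPLES are hypothetical, and each upper bound is treated as inclusive (an assumption) so that a 100 µs SSD read lands in the 10 to 100 µs range, as in the list above.

```python
# Upper bound of each range in nanoseconds, smallest first.
# Bounds are inclusive (an assumption made for this sketch).
RANGES = [
    (1, "sub-nanosecond"),
    (10, "1 to 10 ns"),
    (100, "10 to 100 ns"),
    (1_000, "100 ns to 1 µs"),
    (10_000, "1 to 10 µs"),
    (100_000, "10 to 100 µs"),
    (1_000_000, "100 µs to 1 ms"),
    (10_000_000, "1 to 10 ms"),
    (100_000_000, "10 to 100 ms"),
    (1_000_000_000, "100 ms to 1 s"),
]


def latency_range(ns):
    """Return the label of the first range whose upper bound covers ns."""
    for upper_ns, label in RANGES:
        if ns <= upper_ns:
            return label
    return "over 1 s"


# Sample figures from the takeaways, converted to nanoseconds.
SAMPLES = {
    "MD5 hash": 200,
    "SSD read, 8 KB page": 100_000,
    "1 GB transfer, same cloud region": 10_000_000_000,
}

for name, ns in SAMPLES.items():
    print(f"{name}: {latency_range(ns)}")
```

The point of the lookup is the same as the video's: the bucket matters more than the exact figure inside it.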

Q & A

Why is it important to understand relative latency numbers in system design?
-Understanding relative latency numbers helps in system design by providing a sense of the differences in speed between different operations, which is more important than knowing exact values. It allows for better decision-making in optimizing system performance.
How do latency numbers change over time and why are some more stable than others?
-Latency numbers like disk seek times have changed drastically due to technological advancements, while others, like network latency between countries, stay relatively stable because they are constrained by the laws of physics.
What are the key latency ranges covered in the video?
-The video covers latency ranges from sub-nanoseconds to seconds, grouped by orders of magnitude. These include sub-nanoseconds (CPU registers), 1–10 nanoseconds (L1/L2 cache), 10–100 nanoseconds (L3 cache), and ranges up to several seconds for tasks like large file transfers.
What is the typical latency for accessing CPU registers?
-Accessing CPU registers is in the sub-nanosecond range, making it one of the fastest operations in a system.
How much slower is main memory access compared to CPU register access?
-Main memory access on modern CPUs is several hundred times slower than accessing CPU registers, typically falling in the 10 to 100 nanosecond range.
What is the cost of a system call on Linux in terms of latency?
-A simple system call in Linux takes several hundred nanoseconds, just for the trap into the kernel and back. This does not include the execution time of the system calls themselves.
What operations occur in the 1 to 10 microseconds latency range?
-Operations in this range include a context switch between Linux threads and copying 64 KB from one memory location to another. They are at least a thousand times slower than accessing a CPU register.
What is the read latency for an SSD, and where does it fit in the latency ranges?
-The read latency of an SSD is around 100 microseconds for an 8 KB page, which places it at the top of the 10 to 100 microsecond range.
How does the latency of SSD writes compare to SSD reads?
-SSD write latency is approximately 10 times higher than read latency, which puts it at the upper end of the 100 to 1000 microsecond (0.1 to 1 millisecond) range.
How long does it take to transfer 1GB of data over the network within the same cloud region?
-Transferring 1 GB of data over the network within the same cloud region typically takes about 10 seconds, which places it in the over-1-second range.
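
As a rough worked example, the 1 GB in 10 seconds figure implies an effective throughput that can be used for back-of-the-envelope estimates. The sketch below assumes decimal units (1 GB = 10^9 bytes) and a constant transfer rate. The function names are hypothetical, and the 5 GB case is an extrapolation, not a figure from the video.

```python
def implied_throughput(num_bytes, seconds):
    """Return (megabytes per second, gigabits per second) for a transfer."""
    bytes_per_second = num_bytes / seconds
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e9


def transfer_seconds(num_bytes, mb_per_s):
    """Estimate transfer time at a constant rate given in MB/s."""
    return num_bytes / (mb_per_s * 1e6)


# 1 GB (assumed to be 10^9 bytes) in 10 seconds, as quoted above.
mb_per_s, gbit_per_s = implied_throughput(1e9, 10)
print(f"{mb_per_s:.0f} MB/s, {gbit_per_s:.1f} Gbit/s")  # 100 MB/s, 0.8 Gbit/s

# Extrapolation at the same rate: a hypothetical 5 GB transfer.
print(f"{transfer_seconds(5e9, mb_per_s):.0f} s")  # 50 s
```

Like the latency numbers themselves, this is an order-of-magnitude estimate. Real transfers vary with instance type, protocol overhead, and congestion.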