What happens when you type google.com into your browser and press enter? (Detailed Analysis)
Summary
TLDR: In this comprehensive video, software engineering enthusiast Hussain delves into the intricate processes occurring when you enter 'google.com' into your browser and press enter. Emphasizing the networking and software engineering perspectives, Hussain breaks down the journey into eight detailed parts, excluding keyboard events and low-level operating system details. Starting with the initial URL input and moving on to DNS queries and TCP connections, he explains the protocols, ports, and secure communication steps involved. Highlighting aspects like HTTPS, HSTS, DNS over HTTPS, and the mechanics of TCP/IP communication, Hussain provides a deep dive into the technical underpinnings that make web browsing possible on a brand-new machine with the latest browser versions.
Takeaways
- The video explains the complex process that occurs when you type 'google.com' into a browser and press enter, focusing on the networking and software engineering aspects.
- It breaks down the process into eight detailed components, starting from the initial typing of the URL to the final rendering of the webpage.
- The first step involves the browser's autocomplete feature that predicts the URL you're typing based on your browsing history.
- Determining the protocol and port to connect to is crucial: when no scheme is typed, the browser uses HTTPS (port 443) if the site is on its HSTS list and otherwise falls back to HTTP (port 80).
- DNS resolution is highlighted as a complex step, where the browser must find the IP address associated with 'google.com' to establish a connection.
- The video covers the transition from unencrypted to encrypted DNS queries, including DNS over HTTPS (DoH) and DNS over TLS (DoT), emphasizing privacy concerns.
- Establishing a TCP connection through the three-way handshake is an essential step: it provides the reliable channel on which secure communication is then set up.
- TLS (Transport Layer Security) setup is a key phase for encrypting data before transmission, ensuring secure communication between the client and server.
- The HTTP request, including GET requests for webpage resources, is sent over this secure channel, demonstrating how web content is requested and received.
- The final steps involve parsing and rendering the webpage content in the browser, including executing JavaScript, applying CSS, and displaying images.
Q & A
What is the main focus of the video?
-The main focus of the video is to explain what happens under the hood when you type google.com into your browser and hit enter, with a particular emphasis on networking aspects and software engineering.
Why were certain low-level details, like keyboard events and operating systems, excluded from the discussion?
-These details were excluded because the creator is more interested in the networking and software engineering aspects of the process, and deemed those low-level details as not relevant to the core focus.
How does the browser decide what to do when you start typing 'google.com'?
-The browser first checks your history for pages starting with 'g', showing an autocomplete list based on that. If nothing relevant is found in history, it may check a locally cached index or send a request to a server, depending on the browser's functionality.
What is HSTS, and why is it important?
-HSTS stands for HTTP Strict Transport Security. It is a web security policy mechanism that helps to protect websites against man-in-the-middle attacks by enforcing secure connections, ensuring that browsers only use HTTPS to communicate with the website.
What determines whether the browser uses HTTP or HTTPS to connect to a website?
-If the URL entered doesn't specify a protocol (HTTP or HTTPS), the browser must decide which to use. As the video describes it, the browser checks its HSTS list: if the site is listed, it connects over HTTPS on port 443; otherwise it falls back to plain HTTP on port 80.
What is DNS, and why is it a crucial step in accessing a website?
-DNS, or Domain Name System, translates human-friendly domain names (like google.com) into IP addresses that computers use to identify each other on the network. It's crucial because it allows users to access websites using easy-to-remember domain names instead of numerical IP addresses.
What role does TCP/IP play in accessing a website?
-TCP/IP (Transmission Control Protocol/Internet Protocol) is fundamental to the operation of the internet, defining how data should be packetized, addressed, transmitted, routed, and received at the destination. A TCP connection is established between the client and the server for reliable communication.
Why is TLS negotiation important after establishing a TCP connection?
-TLS (Transport Layer Security) negotiation is important because it establishes a secure encrypted connection between the client and the server. This ensures that all data transmitted over the connection is secure from eavesdropping and tampering.
What is the significance of HTTP/2 in web communications?
-HTTP/2 improves the efficiency of web communications by allowing multiple simultaneous requests and responses between the client and server over a single TCP connection. This reduces latency, improves page load times, and enhances the overall user experience.
How do browsers decide to use HTTP/1.1, HTTP/2, or HTTP/3 for a connection?
-Browsers and servers negotiate the protocol version to use during the TLS handshake. The choice depends on the protocol versions supported by both the client and the server, with a preference for the most advanced version supported by both for efficiency and security reasons.
Outlines
Introduction to the Intricacies of Typing Google.com
The video begins by outlining the intent to explore the detailed processes that occur when google.com is entered into a browser's address bar. Inspired by Alex's GitHub page, which provides a thorough description of the event, the video aims to delve deeper into the networking aspects while omitting low-level details like keyboard events and operating system mechanics. The focus is on the networking and software engineering facets of what transpires behind the scenes. The narrative is set in either 2019 or 2020, using the latest versions of browsers like Chrome and Firefox, and is premised on the scenario of using a brand new machine and browser that has never visited Google before. The video is structured to break down the process into eight parts, promising a comprehensive walkthrough of each step involved in accessing Google's homepage.
From URL Typing to Protocol Determination
This segment details the initial steps from typing Google.com to determining the protocol and port for connection. It starts with the browser checking the user's history for autocomplete suggestions as soon as the first letter is typed. The video explains how the browser decides whether the input is a URL or a search term, and the subsequent action taken. It then covers the transition to determining the correct protocol (HTTP or HTTPS) and port (80 for HTTP and 443 for HTTPS) to use. The importance of HSTS (HTTP Strict Transport Security) is discussed, explaining how it forces browsers to use HTTPS for sites known to support it, enhancing security and mitigating man-in-the-middle attacks.
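The protocol and port decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any real browser implements it: `HSTS_LIST` is a made-up one-entry stand-in for the browser's cached HSTS list, and `resolve_scheme_and_port` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the browser's locally cached HSTS list.
HSTS_LIST = {"google.com"}

# Well-known default ports for the two web schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def resolve_scheme_and_port(typed):
    """Return (scheme, host, port) for the text typed in the address bar."""
    # Prefix "//" so urlsplit treats a bare "google.com" as a host, not a path.
    text = typed if "://" in typed else "//" + typed
    parts = urlsplit(text)
    host = (parts.hostname or "").lower()
    scheme = parts.scheme.lower()

    if host in HSTS_LIST:
        # A listed site is always contacted over HTTPS.
        scheme = "https"
    elif not scheme:
        # No scheme typed and not on the list: fall back to plain HTTP.
        scheme = "http"

    port = parts.port or DEFAULT_PORTS[scheme]
    return scheme, host, port


print(resolve_scheme_and_port("google.com"))
print(resolve_scheme_and_port("example.org"))
```

The sketch only handles the `http` and `https` schemes; anything else would need its own default port entry.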
DNS Lookup and Establishing a Connection
This paragraph delves into the complexities of the DNS lookup process and establishing a connection to Google.com. It highlights the role of DNS in translating domain names into IP addresses, explaining the layered approach from browser cache to system files and potentially encrypted DNS queries. The discussion extends to the technicalities of DNS queries, including the use of UDP and the debates surrounding DNS over HTTPS (DoH) versus DNS over TLS (DoT). This step is crucial for initiating a connection by identifying Google's IP address, setting the stage for the subsequent TCP connection establishment and data transmission phases.
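The layered lookup order described here (browser cache, then the hosts file, then a DNS query) might be sketched like this. The dictionaries and `fake_resolver` are illustrative stand-ins, and `4.1.2.3` is the example address used in the video rather than a real Google address; a real program could pass a wrapper around `socket.getaddrinfo` as the resolver instead.

```python
def lookup(hostname, browser_cache, hosts_file, resolver):
    """Try each layer in order and return (ip, layer_that_answered)."""
    # Layer 1: the browser's own DNS cache.
    if hostname in browser_cache:
        return browser_cache[hostname], "browser cache"

    # Layer 2: static mappings from the operating system's hosts file.
    if hostname in hosts_file:
        ip, layer = hosts_file[hostname], "hosts file"
    else:
        # Layer 3: ask the configured DNS resolver over the network.
        ip, layer = resolver(hostname), "dns resolver"

    # Remember the answer so the next lookup is served from the cache.
    browser_cache[hostname] = ip
    return ip, layer


def fake_resolver(hostname):
    # Hypothetical resolver returning the example address from the video.
    return {"google.com": "4.1.2.3"}[hostname]


cache = {}
print(lookup("google.com", cache, {}, fake_resolver))  # answered by the resolver
print(lookup("google.com", cache, {}, fake_resolver))  # answered by the cache
```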
TCP Connection and Initial HTTPS Handshake
This part explains the intricacies of establishing a TCP connection with Google.com and the initial steps towards an HTTPS handshake. It outlines the necessity of a TCP connection for secure communication, detailing the three-way handshake process and the subsequent encryption negotiation. The discussion covers how the browser and server agree on encryption protocols and keys, emphasizing the significance of this phase in ensuring data privacy and integrity over the network. The section serves as a bridge to the comprehensive explanation of the TLS handshake and secure data exchange that follows.
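As a rough illustration of the three-way handshake, the sketch below models only the flags and the sequence/acknowledgement numbers of the three segments. The `Segment` class and the initial sequence numbers are assumptions made for the example; in practice the operating system performs this exchange when a program calls something like `socket.create_connection`.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    flags: str
    seq: int
    ack: int = 0


def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged to open a TCP connection."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment("SYN", seq=client_isn)
    # 2. Server -> client: SYN-ACK acknowledging client_isn + 1.
    syn_ack = Segment("SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client -> server: ACK acknowledging server_isn + 1.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```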
Advanced Details of TLS Handshake
In this segment, the video provides an in-depth analysis of the TLS handshake process, crucial for establishing a secure connection. It covers the generation and exchange of keys, the negotiation of symmetric encryption for data transfer, and the selection of ciphers. The role of ALPN (Application Layer Protocol Negotiation) and SNI (Server Name Indication) in this process is also highlighted, explaining their importance in determining the specific protocols and domains for the secure session. This detailed exploration showcases the complex mechanisms at play to secure communication between the client and Google.com.
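A minimal sketch of how a client could request TLS 1.3 with SNI and ALPN, using Python's standard `ssl` module. The helper names `make_tls_context` and `negotiate` are made up for this example, and `negotiate` needs network access, so it is defined but not called here.

```python
import socket
import ssl


def make_tls_context():
    """Build a client context that insists on TLS 1.3 and offers h2 via ALPN."""
    ctx = ssl.create_default_context()          # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # ALPN: the application protocols the client is willing to speak, in order.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiate(host, port=443):
    """Open TCP, run the TLS handshake, and report what was agreed."""
    ctx = make_tls_context()
    with socket.create_connection((host, port), timeout=5) as tcp:
        # server_hostname is sent as SNI so the server can pick the right certificate.
        with ctx.wrap_socket(tcp, server_hostname=host) as tls:
            return tls.version(), tls.selected_alpn_protocol()


ctx = make_tls_context()
print(ctx.minimum_version, ctx.check_hostname)
```

Calling `negotiate("google.com")` on a machine with internet access would return the negotiated TLS version and the ALPN protocol the server selected.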
Sending the HTTP Request and Receiving the Response
This paragraph focuses on the final steps of sending the HTTP GET request to Google.com and processing the received response. It explains how, after establishing a secure TCP connection and negotiating TLS, the browser sends an encrypted HTTP request to the server. The server then responds with the requested web page, which is decrypted by the client. This section also touches on the use of HTTP/2 for efficient communication, describing how it allows multiple requests and responses to be multiplexed over a single connection. The explanation underscores the complexity and efficiency of modern web communication protocols.
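To make the request and response concrete, here is a sketch of the bytes involved, shown in the textual HTTP/1.1 format because it is easy to read. HTTP/2 carries the same information in binary frames. The helper names and the sample response are assumptions made for illustration.

```python
def build_get_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # required in HTTP/1.1
        "Accept: text/html",
        "Connection: close",
        "",                        # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Split the first line of a raw response into (version, code, reason)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("google.com")
print(request.decode("ascii"))

sample_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(parse_status_line(sample_response))
```

Over HTTPS these bytes are never sent in the clear; they are written to the TLS socket, which encrypts them before they reach the TCP connection.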
Advanced HTTP/2 Features and Security Considerations
This segment elaborates on the nuances of HTTP/2, including server push capabilities and potential security implications. It discusses how servers can preemptively send resources to clients, potentially improving load times. Additionally, it addresses security features like content type headers to prevent MIME sniffing attacks, illustrating how browsers and servers work together to ensure data integrity and privacy. The detailed overview of HTTP/2's features and the emphasis on security measures highlight the ongoing evolution of web standards to enhance performance and safeguard user data.
Comprehensive Breakdown of Web Page Rendering
The final part of the video script offers a comprehensive breakdown of the web page rendering process once the data is received from Google.com. It delves into how the browser interprets the content type of the received data, parsing HTML, executing JavaScript, and rendering images based on the MIME types. The discussion also revisits the potential for HTTP/2 server push to optimize resource loading and the complexities involved in secure, efficient web communication. This conclusive segment underscores the intricate web of processes that work in concert to display a web page, from initial request to final rendering, highlighting the complexity behind seemingly simple user actions like typing Google.com into a browser.
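How a client might choose what to do with a response based on its declared content type can be sketched as below. The `HANDLERS` table and its action strings are invented for illustration; real browsers have far more elaborate rules.

```python
from email.message import Message

# Hypothetical mapping from MIME type to what the browser does with the body.
HANDLERS = {
    "text/html": "parse as HTML",
    "text/css": "apply as stylesheet",
    "application/javascript": "execute as script",
    "image/png": "decode as image",
}


def media_type(content_type_header):
    """Extract the bare MIME type, dropping parameters such as charset."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def choose_handler(headers):
    """Pick an action from the declared Content-Type, without sniffing the body."""
    declared = media_type(headers.get("Content-Type", "application/octet-stream"))
    return HANDLERS.get(declared, "offer as download")


print(choose_handler({"Content-Type": "text/html; charset=UTF-8"}))
print(choose_handler({"Content-Type": "image/png"}))
print(choose_handler({}))
```

Trusting only the declared type, as this sketch does, mirrors the behaviour a server asks for when it sends `X-Content-Type-Options: nosniff`.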
Keywords
DNS (Domain Name System)
HTTPS (Hypertext Transfer Protocol Secure)
HSTS (HTTP Strict Transport Security)
TCP/IP (Transmission Control Protocol/Internet Protocol)
IP Address
TLS (Transport Layer Security)
HTTP/2
SNI (Server Name Indication)
NAT (Network Address Translation)
SSL Stripping
Highlights
Introduction to the intricate process that unfolds when typing google.com in the browser.
Differentiation between history lookup and autocomplete in the browser.
Explanation of how the browser decides whether input is a URL or search term.
Discussion on protocol determination, emphasizing HTTP vs HTTPS.
Introduction of HSTS to enforce secure connections.
Deep dive into DNS resolution and caching mechanisms.
Exploration of DNS over HTTPS and DNS over TLS controversies.
Explanation of UDP packet handling for DNS queries.
Detailing the process of MAC address resolution and IP packet forwarding.
TCP connection establishment through a three-way handshake.
Discussion on the intricacies of TLS and secure communication setup.
Presentation of HTTP/2 and its benefits for optimizing web traffic.
Handling of HTTP GET requests and server-client communication.
Details on the rendering and parsing process in the browser.
Consideration of HTTP/2 server push and its impact on web performance.
Transcripts
in this video i want to go through what
really happens under the hood when you
type google.com
and you hit enter in your browser this
video is inspired by alex's github page
below i'm going to reference it below it
has a great detailed description of what
really happens when you do that thing
right i did however add more details
like the networking aspects of things
and i also removed stuff like
keyboard events and low-level operating
systems that i i don't really care about
i i'm really interested in the
networking aspect and the software
engineering aspect of it all right so if
you're interested to know what really
happened when you type google.com and
hit enter stay tuned if you're new here
welcome my name is hussain and in this
channel we discuss all sorts of software
engineering by example so if you want to
become a better software engineer
consider subscribing and hit that bell
icon so you get notified every time i
upload a new software engineering video
with that said let's just jump
into this video
all right i'm gonna break up this video
into eight parts eight components to
talk through right and we're gonna go
through each component one by
one and i am assuming that i'm hitting
google.com it is 2020 or 2019 latest so
it's i'm using the latest chrome version
the latest firefox version
i'm not going to specify which browser
i'm going to use because i'm going to
talk i want to talk through different
browsers technologies and
that is the most important thing this is
a brand new machine assuming this is a
brand new machine this is a brand new
browser i never visited google
ever i this is a brand new machine
nobody ever opened the browser so
google.com will be the first page i ever
visit okay so that's that's the caveat i
want it i want to go that's the context
i want to go through okay that's there
first step initial typing you start
typing g
o o g l e dot com you start typing and
the first letter you type g
what happened is
many things
the browser will either start looking
for your history and pages that start
with the letter g
in your recent visited history and start
showing you an autocomplete list
or
some browser will actually do a search
to an index that is local through this
locally searched index that is cached
some browsers might actually send the
request to a server to this default
search engine
baked into the browser right i'm not
going to go through any of those i'm
going to go through the first step where
you're listing the visited
the history of the pages that you
visited okay let's assume that so you're
getting a list of the visited pages which
is nothing because you never visited any
pages okay so all right so
that's the first step
you finished typing google.com second
step google.com has finished typing in
and you're about to hit enter you didn't
add any http slash you didn't add
anything you just type google.com and
hit enter so what the browser does is now it
accepted that that's what alex starts
explaining it's like okay there's a
keyboard event that you're listening to
i don't want to go through this i want
to go through actual networking aspects
and the software engineering high level
stuff right so you hit that now you have
google.com as a string
the browser will start parsing this thing
and it asks a question
is this a url or is this a search term
all right if it's a search term it
actually does a search and i'm not gonna
go through that okay if it's a url it
visits that page all right it starts the
process to visit the google.com page
okay and we're going through this route
okay we're going to google.com it's a
page i figured out it's a page it's a
website so i want to establish a
connection with that website and i want
to send a get request to that website so
that's the next thing we need to do okay
so step two done okay we know it's a url
we know it's a page
let's go ahead and visit it
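The "is this a url or a search term" question from step two could be approximated with a heuristic like the one below. It is a deliberately simplified sketch; `classify_input` and its rules are assumptions for illustration, and real browsers use more elaborate logic.

```python
import re

# Something like "google.com": dot-separated labels ending in a letters-only TLD.
HOSTNAME_RE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def classify_input(text):
    """Return "url" or "search" for the text typed into the address bar."""
    text = text.strip()
    if " " in text:
        return "search"              # hostnames never contain spaces
    if "://" in text:
        return "url"                 # an explicit scheme was typed
    # Strip any path and port before testing the host part.
    host = text.split("/", 1)[0].split(":", 1)[0]
    if host == "localhost" or HOSTNAME_RE.match(host):
        return "url"
    return "search"


for typed in ["google.com", "how do browsers work", "google", "https://google.com"]:
    print(typed, "->", classify_input(typed))
```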
third step
determining which protocol
and which port to connect to right
why do we need to know which protocol
well we know it's a page so it's either
http or https so that's the trick is it
http unencrypted port 80 or is it https
encrypted on port 443
because
the user didn't tell us it only he only
or she only told us google.com it didn't
tell us http colon slash slash that
would be easier for the browser right or
it didn't say https colon slash slash
google.com it says just google.com so
the browser has to figure out what's the
protocol okay and by default
prior to certain version
browsers were always going to uh
http let's always assume that it's http
which is unencrypted that causes a lot
of man in the middle attacks
and we made a video about
ssl stripping and hsts i want you to go
and check that video out to learn more
about why it is bad
for the user to visit a website as plain
http it's so bad right
even if the web server actually
supports https right so
the browsers invented a concept
called hsts and we made a video about
that i'm going to reference it below go
ahead and check it out but hsts stands
for http strict transport security and
it's essentially a list that the
browsers keep
cached in it's
in a local database and it has the most
famous
web pages that forces
users or clients to communicate only
through https
so what does what the client does is it
looks through this list and says hey is
google.com an https site or is it just
normal http if it found that in the hsts
list then
it uses the https protocol
that means the port will be 443 okay if
it if it's not in the list then it will
be forced to use http which is unsecure
which which means that the port is 80.
okay so that is essentially the step so
i know the protocol now let's assume we
went through the https part okay which
is port 443 and i know it's secure so
now i will only establish a secure
communication to the google.com first
right before i actually establish
i do anything i need to establish a
communication and i mean i'm gonna have
to start adding a lot of ifs here right
if google.com was not in the hsts list
then the protocol will be http then the
port will be 80 then the tcp connection
will go through the 80 port which is a
completely different connection okay
we're going through https let's jump
into it step four
dns the most complicated step here okay
here's the thing
dns domain name server okay or systems
i know google.com
i know the port i know the protocol the
port is 443
the protocol https but i don't know the
ip i need to know the ip address in
order to communicate with google.com
right because that's how how tcp works
right everything's through tcp the
network layer i need to know the ip
address and you know the ip address i know i
need to know even something lower than
that called the mac address which we're
going to talk about in a minute
so how do i know the ip address of the
google.com i ask a dns query and here
are the layers of dns right first thing
the browser will check okay google.com
do i have an ip address for google.com
ever right
it's a it's it's in its own cache it has
its own cache of this local dns right
every browser have that it says hey did
i ever visit google.com well no if it
did we're gonna pull up the ip address from
its cache which is very quick
if it doesn't right which which is our
case because we never opened any page
before right it's going to move to the
next thing it's going to ask the
operating system hey
os
do you know this google.com thingy ever
and uh i don't know if you ever hacked
a windows machine back in the 90s
there's a hosts file we always used to
play with that file and that's
essentially a mapping between a host and
its ip you can hard code that list in a
hosts file and we used to do it all
the time we can we can we can fix the
fix an ip address for a given especially
when we did online gaming back in the 90s we
want to force an ip address that is
highly available we were doing all this
goofy stuff back then all right so
the host file it looks through the host
file is google.com in the host file is
there an ip address associated with it
well obviously we don't have anything in
the host file so it jumps and here's the
thing
there is something new which there's a
and there are a lot of drama
people talking about is called
dns over https and there's another thing
called dns over tls okay there's a lot
of drama controversy around this stuff
right some people wants one over the
other here's the thing about dns guys
dns if you don't know is a udp
service uh listening on port 53
okay
and it's unencrypted so anyone can know
which
domains you're going anyone on the
internet if you're using dns right
people know that you're going through
dns okay well
there's a question mark there but sure
okay so dns requests
are visible to your isp so all your isp
your work actually know which page
you're going to you're going to
facebook.com you're going to google.com
but they cannot know these days 2019
2020 they they cannot see what you're
searching for right let's be honest
unless they're using a terminating proxy
a tls proxy terminating proxy that
if they are not
then they cannot see anything except
this thing and people are starting to
solve this problem the dns encrypted
versus unencrypted so how do they solve
it
two technologies were involved
dns over tls so let's establish a tls
connection and do dns over that or let's
do dns over https because
it's just we know https
we can use http 2 because beautiful
bi-directional streaming technology and
we can stream over that okay so we we
can use the existing tech why do we have
to create a custom port for dns right
and there's a fight between networking
admins and and and the web
security gurus right
and i kind of leaned towards doh to be
honest
the the admin guys want to know to not
to monitor but because they can't but
they want to see dns requests
they want to differentiate dns requests
from regular network web traffics right
okay and if you're using doh you cannot
do that right you just hide all the dns
requests will become
normal stuff right so long story short
doh right if the browser supports doh
which is dns over https
it will
do that through the doh right the dns
is going to do the dns over https so it
will see what is your default https dns
provider maybe cloudflare maybe google
and it will establish a tls connection
that's a different thing i'm not going
to talk about it right and it's going to
do the dns over there
let's assume it's disabled which is as
of 2019 december
27 today right 27 december 27th 2019
this thing is disabled by default right
a lot of problems right so it's still
controversy right so it's disabled so
let's assume it's disabled on my browser
so i'm not going to do a encrypted dns
so people will see my request so the
final step is to actually do a dns so
what do we do is we're going to do a dns
to find out my ip address you see how
complicated this thing is guys right i
hope you're still watching this video
because
it is is a long process i'm just talking
about and i'm skipping through so much
stuff okay so if i'm connecting to a
if i'm connecting
if i want to know what the google.com ip address
is i'm going to establish a udp there's
no udp connection by the way it's just
i'm going to send a udp
datagram user datagram to
the my default dns provider which
usually is configured on my router which
is usually provided
by my isp which is in this case
frontier i did change mine to 1.1.1.1
which is the cloudflare default
dns
right
okay
so my dns is one one one one okay
1.1.1.1 or maybe yours could be google
so 8.8.8.8
okay so you have to know the ip address
of the dns because you want to send a
packet so what do you do
you send a packet right so
let's go through that okay let's go
through the details of how do we send a
a packet
to
1.1.1.1 on port 53. okay
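What actually goes inside that UDP datagram is a small binary DNS message. The sketch below builds a standard A-record query by hand. The function name and the fixed query id `0x1234` are assumptions for the example; a real resolver picks a random id.

```python
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build the bytes of a DNS query asking for the A record of hostname."""
    # 12-byte header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN, the internet class).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


query = build_dns_query("google.com")
print(len(query), query.hex())
```

Sending it is then a single call on a UDP socket, for example `socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(query, ("1.1.1.1", 53))`, with no connection setup beforehand.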
so i am a client right
and let's assume my machine here that's
the first communication with the outside
world here guys right
let's assume my ip address is 10.0.0.2
and my gateway which is the router is
10.0.0.1
and the dns provider that i want to
communicate with is
1.1.1.1 okay and my mac address is aaa
and
my router mac address is ff right and
that's
all what we need to know so far okay i
want to send a udp request what do we do
we
create an ip packet okay and the ip
packet will have in its layer three will
have the destination ip address saying
1.1.1.1 okay
and it will have the port 53 and the
source
ip will be 10.0.0.2 which is me i am the
client and the port the source port will
be a random port let's say three three
three three okay random okay
so now
what we do is
before we send that packet we need to
encapsulate into a frame okay and the
frame is a layer two thingy okay which
needs a mac address right what the heck
is the mac address for 1.1.1.1 so we
asked ourselves this question and he
says well
1.1.1.1 is not in my subnet which is
10.0.0.x because i my subnet mask does not
fit this thing right so since it's not
in my subnet i cannot send it locally so
i cannot know its mac address so who the
heck knows the mac address of this thing
i don't know right the answer to that is
the gateway okay if you don't know where
to send it you always send it to the
gateway and my gateway is
10.0.0.1 which is my router right
usually usually it's my router and and i
have like just a plain router in this
case okay sweet
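The "is it in my subnet, otherwise send it to the gateway" decision can be sketched with Python's `ipaddress` module. The /24 prefix length is an assumption, since the video never states the subnet mask, and `next_hop` is a hypothetical helper name.

```python
import ipaddress


def next_hop(src_ip, prefix_len, dst_ip, gateway_ip):
    """Return the ip whose mac address the frame should be addressed to."""
    # The local network is derived from our own address and subnet mask.
    network = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in network:
        return dst_ip       # same subnet: deliver directly (arp for dst_ip)
    return gateway_ip       # different subnet: hand the frame to the gateway


# Client 10.0.0.2 with an assumed /24 mask and gateway 10.0.0.1.
print(next_hop("10.0.0.2", 24, "1.1.1.1", "10.0.0.1"))
print(next_hop("10.0.0.2", 24, "10.0.0.7", "10.0.0.1"))
```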
right so
i know that my router mac address is ff
so i'm going to send it to my router
my source is a a mac address and i send
it to the router
the router will receive the packet right
and says okay you want to i received the
packet the frame right it's ff but i
look at it and you is
it looks like the client want to go to
1.1.1.1
okay so what do we do how do we send it
to 1.1.1.1 i'm going to take care of this
i'm going to go through it and do it
exactly the same process is this in my
subnet right
but
i need to do some changing first i'm
going to do a nat because i cannot send
this packet on the internet naked like
that because who the heck knows what the
source
ip 10.0.0.2 is because that's my that's
an internal thing so we need to change
it to the public ip of the router which
is i forgot to say but it's 44.1.2.4
so i'm going to change that thing and
i'm going to send it through the wire
and then and then i'm going to use the
same port 3333 and i'm going to add a
nat table this thing network address
translation because i need to remember
it's because it's a very stateful thing
right the whole thing i'm going to add
an entry in my nat table saying that hey
10.0.0.2 on port 3333 is actually going to
one one one one on port 53 and it's
going and i converted it to my public ip
so whenever we give back a response
we're gonna forward this to swizzle it
back and send it back to the client
because that's what we do so we send it
over we communicate with the one one one
one and we get a response okay
we get back a response saying hey
what is google.com
right is google.com we received the ip
address from google.com and it is
4.1.2.3 and we receive a response and the
1.1.1.1 server will actually reply to my
public router
saying that hey this is my response
4.1.2.3 is the answer you're looking
for destined to forty four one two four
on port three three three three because
that's the port i am looking for again
the dns doesn't know my client which is
ten zero zero two it knows only the
router the router receives it and says
oh port 3333 oh yeah i know where this is
going this is supposed to go to ip
address 10.0.0.2 because i looked up the nat
table and then it goes back and then
goes to that
and the router
just forwards back the packets right and
it does maybe another arp request and it
sends the information back to the client
okay
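The router's nat bookkeeping from this step can be sketched as a tiny table keyed by the public-side port. This is a simplification made for illustration: the `NatRouter` class is hypothetical, it keeps the same port number on both sides as in the video, and real nat devices track more fields and may rewrite the port too.

```python
class NatRouter:
    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}  # public port -> (private ip, private port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the private source address and remember the mapping."""
        self.table[src_port] = (src_ip, src_port)
        return (self.public_ip, src_port, dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Look up the mapping and rewrite the destination back to the client."""
        private_ip, private_port = self.table[dst_port]
        return (src_ip, src_port, private_ip, private_port)


router = NatRouter("44.1.2.4")
# Client 10.0.0.2:3333 sends the dns query to 1.1.1.1:53.
print(router.outbound("10.0.0.2", 3333, "1.1.1.1", 53))
# The reply arrives addressed to the router's public ip on port 3333.
print(router.inbound("1.1.1.1", 53, "44.1.2.4", 3333))
```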
now
i know
the ip address
of google.com
how long was that okay that was a long
time all right all right dns done step
four done
now next tcp connection the most
interesting part tcp what do we do with
the tcp guys the tcp connection is
to establish a tcp connection
unlike the udp as there's no it's a
connectionless system i know we made a
video between tcp and udp i'm going to
reference it here but tcp is a
connection-oriented system so there's a three-way
handshake that happens and i'm not going
to go through details about this but if
i establish a tcp connection
i need to tell you the ip address where
i'm going which i know now
right it is
4.1.2.3 that's the ip address of google
okay so 4.1.2.3 which port i'm going to go
to port 443 because i want to go
securely https
what's my internal ip address it is
10.0.0.2 okay and what's my internal
random port number that's a
different port because the 3333 was
reserved for something else i'm gonna
use
two two two two okay two two two two
four twos
send it okay so
again do the same thing right is 4.1.2.3
in my subnet no it's not so i
cannot send it directly right
i cannot do an arp request on this
address resolution protocol so what do i
do
i need
to send it to who the gateway what's my
gateway mac address it's it's 10.0.0.1
which i did an arp before and i found out
it's an ff so i know the mac address of
this thing and i'm going to send that
packet to my router instead my router
receives that thing and it looks at it
and says yeah you want to go to 4.1.2.3
on port 443
and you are 10.0.0.2 i'm sorry i cannot send
you naked like that i need to change
your source ip address to mine which is
public i know how to talk to the
internet it's very dangerous to go out
there like that so i'm going to change
you to 44.1.2.4 which is my public ip
address
and i send
that information and then the port the
internal port is 2222 so i'm sending it
to 4.1.2.3 and then
we send it over okay
now that's just the one single tcp
connection establishment
the reverse comes back again
right
and then we establish a tcp connection
so let's assume this happened right so
the three-way handshake happened now
we have a full
tcp connection between a client and
google.com which is four one two three
okay and there's a nat table in the
router telling that hey four four three
four one two three
public ip is four four one two four
which is me on port two two two two is
actually
ten zero zero two which is that client
ip address okay now we have a tcp
connection
we did arp we did an arp we did a nat
which is a network address translation
there's a thing that can happen here
right let's let's throw a monkey wrench
what if my client has a proxy in it if
it if that client has a proxy
what type of proxy is it a socks proxy is
this an https proxy is this an http
proxy okay
if it's an http proxy nothing changes
because i'm using https still
communicating google directly if i'm
using https proxy then the destination
will be the ip address of the proxy and
instead of the ip address of google.com
okay
i'm not gonna go through that path
because that will take me another hour
to explain okay made a lot of videos
about proxies check them out guys
let's throw another monkey wrench let's
assume we're communicating through http
1.1 which is unsecure
which which we are not by the way but
let's assume right so since we assume
since we established one tcp connection
if we already communicated with http 1.1
then we the browser might actually
establish five other tcp connections
because
this is how browsers does pipelining
again something not we're not going to
talk about this so the browser can send
multiple requests at the same time to
multiple tcp connections instead of
waiting right i talked about that in the
http videos go check them out cool
enough monkey wrenches jump to the next
step we have a tcp connection what's
next i still didn't send a single byte
of data yet guys right i have a tcp
connection that's bi-directional between my
client
and the google.com i have it it's nice
it's just swizzled between
many routers like there's like a lot of
nat tables and routers and changing
everything is a stateful thing between
me and the google.com
tls
here's the interesting part the next
step after the tcp connection is
immediately we're going to establish the
tls
connection which is the encryption which
is transport layer security and i made a
video about tls i'm going to
reference it here if you want to know
the details of it
but here's in a nutshell i'm assuming
that my browser is the latest it's 2020
almost so
i'm using tls 1.3 it will be
embarrassing if google.com doesn't
support tls 1.3
which i'm pretty sure they do okay so
they do even my
site supports tls 1.3 for god's sake
okay so
i'm assuming version 1.3 so
let's just say so
this is the latest stuff it's a single
round trip to do everything let's go
through it
okay so version is 1.3 so i'm going to
send the first thing i'm going to send
is
yo
client hello to do the client hello that
first request after the tcp established
is here's the things
i'm going to establish a public key and
a private key
right in my client and i'm going to
merge them
because i'm going to do a diffie-hellman
i'm going to merge these keys through
magic mathematics all right
these two numbers that i just generated
the huge prime numbers when merged they
cannot be broken they can be
merged right but it's very difficult
to break them okay
that's the first information that we
need to send okay
the second information we need to send
is the public key itself that we
generated okay so we send public key and
we send the merged information of the
two and we send it but before we send it
we also send some information says hey
server
we're doing this handshake so we can
agree on a symmetric key to encrypt our
stuff right in order to encrypt our
stuff right what do we do
we need to agree on a symmetric key okay
in order to agree on a symmetric key we
need to establish this symmetric key so that's
why i'm doing all that stuff i'm going
to send you this merged
keys
and i'm going to send you the public key
which even if someone sniffed the public
key it's public anyway who cares even if
someone sniffed the merged key they
cannot get anything over because it's
extremely difficult to break those two
numbers okay it's like there's a magic
mathematics that i don't understand okay
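
The "magic mathematics" can be sketched with a toy Diffie-Hellman exchange. This is an illustration under assumptions: it uses classic modular-exponent Diffie-Hellman with a small hypothetical prime `P` and generator `G`, whereas TLS 1.3 clients typically use an elliptic-curve group such as X25519. The function names are made up for this example.

```python
import secrets

# Toy public parameters, assumed for illustration only.
P = 2**127 - 1  # a Mersenne prime, far too weak for real-world use
G = 3


def make_keypair():
    """Pick a random private number and compute the matching public value."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    public = pow(G, private, P)             # this is what goes on the wire
    return private, public


def shared_secret(my_private, their_public):
    """Combine my private number with the other side's public value."""
    return pow(their_public, my_private, P)


client_private, client_public = make_keypair()
server_private, server_public = make_keypair()

# Only the public values are exchanged, yet both sides reach the same secret.
client_view = shared_secret(client_private, server_public)
server_view = shared_secret(server_private, client_public)
```

Someone sniffing `client_public` and `server_public` would have to recover a private exponent to compute the same value, which is the hard problem the exchange relies on.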
and i'm gonna also tell you what ciphers
i support for this symmetric
encryption that we can agree on i
support aes i support des hopefully not
okay i support blowfish i don't know
what other symmetric
ciphers are there there's a lot of fancy
stuff
okay aes 256 maybe maybe more than that
okay and then before
i send that more information do i
support
uh alpn which is the application layer
protocol negotiation do i support
server name indication okay which is
things we talked about before in this
channel okay
and why do we need the
application layer protocol
negotiation because we are cool because
alpn is the best protocol out
there okay
it
negotiates it tells the client and the
server that hey by the way i'm
about to communicate with you over https but
i also support http 2
and i might even support http 3 right
in case of chrome i don't want to throw
another monkey wrench but chrome
communicates with google
in
quic which is the future http 3 in
the future but i'm not gonna let's not
go there yet okay let's assume i want to
support http 2. so in the same client
hello i'm going to tell you that hey i
support http 2 these are the ciphers
here's my public keys and private keys
and all this stuff and here's the sni
the server name indication because
you
might be a public ip address serving
hundreds of domains right i need to tell
you which
domain i'm actually
communicating with okay and i'm okay
with google.com because your public
ip address which is what 4.1.2.3
might serve
gmail.com or might serve
mail.google.com isn't that the same
thing i think it's the same thing so i'm
telling you the same thing
put google.com that's my host name
that's the sni okay
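
A hedged sketch of how a client could ask for the pieces mentioned above (TLS 1.3, ALPN offering http 2, and SNI) using Python's standard `ssl` module. The function names are hypothetical, and `negotiate` needs network access, so it is defined here but not called.

```python
import socket
import ssl


def build_client_context():
    """TLS client settings: certificate checks on, TLS 1.3 only, ALPN offered."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    # ALPN: tell the server we can speak http 2, falling back to http 1.1.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiate(host="google.com", port=443):
    """Open TCP, run the TLS handshake, report what was agreed."""
    ctx = build_client_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # server_hostname is what ends up in the SNI field of the client hello.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.selected_alpn_protocol()
```

Calling `negotiate()` against a real server returns the TLS version and the application protocol the server picked from the ALPN list.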
send it over http i think that's
the whole thing right quic all that
jazz firefox will only communicate i
think with uh h2 right i might be wrong
but if chrome it might actually
communicate with google.com in its
specific quic protocol which is uh i
think it stands for quick udp internet connections
something like that i forgot what it
means
right
but that's the future http 3 which is
basically
in a nutshell
the http 2 protocol
but
in a udp
connectionless thingy right
so powerful stuff because with tcp
what we're doing like tcp there's always
a handshake a three-way handshake
and it's very expensive to do
right so that's why they want to
minimize these
round trips as much as possible okay
all right we sent the client hello oh my
god we're still in the client hello guys
yeah we the server
right the client hello will be packed
into an ip packet destined to four
one two three port 443 source port is
what was the uh port port two two two
two and the source ip
is ten zero zero two do i do an arp
uh i need to send it to the router
because four one two three is not there
i'm gonna send it to the router do a nat
change it i cannot let you go out there
naked let me change your
public address to four four
one two four and then send
it over and we receive finally
google.com receives the client hello
check that generates the public no
generates the
private its private key
merges it with that merged key so we get
three keys and that three keys the
public
right and the private and the private
makes the symmetric key
or makes an input
called the hash or whatever it's called
secret right that will go
to
uh to the
decided cipher right so they said okay
let's use you support aes you support
blowfish you support all that jazz
symmetric algorithm let's pick aes 256
right i might be wrong i don't know
what's the actual name i'm not a
security engineer right i'm a
software engineer right so we picked
that best algorithm ever hopefully we
didn't get a downgrade attack in
the middle right so
i'm gonna use
that input from the three keys to
generate the symmetric key for this aes
encryption algorithm and then
i'm gonna send my server hello
send its certificate because now i
know
which host do you want to connect to
from the sni right the server name
indication so now i know that you want
that certificate for gmail.com or that
certificate for google.com or that
certificate for
i don't know lively.com
lively was a
was it was a site for google back in
early 2000 i remember
okay
i don't know if you guys remember or
google plus right so now i know i'm
gonna serve you the exact certificate
that you actually asked for
server back right here's the
here's my private key merged with the
public key because nobody can break it
send it over
here is my
certificate here is other stuff as well
okay
send it back
to the router because that's the public
ip address that people see which is four
four one two four send it back router
does a nat change it back to ten zero
zero two send it back the server hello
the client receives it and now has
the two keys one from the server
and one the public there and it has its
own obviously merge the three together
and then generates the input which now
it knows the agreed upon cipher was
aes take the aes and then
what it does is it generates
the symmetric key now both guys have the
symmetric key they can now encrypt right
and they can live happily ever
after
whoa okay
we have finally
the encryption mechanism everyone
can now start sending data because both
have the symmetric key they can encrypt
and decrypt with the same key because
that's the fastest thing ever guys okay
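
A rough sketch of how both sides could turn the shared secret into the same symmetric key. It uses HKDF (the extract-then-expand construction from RFC 5869) built from the standard `hmac` and `hashlib` modules. The real TLS 1.3 key schedule is more involved than this, and the `derive_key` helper and its label are hypothetical.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, input_key: bytes) -> bytes:
    """Concentrate the shared secret into a fixed-size pseudorandom key."""
    return hmac.new(salt, input_key, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the pseudorandom key into `length` bytes of key material."""
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_key(shared_secret: bytes, label: bytes) -> bytes:
    """Derive a 32-byte key (the size aes 256 needs) from the shared secret."""
    prk = hkdf_extract(b"\x00" * 32, shared_secret)
    return hkdf_expand(prk, label, 32)


# Both sides hold the same shared secret after the handshake (placeholder bytes here).
secret = b"example shared secret from the key exchange"
client_key = derive_key(secret, b"hypothetical traffic key")
server_key = derive_key(secret, b"hypothetical traffic key")
```

Because the derivation is deterministic, `client_key` and `server_key` match without the key itself ever crossing the network.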
all right
next step is
almost done
we're going to send this first
http request which is a get request
we're going to send a get request
because now
the enter we're still hitting enter guys
all of this happened while this
single
key hit right we're still not done yet
so we're sending a get request
get slash
take that right
add some headers because we've we're
building an http header right it says
hey i'm visiting get slash the host
header is google.com still we need that
information okay
um
and then uh
we might compress these headers content
type if google.com ever had cookies
before it's gonna start
sending those cookies building those
cookies in the browser and sending them
over with the request right
because assuming we're building the
browser that might change
right if you're actually clicking a link
versus visiting a browser that's a
completely different things right okay
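
As a sketch of the request being described, here is the get request written out in the text form of http 1.1, which is easier to read than the binary http 2 encoding. The `build_get_request` helper, the `Accept` header, and the cookie name in the usage line are assumptions for illustration.

```python
def build_get_request(host, path="/", cookies=None):
    """Build a bodiless GET request with a Host header and optional cookies."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: text/html",
    ]
    if cookies:
        # Cookies previously set by this site are sent back in one header.
        cookie_value = "; ".join(f"{name}={value}" for name, value in cookies.items())
        lines.append(f"Cookie: {cookie_value}")
    # A blank line ends the headers; a GET request has no body after it.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


first_visit = build_get_request("google.com")
return_visit = build_get_request("google.com", cookies={"session": "abc123"})
```

On a brand-new machine there are no stored cookies, so the first request looks like `first_visit`.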
so now
made that get request poof
we have the data we have the headers we
have the body the body is just literally
there's no body for get requests anyway
so we're not sending any body we're
sending headers we're sending that stuff
and cookies perhaps and then oof before we
send
it we agreed by the way on http 2.
i forgot to mention that during the
tls handshake server says yo
you cool you want to
because we did alpn right in the same
client hello server agreed to http 2.
okay let's assume i'm using firefox not
not chrome okay
and then i agree to http 2 pure http 2.
so now the client says oh this guy
wants to communicate http 2 i know http 2.
and if you're using chrome you might
have agreed to using http 2
over quic all right or maybe http 3 if
you're watching this three years from
now okay
so now i got this now i have http 2
i got all that stuff right
and now
i'm going to communicate in http 2 so i
build this get request and then
i
have one tcp connection and i need to
convert to http 2. http 2 uses streams so
i'm going to build one stream of data
i'm going to put my headers along and
put my body along i don't have any body
because i'm sending a get request so
it's just a stream with the headers
i'm going to compress it because http 2 is
awesome like that that i can compress i'm
going to make it into a binary format i
have the piece of data i want to send
next i take my symmetric key encryption
which i did from the tls and i encrypt
that piece of data and i send it across
the binary protocol the beautiful http 2
across the tcp connection which is what
put the destination ip address at four
one two three but the destination port
is four four three and do all that jazz
the exact same thing exact same thing
we're not establishing a new tcp connection
it's the same thing we're just going
through the same route maybe the routes
might change in the in the future but we
don't care
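
The "binary format" of http 2 can be sketched by packing a frame header: 24 bits of payload length, 8 bits of frame type, 8 bits of flags, then a reserved bit and a 31-bit stream id. The helper names are hypothetical, and the payload is an opaque placeholder standing in for the compressed headers. No real header compression or encryption happens here.

```python
import struct

FRAME_HEADERS = 0x1     # frame type that carries request headers
FLAG_END_STREAM = 0x1   # nothing more will be sent on this stream (no body)
FLAG_END_HEADERS = 0x4  # the header block is complete in this frame


def pack_frame(frame_type, flags, stream_id, payload):
    """Prefix a payload with the 9-byte http 2 frame header."""
    length = struct.pack(">I", len(payload))[1:]  # keep the low 24 bits
    rest = struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return length + rest + payload


def unpack_frame_header(data):
    """Read length, type, flags and stream id back out of a frame."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


compressed_headers = b"placeholder for the compressed header block"
frame = pack_frame(
    FRAME_HEADERS,
    FLAG_END_STREAM | FLAG_END_HEADERS,  # a get request: headers only, no body
    1,                                   # first stream opened by the client
    compressed_headers,
)
```

The resulting bytes are what would then be encrypted with the symmetric key and written to the existing tcp connection.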
okay so it goes and
goes through that stuff right the established
tcp connection
sweet all right
the
whole packet the stream receives at the
server the server says yo this is this
is get request and now it's up to the
google google might receive that request
and it's a load balancer so it might
switch establish a connection on the
back end if it's a layer 4
if it's a layer 4 load balancer then it
doesn't really establish a tcp
connection it just streams it back to
the destination final back end if it's a
layer seven it actually terminates tls
i'm not gonna go through that there's so
much work there it's gonna take me
another two hours to explain that stuff
so i'm gonna terminate that stuff i'm
gonna receive it it says get slash what
do you want for this slash right are
there any rules are there any
index.html pages let's assume there's a
simple index.html page which has the
google search i don't know how google
works on the backend i've never seen
that
so i'm going to assume there's an
index.html probably not but
yeah let's assume there is something
like that and then we're gonna
start building the headers because the
server now has to send the response for that
request right so it's gonna build the
headers and says hey the content type is
actually html uh uh
yeah i want you to set these cookies
because i want i wanna know you i'm
gonna track you
sorry that's how google works we're
gonna track everybody so yeah i know you
i wanna this is the this is how i track
you this is the cookies please set these
cookies on your on your machine please
and then do all that jazz and then
here's the thing this is the html page i
want to uh this is a streaming page
maybe it has a css link javascript some
what else has other
goofy stuff maybe esi who cares right
and then
take the html that's a body right
and then create a stream for the body
create a stream for headers send it over
the same tcp connection destined to my
public ip address of the router four
four one two four
uh i forgot the port was two two two two
yeah two two two two and then send it
back data
before we send it we compress it because
http 2 is cool like that because we know
how to compress things in http 2.
okay and take that thing
and
we
encrypt it because i have the symmetric
key i forgot the step that actually
we had to decrypt the data
before we actually look at it right and
we can decrypt it because we have the
symmetric key right i keep forgetting
stuff but hopefully you're still
with me guys so i encrypt that stuff
encrypted send it over the network and
once we send it over the network
encrypted nobody can look at it right
and then
it goes to the router router does the nat
reverse of that
send it back to the same machine
my client receives this encrypted
garbage and uses its symmetric key
to unlock it
decrypt and unlock it look at the data
a content type is html and here's the
thing okay if it's html the browser
will automatically start parsing it if
the content type is image
then the browser will start to render
this image if the content type is
something else like css or
javascript the browser will start to
execute that javascript okay that's how
browsers work right there are
some attacks called mime sniffing
where some servers didn't add this
content type before right so they would
just miss adding it because
uh web administrators back in the 90s or
early 2000s they were very lazy because
you have to go manually and tell okay
this is actually a picture oh this
is actually an
html oh
you cannot go only by the
extension
because that's not enough right the
extension because you can
actually send the responses without
extensions it does not really have to be
files on disk for god's sake right
you're sending data back you have to
tell me what kind of data is this
so there was this attack called mime
sniffing and we made a video about it
i'm gonna reference it here go check it
out but
browsers
if they don't see the content type
browsers try to be
too clever by half and what they do is
actually oh there is no content type
well let me look at the body because
from the body i can actually infer
what's the type so they start parsing the
body and say hey this is html let
me execute it this is a jpeg let me show
it and this caused a lot of attacks back
in the days okay now there's
another header x-content-type-options nosniff don't
sniff please or whatever it's called
there's a header that tells the
browser do not sniff
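
A simplified sketch of the decision described above: trust the content type when it is present, guess from the body only when it is missing, and skip the guessing when the response carries `X-Content-Type-Options: nosniff`. The `choose_handler` function and its return strings are hypothetical, header names are assumed to be lowercased already, and real browsers apply far more detailed sniffing rules than this.

```python
def choose_handler(headers, body):
    """Decide what to do with a response based on its declared or guessed type."""
    content_type = headers.get("content-type", "").split(";")[0].strip().lower()
    nosniff = headers.get("x-content-type-options", "").strip().lower() == "nosniff"

    if not content_type and not nosniff:
        # No declared type and sniffing allowed: peek at the first bytes.
        start = body.lstrip().lower()
        if start.startswith((b"<!doctype html", b"<html")):
            content_type = "text/html"
        elif body.startswith(b"\xff\xd8\xff"):  # jpeg files begin with these bytes
            content_type = "image/jpeg"

    if content_type == "text/html":
        return "parse html"
    if content_type.startswith("image/"):
        return "render image"
    if content_type in ("text/javascript", "application/javascript"):
        return "execute script"
    if content_type == "text/css":
        return "apply stylesheet"
    return "do not interpret"


declared = choose_handler({"content-type": "text/html; charset=utf-8"}, b"<html></html>")
sniffed = choose_handler({}, b"<html><script>alert(1)</script></html>")
blocked = choose_handler({"x-content-type-options": "nosniff"}, b"<html></html>")
```

The `sniffed` case is the risky one: content that was never labeled as html still gets parsed as html, which is what the nosniff header turns off.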
it's weird okay anyway let's back back
back back
let's continue all right so we received
that decrypted look at that content type
html yeah let me parse it
before we
reach here
let's add
let's add
let's continue let's continue okay so
html receive it parse it look at it oh
this is html okay let me parse it uh
well there is a javascript file that we
need to download there's a css file
there's a couple of images let's go and
load those so what do we do right so we
turn around and make
additional get requests for those
resources and we're lucky because we're
using http 2
one tcp connection can do the whole
thing for us because each thing
will get its own stream a stream id for image
one another stream for image two
another stream for image three another
stream for css another stream for
javascript and send it in parallel
because we are cool like that okay we're
sending everything in parallel server
receives it and then
start sending back the data and you
get the idea there's an encryption
decryption going around and then we get
every file and then the page gets
rendered for us let's throw some
monkey wrenches guys
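
The one-stream-per-resource idea from the parallel requests just described can be sketched as handing out stream ids. In http 2, streams opened by the client use odd ids, so after the first get on stream 1 the follow-up requests take 3, 5, 7 and so on. The `assign_streams` helper and the resource paths are made up for illustration.

```python
from itertools import count


def assign_streams(paths, first_id=1):
    """Give every request its own client-initiated (odd) stream id."""
    stream_ids = count(first_id, 2)
    return {next(stream_ids): path for path in paths}


# Stream 1 was used for the page itself, so the sub-resources start at 3.
resources = ["/image1.png", "/image2.png", "/image3.png", "/style.css", "/app.js"]
streams = assign_streams(resources, first_id=3)
```

All of these streams share the single tcp connection, and frames from different streams can be interleaved on it.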
let's throw the monkey wrenches
let's assume the first get request that
we sent the server
my server supports http 2 push okay
if the server support http 2 push which
i'm not sure google supports it i'm
pretty sure it does but i'm not sure if
it's activated or not because it has its
own problem
right if it does support it before the
html actually gets sent
right back to the client
the server will determine that
hey by the way
you're gonna need
i'm gonna send you index.html
but you're gonna need this file and this
file and this file and this file anyway
so i'm gonna send you
multiple streams back watch out that's
called http 2 push
http 2 push is essentially like
responses for requests that the client
never made okay so
that could be another path that things
can go through okay and essentially
that's that's how how it's done right
final thing
if
we're using http 1 the same thing will be
exactly the same there will be no
encryption because http 1 doesn't support
encryption wait a second that's wrong
okay yeah
http 1 if it's on https yes it does
support uh it does support encryption
definitely and if we're using
http 1 then that browser will
establish six connections and we'll
start piping those requests into six
connections instead of one so you will
have different internal ports
essentially in your router all right
guys
whoo that was a long video okay and
that's how essentially what happens when
you type google.com and hit enter hope
you enjoyed this video guys right it was
very short i know i know i guys yeah
okay i'm pretty sure i missed a lot of
things i'd love for all of you to type
in the comment section below to let me
know what i missed and what did i say
wrong if i said anything wrong because i
want to become a better software
engineer that's my goal right and i want
to become better and uh appreciate
everything you guys stay awesome see you
on the next one