API Gateway and Microservices Architecture | How Does an API Gateway Act as a Single Entry Point?
Summary
TL;DR: The video discusses the critical role of API Gateways in handling high volumes of requests and their differences from load balancers. It explains how API Gateways route requests to appropriate microservices, perform API composition, authentication, and rate limiting. The script also covers service discovery, request/response transformation, and caching. It further elaborates on how API Gateways scale across multiple regions and availability zones, ensuring no single point of failure, and how DNS-based load balancing helps distribute traffic efficiently.
Takeaways
- 🚪 **API Gateway as a Single Entry Point**: API Gateway acts as the single entry point for all client API requests, routing them to the correct backend service based on the API endpoint.
- 🔄 **Difference from Load Balancer**: Unlike load balancers that distribute traffic across instances of the same microservice, API Gateway understands and routes API requests to different microservices based on the endpoint.
- 🤖 **API Composition**: API Gateway can compose responses by calling multiple microservices and aggregating the data, reducing the complexity on the client side.
- 🔐 **Authentication**: API Gateway provides authentication services, integrating with authorization servers to validate client tokens before allowing access to microservices.
- 🚦 **Rate Limiting and Throttling**: API Gateway can enforce rate limits and throttling to manage traffic and prevent abuse, ensuring fair usage and system stability.
- 🔍 **Service Discovery**: It interacts with service discovery systems to find the correct microservice instances to route requests to, as microservice locations can change due to scaling.
- 🔄 **Request/Response Transformation**: API Gateway can transform incoming requests and outgoing responses to fit the needs of the system, including caching responses for efficiency.
- 🌐 **Scalability Across Regions and AZs**: API Gateway scales across multiple regions and availability zones to handle high traffic and provide low latency, ensuring high availability.
- 🌐 **DNS-Based Load Balancing**: At the highest level, DNS-based load balancers distribute traffic to the appropriate regional API Gateways, considering factors like latency and compliance.
- 🛡️ **No Single Point of Failure**: The system design ensures that there are no single points of failure, with traffic able to be rerouted in case of outages in availability zones or regions.
Q & A
What is an API Gateway and how does it differ from a load balancer?
-An API Gateway is a single entry point that accepts client API requests and routes them to the correct backend service based on the API endpoint. Unlike a load balancer, which distributes traffic to multiple instances of a microservice, an API Gateway understands the API structure and makes routing decisions accordingly.
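The endpoint-based routing decision described above can be sketched as a simple prefix table. The paths and service names below are illustrative, not taken from the video:

```python
# Hypothetical route table: request-path prefix -> backend service.
ROUTES = {
    "/api/invoice": "invoice-service",
    "/api/order": "order-service",
    "/api/sales": "sales-service",
}

def route(path: str) -> str:
    """Pick the backend service whose prefix matches the request path."""
    for prefix, service in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return service
    raise LookupError(f"no route for {path}")
```

A load balancer, by contrast, never inspects the path at all; it only picks one instance of a single, already-chosen service.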
What is API composition and why is it important?
-API composition is a feature of API Gateways that simplifies client requests by allowing the Gateway to call multiple microservices and aggregate the results into a single response. This reduces the complexity on the client-side and is frequently used in services like Netflix to tailor API responses based on the device type.
How does the authentication feature of an API Gateway work?
-The API Gateway can authenticate clients by integrating with an authorization server. Clients pass an access token obtained from the authorization server with their requests. The API Gateway validates this token, and if it's valid, the request is allowed to proceed to the microservices.
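A minimal sketch of that gateway-side check, with the authorization server's token introspection stubbed out as a lookup table (the token values and the `introspect` helper are hypothetical):

```python
# Stand-in for the authorization server's token store (illustrative only).
VALID_TOKENS = {"tok-abc": "client-1"}

def introspect(token: str):
    """Ask the 'authorization server' whether the token is active."""
    return VALID_TOKENS.get(token)

def handle(headers: dict):
    """Validate the bearer token before forwarding to any microservice."""
    token = headers.get("Authorization", "").removeprefix("Bearer ").strip()
    client = introspect(token)
    if client is None:
        return 401, "invalid or missing token"   # request never reaches backends
    return 200, f"forwarded for {client}"
```

The point is that every microservice behind the gateway can skip this check, because unauthenticated requests are rejected at the front door.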
Can you explain rate limiting and API throttling in the context of an API Gateway?
-Rate limiting is a feature that sets rules to manage the maximum number of concurrent requests an API Gateway can handle before returning a 429 error. API throttling is a more granular level of control that can limit the request rate for individual users or applications, blocking them once they exceed the allowed rate.
What is service discovery and how does it interact with an API Gateway?
-Service discovery is a system that keeps track of the location (IP address and port) of microservices as they scale up or down. The API Gateway uses service discovery to find the current location of a microservice before invoking it, ensuring that the request is routed to the correct instance.
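The register/deregister approach can be sketched as a small in-memory registry; the service names and addresses are made up for illustration:

```python
import random

class ServiceRegistry:
    """Sketch of service discovery: instances register/deregister themselves;
    lookup returns one live instance for the gateway to call."""
    def __init__(self):
        self.instances = {}   # service name -> set of "host:port"

    def register(self, service: str, address: str):
        self.instances.setdefault(service, set()).add(address)

    def deregister(self, service: str, address: str):
        self.instances.get(service, set()).discard(address)

    def lookup(self, service: str) -> str:
        live = self.instances.get(service)
        if not live:
            raise LookupError(f"no live instance of {service}")
        return random.choice(sorted(live))   # naive instance selection
```

Real systems (Eureka, Consul, etc.) add health checks and replication on top, but the gateway-facing contract is essentially this lookup.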
How does an API Gateway handle millions of requests per second?
-An API Gateway can handle high volumes of requests by being deployed in multiple regions and availability zones, with each region having multiple instances of the Gateway. This distributed architecture ensures that there is no single point of failure and allows the Gateway to scale horizontally to meet demand.
What is the role of DNS in distributing traffic to different API Gateways?
-DNS plays a crucial role in load balancing at the domain name level. Services like AWS Route 53 or Azure Traffic Manager act as DNS-based load balancers, distributing traffic to the appropriate API Gateway based on factors like latency, compliance, and geographical location.
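The "pick the best region, subject to compliance" decision can be sketched as below; the region names, latency figures, and compliance rule are all invented for illustration:

```python
# Illustrative measured latencies from a client to each regional gateway (ms).
REGION_LATENCY_MS = {"mumbai": 12, "chennai": 25, "frankfurt": 110}

# Illustrative compliance rule: traffic from some countries must stay in-region.
COMPLIANCE = {"DE": {"frankfurt"}}

def pick_region(country: str, latency=REGION_LATENCY_MS) -> str:
    """Compliance constraints first, then lowest latency among what remains."""
    allowed = COMPLIANCE.get(country, set(latency))
    return min(allowed, key=lambda r: latency[r])
```

This mirrors the video's point that a compliance-bound user is routed to a farther region even when a lower-latency one exists.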
How does the API Gateway decide which microservice to route a request to?
-The API Gateway decides which microservice to route a request to based on the API endpoint. It examines the structure of the incoming API request and uses this information to determine the appropriate backend service to handle the request.
What are the benefits of using an API Gateway over a traditional load balancer?
-API Gateways offer intelligent routing based on API structure, support for API composition to aggregate data from multiple services, authentication, rate limiting, and service discovery. These features provide more flexibility and control over API traffic compared to traditional load balancers, which only distribute traffic without understanding the API context.
Can you provide an example of how API composition might work in a real-world scenario?
-In an e-commerce platform, when a user requests their order history, the API Gateway can compose a response by calling the product and invoice microservices to fetch relevant details. If the request comes from a mobile device, it might show only product and invoice details, while a request from a PC might include additional information like ratings, reviews, and recommendations, all aggregated into a single response by the API Gateway.
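That device-dependent fan-out can be sketched as follows; the three fetchers are hypothetical stubs standing in for calls to separate microservices:

```python
# Stubs standing in for the product, invoice, and review microservices.
def fetch_products(user): return {"products": ["laptop", "mouse"]}
def fetch_invoices(user): return {"invoices": ["INV-1", "INV-2"]}
def fetch_reviews(user):  return {"reviews": ["5 stars"]}

def my_orders(user: str, device: str) -> dict:
    """Gateway-side composition for a hypothetical /api/my-order endpoint."""
    calls = [fetch_products, fetch_invoices]
    if device == "pc":                # richer page on desktop clients
        calls.append(fetch_reviews)
    result = {}
    for call in calls:                # a real gateway would fan out concurrently
        result.update(call(user))
    return result                     # one aggregated response to the client
```

The client makes a single call either way; only the gateway knows how many backends were involved.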
What are some other capabilities of API Gateways mentioned in the script?
-Besides routing, API composition, authentication, rate limiting, and service discovery, API Gateways can also perform request and response transformation, caching of responses to reduce load, and logging for monitoring and debugging purposes.
Outlines
🚀 Introduction to API Gateway
The speaker, Shan, introduces the topic of API Gateway in the context of system design. He explains that API Gateway is a critical component that acts as a single entry point for all client API requests. It routes these requests to the appropriate backend service based on the API endpoint. This is different from a load balancer, which simply distributes traffic across instances of a microservice without understanding the API itself. API Gateway is described as intelligent, capable of handling more complex tasks such as API composition, where it can assemble responses from multiple microservices based on the client's device type, thus reducing client-side complexity.
🔐 API Gateway Features: Authentication and Rate Limiting
Shan discusses additional features of API Gateway, including authentication and rate limiting. Authentication is handled by the API Gateway, which can validate access tokens from an authorization server, ensuring that only authenticated clients can access the microservices. Rate limiting is another feature that prevents abuse by setting limits on the number of requests a client can make within a certain time frame. This can be applied globally or to specific APIs or users, helping to manage traffic and prevent overloading of services.
🌐 Service Discovery and Handling High Traffic
This section delves into service discovery, which is essential for tracking the locations of microservices as they scale up or down. Instances can register and deregister themselves, or the discovery system can perform health checks so that only active instances are considered. Either way, the API Gateway can route requests to live microservice instances. Shan also begins to explain how the API Gateway handles high traffic, introducing the distribution of requests across multiple regions and availability zones so that there is no single point of failure.
🌆 Regional Distribution and Redundancy
Shan elaborates on the concept of regions and availability zones in cloud infrastructure. He explains that each availability zone within a region has its own data center, and if one goes down, the others can handle the traffic. This setup ensures redundancy and high availability. API Gateway operates at the regional level, routing traffic to the appropriate availability zone based on factors like user location and service requirements. This approach helps distribute the load and avoid a single point of failure.
🌐 Global Traffic Management and DNS Load Balancing
In the final paragraph, Shan addresses how traffic is distributed globally across different regions and API Gateway instances. He introduces the concept of DNS-based load balancing, which directs traffic to the appropriate API Gateway based on factors like latency and compliance requirements. This ensures that users are served by the closest and most suitable API Gateway, improving performance and reliability. Shan also mentions that DNS is not a single point of failure due to its distributed nature, which will be covered in more detail in a future video.
Keywords
💡API Gateway
💡Load Balancer
💡API Composition
💡Authentication
💡Rate Limiting
💡Service Discovery
💡Request/Response Transformation
💡Caching
💡DNS-based Load Balancer
💡Regions and Availability Zones
Highlights
API Gateway is a critical component in system design, serving as a single entry point for client API requests.
API Gateway routes API requests to the correct backend service based on the endpoint.
Load balancers distribute traffic to multiple instances of a microservice, but do not understand API structure.
API Gateway can compose APIs, reducing client-side complexity by handling multiple microservice calls.
API Gateways handle authentication, integrating with authorization servers to validate client tokens.
Rate limiting and API throttling are used to manage traffic and prevent abuse.
Service Discovery is crucial for keeping track of microservice locations in a dynamic environment.
API Gateways can transform requests and responses to meet specific needs.
Caching responses at the API Gateway can improve performance by reducing the need to invoke APIs for repeated requests.
Logging is an important feature of API Gateways for monitoring and debugging purposes.
API Gateways handle millions of requests per second by leveraging regions and availability zones.
Regions and availability zones provide redundancy and prevent single points of failure.
DNS-based load balancing is used to distribute traffic between different regions and API Gateways.
DNS is not a single point of failure due to its distributed nature with local and authoritative servers.
Service Discovery provides the location for microservices to the API Gateway for routing requests.
Load balancers work at the microservice level, distributing traffic among instances.
API Gateways decide which load balancer to invoke based on the API request.
Transcripts
Hey guys, Shan here, and welcome to Concept and Coding. Today in the system design playlist I'm going to cover a very important new topic: the API Gateway. Many questions are built on top of it, and I'll walk through the complete overall design and where the API Gateway fits. Two of the most frequently asked interview questions are: if the API Gateway is a single entry point, how does it handle millions of requests per second? And how is an API Gateway different from a load balancer — does it come before or after the load balancer? Trust me, many engineers have doubts about this one.

What is an API Gateway? In simple terms, it accepts client API requests and routes them to the correct backend service based on the API endpoint. Read that line again, because this is what makes it different from a load balancer. Say a client sends a request and the API Gateway sits in the middle. The gateway decides: if the path is /api/invoice, it has to route to the invoice microservice; if the API has the order structure, it routes to the order microservice; if it has the sales structure, it routes to the sales microservice. It routes each API call to the correct backend service.

Isn't this what a load balancer also does? No. Generally, a load balancer simply distributes incoming traffic evenly across multiple instances of the same microservice — invoice microservice instance 1, instance 2, instance 3, and so on. It does not have the capability to understand the API and then decide where to route. A load balancer cannot look at the path, see /invoice, and decide to send the request to the invoice service rather than the order service. That is the major difference between an API Gateway and a load balancer.

Apart from routing — deciding, based on the endpoint, which microservice to invoke — the API Gateway has many other capabilities; it is very intelligent compared to a load balancer. A fun one is API composition. Let me first show the problem it solves. Say you open an e-commerce website and click "view my orders" to see your past purchase history. If you do that from a mobile device, which has less bandwidth and screen space, we show fewer details — say only product details and invoice details. So the mobile client calls, as an example, one API on the product microservice and one on the invoice microservice, fetches both, and shows the details. But if you use a personal computer, there is more space and bandwidth, so we might show more: apart from product and invoice, the same page also shows ratings and reviews, and perhaps recommendations. Even though it is the same page, the functionality differs by device — and on the PC client you need to query more APIs, possibly on different microservices, to fetch those details.

That is an additional headache on the client, and the API composition feature of the API Gateway simplifies it. In the gateway we can configure an endpoint, say /api/my-order. When the client calls it, the API Gateway has the intelligence that if the caller is a mobile device, it calls the two APIs on the two microservices and fetches the product and invoice details; if it is a personal computer, it calls the additional APIs as well, gathers the results, and returns a single response. The client's work and complexity are reduced a lot. This feature is known as API composition, it is very important, and it is very frequently used — Netflix's API Gateway makes heavy use of it.

Then we have another feature of API Gateways: authentication. Say a client is calling through an API Gateway. The gateway also has the capability to authenticate the client. If it is using the OAuth 2.0 flow, in the first step the client gets an access token from the authorization server (if you don't know the OAuth 2.0 flow, check out this HLD playlist — I have covered it in depth, so it will become clear to you). On any subsequent request, the client passes this token to the API Gateway. The gateway integrates with the authorization server and validates it: "this client is trying to access an API and gave me this token — is it valid or not?" Only if the authorization server says it is valid does the API Gateway allow the request to pass to the microservices. Instead of putting authentication logic in every microservice and duplicating it, we put authentication at the front; only after authentication does the request go further.

The third important feature of the API Gateway is rate limiting. We can set various rules, like managing the burst limit — the maximum number of concurrent requests the API Gateway can handle at peak before it returns 429 Too Many Requests. If you use an AWS or Azure API Gateway, you have an option to set a burst limit to some value, say 500. Then we have API throttling, which I'd say is more granular: we can limit a particular individual or application by blocking them once they cross an allowed request rate. For example, I can say that /api/invoice should not get invoked more than 10 times in a minute; as soon as the 11th call comes within that minute, it is blocked. You can go even more granular: 10 times a minute per particular user. All these rules come under API throttling. You can also set rules for IP-based blocking, so a particular IP gets blocked, and there are API queues, which you could call a part of rate limiting: they hold requests that cannot be processed immediately, which helps handle the thundering-herd problem. Say 1,000 requests come in but the burst limit is only 500; the 500 requests that cannot be processed can be put into an API queue, where they wait until the API Gateway has the bandwidth to process them. So rate limiting is also a very important feature.

Another very important feature is service discovery. Understand that you have multiple components — microservices, load balancers — with multiple instances of each, and in today's distributed world they scale up and down, so their IP addresses and port numbers keep changing. You need somebody to keep track of their locations, and that is service discovery: since microservices can scale up and down, it is necessary to know each one's location (IP address and port), and service discovery keeps track of it. There are two approaches. In the first, whenever a microservice instance scales up or down, it has to register or deregister itself with service discovery; only then does service discovery know its location. In the second, service discovery performs health checks on all registered microservices and keeps only the locations of active ones: it frequently checks the heartbeat of each registered instance, and if it can no longer see an instance's heartbeat, it removes that instance from its records. Then, when anybody asks "give me a location for the invoice microservice," service discovery provides one of the live instances.

So here is the flow: a client calls an API, which goes to the API Gateway. The gateway has the logic that /order means it has to invoke the order microservice, but it has to know the location before invoking it. How does the gateway know? It calls service discovery. There is software like Zuul and Eureka that serves in this role; the API Gateway checks with it — "I need a location for the order microservice" — service discovery gives a location, and the gateway passes the request to the proper microservice. Sometimes service discovery is a separate service, and sometimes the functionality is built into the API Gateway itself.

Those four are the major capabilities of the API Gateway, but there are others, like request/response transformation: you can transform the incoming request and the outgoing response according to your company's needs. You can even cache the outgoing response, so that the next time the same request comes you don't have to invoke the API at all — you respond directly from the gateway. And you can add logging.

Now, if the API Gateway is a single entry point, how does it handle millions of requests per second? Before I go to the diagram, let me sketch it roughly. So far we have microservices — say an invoice microservice with multiple instances, and an order microservice with its own multiple instances. Who takes care of distributing the traffic of a single microservice across its multiple instances? The load balancer. So you have one load balancer distributing traffic between the instances of the invoice microservice, and similarly one load balancer for the instances of the order microservice. Who routes traffic to a load balancer? The API Gateway. Whenever a request comes to the API Gateway — say /api/invoice — it checks with service discovery, which gives it the address of the load balancer it has to interact with in this scenario. If it is /api/order, service discovery gives the address of the other load balancer. That load balancer then distributes the incoming traffic among the instances of its microservice.

Now let me extend this to the next level and bring in a couple of new concepts: region and availability zone (AZ). To answer how the gateway handles millions of requests per second — and whether it really is a single entry point — we first have to understand regions and AZs. Consider Mumbai as one region. An AZ is a particular area inside that region — say Bandra is one area, and some other area is another. Each AZ has a dedicated data center, so this region, Mumbai, has two availability zones, each with its own data center, and they do not share any resources. If one AZ goes down, the whole region is not down: its traffic moves to the other AZ. Only when both go down do we say region one is down. Companies also have other regions — region two, say Chennai, with its own multiple AZs. So even if one AZ goes down, the region has another; and even if all availability zones in a region go down, there is still no single point of failure, because we have multiple regions.

Now the same picture again, extended. You have microservice one with multiple instances, and a load balancer controlling the traffic across them. Similarly you have a different microservice — order microservice two — with its own instances and its own load balancer. All of this sits inside one availability zone. In AZ 2 you have the same structure: the same microservices, their multiple instances, and load balancers. The API Gateway operates at the region level. Any request that comes to the API Gateway is checked against service discovery, which applies many criteria and rules — say the user belongs to area one and the nearest availability zone is AZ 1 — and gives that location to the API Gateway, which routes the traffic to that availability zone. Then, based on the API, the appropriate load balancer gets invoked: for an invoice API, the load balancer fronting the invoice microservice; for an order API, the one fronting the order microservice. Similarly, if the user is nearest to the other availability zone, the API Gateway gets that AZ's location from service discovery, and again the appropriate load balancer is invoked.

This is region one; similarly you can have multiple regions — region two, possibly region three — with the same structure inside each. One thing is very clear: there is no single point of failure. If the load balancers in AZ 1 fail — meaning that AZ is down — AZ 2 takes care of the traffic. If both AZs fail, the whole region has failed, and then region two comes into the picture and takes over.

Now the question that might be coming to you: we thought the API Gateway was a single entry point, but now we have different regions with different API Gateways — how is the request divided between regions? Here, a load balancer distributes traffic between instances of a single microservice; similarly, the API Gateway is itself a kind of software with multiple instances in different regions. Who distributes the traffic to them — a load balancer? Kind of. This is where DNS comes into the picture. If you are using AWS, there is AWS Route 53, which you can consider a DNS-based load balancer; if you are using Azure, there is Azure Traffic Manager, again a DNS-based load balancer. Whenever the client makes a call, the DNS-based load balancer distributes the traffic to the appropriate API Gateway, with intelligence similar to service discovery about which traffic should go to which region — depending on latency, compliance, and so on. Maybe a certain country's traffic must not go to one region and can only go to another; no matter that the allowed region is farther away, because of compliance the traffic has to move there. So it is a very intelligent DNS-based load balancer, and it routes traffic between multiple regions so the appropriate API Gateway gets the request.

If you are curious, another question might come to you: isn't DNS a single point of failure? What if the DNS goes down — who will take care of moving the traffic to the API Gateways? No — this is again a very big topic, but DNS is not a single point of failure. Whenever you type, say, example-xyz.com, it has to be converted to an IP address, and DNS does that; but there is no single instance of it. There is a hierarchy — local DNS, root DNS, top-level DNS, then authoritative DNS. Maybe in a future system design video I will explain DNS in depth. If you have any doubts or questions we can discuss further, and I will cover DNS in the second part. All right, thank you, bye.