Microservices with Databases can be challenging...
Summary
TLDR: This video explores the complexities of managing databases in a microservices architecture, highlighting anti-patterns like shared databases and advocating for separate databases per service. It delves into transaction management, introducing patterns like two-phase commits and the saga pattern, which help maintain consistency across distributed systems. The script also covers API composition, CQRS for scalability, and event sourcing for tracking changes, all while emphasizing the importance of proper documentation and design patterns in microservices.
Takeaways
- 😀 Microservices offer numerous benefits, but managing storage and databases within a microservice architecture can be challenging.
- 🔄 The video discusses anti-patterns, such as the shared database pattern, which reduces modularity and scalability and can lead to issues like deadlocks.
- 🚫 Deadlocks occur when multiple transactions wait for resources locked by each other, leaving all of them stuck unless the database detects the cycle and aborts one of them.
- 🌐 The video recommends using separate databases for each microservice to improve decoupling and avoid technical issues like deadlocks.
- 🔄 Distributed transactions are complex in microservices architecture, and the two-phase commit protocol is introduced as a method to ensure consistency.
- 🤔 The two-phase commit can be cumbersome with many microservices, leading to the introduction of the saga pattern, which handles transactions in an event-driven manner.
- 🛑 The saga pattern involves message brokers and allows for rollbacks in case of transaction failures, but it requires implementing custom rollback functions.
- 🔍 API composition is presented as a valid pattern where a parent service aggregates data from other services, simplifying querying processes.
- 📈 To improve throughput, Command Query Responsibility Segregation (CQRS) is suggested, which separates read and write operations into different databases.
- 📝 Event sourcing is highlighted as a pattern suitable for tracking all changes, especially useful for compliance reasons in certain industries like finance or government.
- 📚 The video concludes with the use of an AI tool to generate documentation for microservices and databases, showcasing its capabilities for developers and architects.
Q & A
What are the benefits of using microservices architecture?
-Microservices architecture brings benefits such as modularity, scalability, and isolation, allowing for easier scaling and management of different services.
Why is the shared database pattern considered an anti-pattern in microservices?
-The shared database pattern is considered an anti-pattern because it loses the benefits of modularity and scalability. It can lead to issues like deadlocks and challenges in maintaining consistency and design across different teams.
What is a deadlock in the context of database management?
-A deadlock occurs when two or more transactions are waiting for each other to release resources such as locks on database objects, resulting in none of the transactions being able to proceed.
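The circular wait described above can be pictured as a cycle in a "wait-for" graph, where each transaction points at the transactions whose locks it needs. The following is a minimal sketch under that assumption; the function name `find_deadlock` and the transaction names are hypothetical and do not come from the video.

```python
from typing import Dict, List, Set


def find_deadlock(waits_for: Dict[str, List[str]]) -> List[str]:
    """Return one cycle of transactions in the wait-for graph, or [] if there is none."""
    visiting: Set[str] = set()   # transactions on the current search path
    done: Set[str] = set()       # transactions already proven cycle-free
    path: List[str] = []

    def visit(tx: str) -> List[str]:
        visiting.add(tx)
        path.append(tx)
        for other in waits_for.get(tx, []):
            if other in visiting:
                # We came back to a transaction on the current path: circular wait.
                return path[path.index(other):]
            if other not in done:
                cycle = visit(other)
                if cycle:
                    return cycle
        visiting.discard(tx)
        done.add(tx)
        path.pop()
        return []

    for tx in waits_for:
        if tx not in done:
            cycle = visit(tx)
            if cycle:
                return cycle
    return []


# payment_tx waits for a lock held by user_tx, and user_tx waits for payment_tx.
deadlocked = find_deadlock({"payment_tx": ["user_tx"], "user_tx": ["payment_tx"]})
# Here user_tx is not waiting for anything, so payment_tx only has to wait its turn.
waiting_only = find_deadlock({"payment_tx": ["user_tx"], "user_tx": []})
```

The second call is the important contrast: one transaction waiting for another is ordinary lock contention, and it only becomes a deadlock when the waiting is circular.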
How can separate databases for each microservice improve the architecture?
-Separate databases for each microservice allow for better decoupling and easier scaling. They prevent issues like deadlocks from affecting multiple services and enable independent scaling of services and their databases.
What is a distributed transaction and how can it be challenging in microservices?
-A distributed transaction involves multiple services updating different databases simultaneously. It's challenging in microservices because it requires ensuring atomicity and consistency across different services, which is not straightforward when using separate databases.
What is a two-phase commit and how does it help with distributed transactions?
-A two-phase commit is a method to ensure consistency in distributed transactions. It involves a 'prepare' phase where services are asked to prepare for a commit, and a 'commit' phase where, if all services are ready, the transaction is committed across all services.
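The prepare and commit phases can be sketched with an in-memory coordinator. This is an illustration only, assuming each service exposes `prepare`, `commit` and `rollback` operations; the `Participant` class and `two_phase_commit` function are hypothetical names, and no real database or network call is involved.

```python
class Participant:
    """Stand-in for a service that holds an open local transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: do the work inside an open transaction and vote yes or no.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self) -> None:
        # Phase 2 (success path): make the prepared work permanent.
        self.state = "committed"

    def rollback(self) -> None:
        # Phase 2 (failure path): discard the prepared work.
        self.state = "rolled_back"


def two_phase_commit(participants) -> bool:
    """Commit everywhere only if every participant voted yes in the prepare phase."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


happy = [Participant("payment"), Participant("user")]
happy_result = two_phase_commit(happy)

unhappy = [Participant("payment"), Participant("user", can_commit=False)]
unhappy_result = two_phase_commit(unhappy)
```

In the second run the user service votes no, so the payment service rolls back as well, even though its own work would have succeeded.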
What is the Saga pattern and how does it differ from the two-phase commit?
-The Saga pattern is an alternative to the two-phase commit for handling distributed transactions. It involves a series of local transactions where each transaction produces an event that triggers the next transaction. If a step fails, the steps that already completed are undone by rollback functions that you have to implement yourself, rather than by a coordinator-driven commit phase.
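A saga's chain of local transactions with hand-written rollbacks can be sketched as follows. This is a simplification: the video describes services reacting to events on a message broker, whereas here a plain loop stands in for the broker, and every function name (`run_saga`, `create_order`, `cancel_order`, and so on) is hypothetical.

```python
def run_saga(steps):
    """Run (name, action, compensate) steps in order; undo completed steps on failure."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"{name}: done")
            completed.append((name, compensate))
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            # Compensate in reverse order, newest completed step first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
    return True, log


state = {"order": "none", "payment": "none"}


def create_order():
    state["order"] = "created"


def cancel_order():
    state["order"] = "cancelled"


def charge_payment():
    raise RuntimeError("card declined")


def refund_payment():
    state["payment"] = "refunded"


ok, log = run_saga([
    ("order", create_order, cancel_order),
    ("payment", charge_payment, refund_payment),
])
```

Because the payment step never completed, only the order step is compensated. The compensation is ordinary application code, which is the extra implementation overhead the pattern brings.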
What is API composition and how is it used in microservices?
-API composition is a pattern where a parent service aggregates data from other services by making separate API calls to each service. It simplifies querying data from multiple services and databases without the need for a shared database.
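As a rough sketch of API composition, the parent service below calls two other services and merges their answers. The functions `fetch_payment` and `fetch_user` are hypothetical stand-ins for HTTP calls to the payment and user services, and the dictionaries stand in for their private databases.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data owned by each service; the order service never reads these directly.
PAYMENTS = {42: {"status": "paid", "amount": 19.99}}
USERS = {42: {"name": "Tim"}}


def fetch_payment(user_id: int) -> dict:
    """Stand-in for an API call to the payment service."""
    return PAYMENTS[user_id]


def fetch_user(user_id: int) -> dict:
    """Stand-in for an API call to the user service."""
    return USERS[user_id]


def get_order_summary(user_id: int) -> dict:
    """Order service: query both services concurrently and aggregate the results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        payment_future = pool.submit(fetch_payment, user_id)
        user_future = pool.submit(fetch_user, user_id)
        return {"user": user_future.result(), "payment": payment_future.result()}


summary = get_order_summary(42)
```

Issuing the two calls concurrently is a design choice of this sketch, not something the video prescribes; calling the services one after another composes the same result with higher latency.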
What is CQRS and how does it improve throughput?
-CQRS stands for Command Query Responsibility Segregation. It separates read and write operations into different models, allowing for the scaling of read operations independently from writes, thus improving throughput.
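The read/write split can be sketched with one write store and several read replicas. This is an in-memory illustration under assumed names (`OrderStore`, `handle_command`, `handle_query`, `replicate`); a real system would use separate database instances and its own replication mechanism.

```python
import itertools


class OrderStore:
    """Commands go to one write database; queries are spread over read replicas."""

    def __init__(self, replica_count: int = 2) -> None:
        self.write_db = {}
        self.read_dbs = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def handle_command(self, order_id: int, status: str) -> None:
        # Writes only ever touch the write database.
        self.write_db[order_id] = status

    def replicate(self) -> None:
        # Stand-in for replication from the write database to every read replica.
        for replica in self.read_dbs:
            replica.clear()
            replica.update(self.write_db)

    def handle_query(self, order_id: int):
        # Reads rotate across replicas, so adding replicas adds read throughput.
        replica = self.read_dbs[next(self._next_replica)]
        return replica.get(order_id)


store = OrderStore(replica_count=2)
store.handle_command(1, "placed")
stale = store.handle_query(1)   # replication has not run yet
store.replicate()
fresh = store.handle_query(1)   # replicas have caught up
```

The gap between `stale` and `fresh` is the consistency question the video raises: until replication runs, readers see an older view of the data.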
What is event sourcing and why is it useful for tracking changes?
-Event sourcing is a pattern where every change to the state of an application is stored as a sequence of events. It is useful for tracking changes because it maintains a log of all transactions, allowing for the reconstruction of past states and compliance with tracking requirements.
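The idea of rebuilding state from an append-only log can be sketched in a few lines, using the Tim-to-John rename from the video. The event names and the `replay` function are hypothetical, and a plain list stands in for the event log.

```python
def replay(events, upto=None):
    """Rebuild the user table by applying events in order (optionally only the first `upto`)."""
    users = {}
    for kind, user_id, data in events[:upto]:
        if kind == "create_user":
            users[user_id] = dict(data)
        elif kind == "update_user":
            users[user_id].update(data)
        elif kind == "delete_user":
            users.pop(user_id, None)
    return users


# Events are only ever appended, never edited in place.
events = [
    ("create_user", 1, {"name": "Tim"}),
    ("update_user", 1, {"name": "John"}),
]

current = replay(events)              # state after all events
original = replay(events, upto=1)     # state as it was after the first event
```

An in-place update would have overwritten "Tim"; with the log, the earlier name remains recoverable by replaying fewer events.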
How can documentation tools like Eraser's AI assist in creating documentation for microservices?
-Documentation tools like Eraser's AI can generate outlines and comprehensive documentation based on specified topics and requirements. They can help developers and architects by providing a structured overview of best practices, patterns, and anti-patterns related to microservices and databases.
Outlines
😀 Microservices and Database Challenges
This paragraph introduces the complexities of integrating databases within a microservices architecture. It highlights the benefits of microservices but points out the difficulties in managing shared databases. The speaker discusses anti-patterns, such as the shared database pattern, which can lead to loss of modularity and scalability. Deadlocks and transaction issues are also mentioned as common problems when multiple services interact with the same database. The importance of understanding eventual consistency and proper database coupling is emphasized.
🔄 Distributed Transactions and Patterns
The second paragraph delves into the challenges of handling distributed transactions across separate microservices databases. It explains the concept of a two-phase commit as a solution to ensure atomicity in transactions. The process involves a 'prepare' phase, where services are asked to prepare for a commit, followed by a 'commit' phase if all services are ready. The paragraph also introduces the saga pattern as an alternative for handling transactions in a more scalable and event-driven manner, which can be more complex but also more flexible for larger systems.
📚 API Composition and CQRS
This paragraph discusses API composition as a method to aggregate data from different microservices, which is a valid pattern for microservices architecture. It also introduces the concept of Command Query Responsibility Segregation (CQRS), which separates read and write operations to databases. The benefits of having separate databases for reads and writes are explained, such as improved scalability and throughput. The paragraph also touches on the challenges of maintaining consistency between read and write databases and the potential solutions, including immediate replication or using event sourcing.
🗂 Event Sourcing for Data Integrity
The fourth paragraph explores the event sourcing pattern, which is particularly useful for applications requiring a log of all changes for compliance or historical tracking. Event sourcing appends events to a database, allowing for a detailed history of transactions. The architecture implications of event sourcing are discussed, including the use of event logs like Kafka and the creation of a read database that can be updated with snapshots of the application state. The paragraph also mentions the complexity of implementing event sourcing and its suitability for specific use cases.
📝 Documentation with AI Assistance
The final paragraph showcases the use of AI for generating documentation, specifically for the topic of microservices and databases. The speaker uses the tool Eraser to create an outline for documentation that covers best practices, patterns, and anti-patterns related to microservices and databases. The AI tool is praised for its ability to generate comprehensive documentation tailored to developers and architects, and the speaker encourages viewers to try the tool for its efficiency and utility.
Keywords
💡Microservices
💡Storage and Databases
💡Design Patterns
💡Anti-Patterns
💡Eventual Consistency
💡Deadlocks
💡Distributed Transactions
💡Two-Phase Commit
💡Saga Pattern
💡API Composition
💡CQRS
💡Event Sourcing
Highlights
Microservices offer numerous benefits, but managing storage and databases within a microservice architecture can be complex.
Different design patterns exist for coupling databases with microservices, including anti-patterns to avoid.
The 'shared database' pattern is an anti-pattern, leading to loss of modularity, scalability, and potential deadlocks.
Using separate databases for each microservice improves decoupling and scalability, avoiding technical issues like deadlocks.
Distributed transactions in microservices require special handling, such as the two-phase commit protocol to ensure atomicity.
The saga pattern is an alternative for handling transactions across multiple microservices, using a series of local transactions.
API composition is a method for aggregating data from multiple services, which is a valid approach in microservices architecture.
Command Query Responsibility Segregation (CQRS) separates read and write operations to databases, improving throughput.
Event sourcing is a pattern that logs every change as an event, useful for tracking and compliance in certain domains.
The use of a message broker in the saga pattern facilitates coordination between microservices during transactions.
Automatic replication from a write database to read databases can be set up to ensure data consistency with less immediate overhead.
Rollbacks in the saga pattern require custom implementation, adding overhead but ensuring transaction consistency.
Materialized views can be used to read data from a database, updating periodically from event logs.
Eraser.io is highlighted as a tool for creating and modifying AI-generated diagrams for microservices architecture.
The video also demonstrates using Eraser.io to generate documentation outlines for microservices and databases.
Documentation is crucial for developers and architects to understand best practices and patterns in microservices.
The video concludes by showcasing the generated documentation outline, emphasizing the utility of Eraser.io's AI features.
Transcripts
microservices bring a lot of benefits
there's no doubt about that but at the
same time it gets kind of tricky when
you're trying to deal with storage and
databases within your microservice
architecture because the question is how
do you actually properly couple them
together well turns out there are
different design patterns for that that
we're going to learn in this video and
to start with some anti-patterns
meaning how not to do that and of course
we're going to learn very important
things such as eventual consistency
Deadlocks transactions and so on over
the course of this video so if you're
ready buckle up and let's get started
all right folks so we're going to start
with the very first pattern or rather an
anti-pattern and I'm going to explain
why which is called shared database
pattern and I'm going to use Eraser's AI
diagrams here to save time because I'm
lazy and I'm simply going to tell it
that it's a cloud architecture and paste
this text so I'm going to create two
microservices payment and user services
that are two instances that talk to the
same shared database separately let's
click generate and let the AI do its
thing so as you can see we have a shared
database and our services are talking to
it separately now why is this considered
an anti-pattern well we're losing one of
the best features of microservices which
is modularity and scalability and or you
can say isolation instead of modularity
so usually whenever you have a separate
database attached to your service you
can scale much more easily you can simply
create new instances automatically which
is autoscaling but in this case we're
bound to the same database so apart from
the human problems or human aspect which
is two teams trying to manage the same
database and which is going to lead to
worse consistency and design because
yeah maybe sometimes it's not that easy
to align with different teams on how to
design your database but apart from that
working on the same database with
different microservices can actually
introduce technical issues as well for
example Deadlocks what's a deadlock
let's read about it in a database
management system a deadlock occurs when
two or more transactions are waiting for
each other to release resources such as
locks on database objects that they need
to complete their operation as a result
none of the transactions can proceed
leading to a situation when they are
stuck or deadlocked so basically when a
payment service tries to make like a big
update on the database it's going to
lock the table so that there's no other
service that kind of modifies this data
while the update is happening which can
lead to data inconsistency or loss
that's why it's going to lock this table
and then at the same time user service
tries to update this table and it fails
because the database or the database
table is under a deadlock so as you can
see there are many reasons why not to
use a shared database for your
microservice of course you can go with
this if you have a smaller application
and not that much load on your
platform but if you're trying to scale
in the future definitely don't use
shared database all right instead what
you can do is the following so let's
clean this up a bit I'm going to move
this here and I'm going to remove my
lines and arrows so what we can do is
actually eraser allows us to modify this
AI architecture or use this AI tool
to modify our diagram by simply editing
a prompt so I'm going to say now use a
separate database for each micro service
and let's see what it's going to
do all right it recreated the diagram
and it created separate databases I'm
somehow getting fascinated by how smart
the AI is when drawing diagrams so now
this is a better pattern because we can
scale separately so we can scale
this service and a database separately
now it's probably not depicted here but
payment service and payment database are
going to be living in separate Docker
containers so usually it's not a good
practice to pack them all together
within one Docker container but if aside
from that now we have better decoupling
so the requests coming here are going to
take their own time even if this
database gets deadlocked
there's no issue that user service is
going to suffer from this all right this
is a better architecture when you're
trying to deal with microservices now
there are some things called
transactions as you can see this diagram
is kind of cool we have separation but
how do we actually deal with
transactions so let's say we have
another service so let's modify and say
order service is going to call both
payment and user service so let's
generate another service that's going to
connect to our two services like this
now imagine this diagram order service
wants to update the payment information
and the user information at the same
time this is called a distributed
transaction now how do we deal with
distributed transaction because if we
were having a shared database we could
easily do a transaction within one
shared database and maybe lock it of
course which introduces a problem but
now we cannot the order service cannot
simply do a transaction saying hey user
service do your thing and at the same
time payment service do your thing and
we're just safe all right because what
if this transaction that is Atomic so
follows the ACID principle the
atomicity means that whatever started
can be rolled back all right so if we're
trying to assure that both of these
databases are going to be eventually
updated what if one of them actually
errors out and but user service proceeds
so the user database is going to be
updated but the payment information is
going to be old so we don't want that
but how can you ensure this when you're
having microservices this is one of the
issues that comes up whenever we deal
with distributed transactions and
there is one workaround for this called
a two-phase commit what a two-phase
commit means is that we're going to have
two phases in order to ensure
consistency within our transaction all
right so what this is going to look like is
the order Service First is going to
issue a uh one phase which is called
prepare so it's going to tell user
service hey guys prepare for a
database update and the payment services
are going to start a transaction to
begin so if we look here I'm looking at
Postgres so Postgres lets you do
transactions in the database like this
so you can start with a begin query and
then you can Define your query and then
before doing an in or you can Define
your query with an insert and then
committing is already the second phase
that we're talking about all right so
before committing if there are no errors
happen until this point this payment
service can tell back to the order
service that hey everything is looking
good there are no errors there are no
Deadlocks so I can actually commit as
soon as the order service knows that the
user service and payment service are
ready to commit it issues the second
phase which is the actual commit phase
and then the second phase lets the user
service and payment service know that
hey you have had these you already have
these queries ready just commit them all
right we're going to commit them and
everything is good but if one of them
fails the user service says hey I'm not
ready to commit so what we're going to
do is we're going to roll back
meaning we're going to close this
transaction okay how cool is that so
this is one of the ways of dealing with
distributed transactions but there's one
problem so in our use case we have only
two microservices what happens if we
have 10 other microservices that the
user service is talking to so imagine we
have another inventory service we have
another I don't know customer service
and so on it's very hard to orchestrate
all of that so order service is going to
issue kind of first phase to all of them
and then it's basically going to be a
mess until you wait for the second phase
that's why people came up with another
pattern called a saga pattern so I'm not
talking about Harry Potter or Star Wars
but it's literally called Saga pattern
kind of means that these transactions
are going to be related to each other so
for this I'm actually going to create
some microservices let's say this one is
order and this one is payment this one
is going to be user so we're going to
update data in all of those databases
and that's one is going to be inventory
all right so user and inventory and they
of course have their respective
databases so let's take a database here
and place a database for all of them
here so they're going to be living in
the bottom of these services and here
and here so the Saga pattern is going to
look the following way and this is by
the way our second pattern that we can
use so the Saga pattern is going to have a
message broker so something that we
already covered in one of our previous
videos this can be a RabbitMQ all right
so we're going to have a broker such as
RabbitMQ and this is our broker so
what's going to happen is the order
service is going to issue an event that
hey I need to update something to the
message broker the payment service is
subscribed to this it's going to pick it
up and as soon as it's processed it's
going to issue another event that the
payment has been processed then the user
picks it up and then the user issues
another event that it's done updating
its database here and then eventually
the inventory picks it up and updates
its database here so as you can see all
of them are updating their databases one
after another and they're doing this in
an event-driven fashion so how do we
deal with rollbacks then well the
rollbacks are going to work the
following way if this one fails if the
payment fails then of course user and
inventory are not going to be able to
update their databases which is good for
us but we can also roll back the order
updates so the update on the database is
going to be rolled back but not so fast
because it's not so easy to implement
that this literally means that you need
to implement all of those rollback
functions yourself so there's some
overhead additional overhead is not
always good but this is one way of
ensuring an actual um distributed
transaction consistency all right and
this is the second pattern that you can
do with your databases basically use
Saga pattern just keep in mind that it
can get a bit complicated all right so
far we implemented two or looked into
two different patterns the shared
database which is an anti-pattern so
we're not talking about that the
separate databases and Saga pattern now
there's another one which is called a
composition so let's take our diagram
here to make it clean so whenever we're
looking at this one actually so this is
already called a composition so let's
change the title of this that we can
easily do so API composition and you
might be asking what does it actually
mean so we are apparently already using
API composition which is if we have two
Services here payment service and user
service and we need to query data from
or their respective databases it's quite
easy so querying is much easier than
updating what we're going to do is we're
simply going to have an order service
that can query payment service first of
all and then it can query the user
service and then wait for the events to
finish and then simply aggregate all the
results here in the order service okay
so it's called an event or API
composition meaning you can always have
a parent service that tries to aggregate
data from other services and this is
totally fine this is not an anti-pattern all
right so going further what can we
actually do in order to improve our
throughput so to say all right so let's
say we don't have the order service
anymore or let's say we have the order
service but no other services so I'm
going to remove the two other services and as
you can see this is super easy with
eraser.io and let's keep the database
maybe but call it order database and we
are going to connect the order service
to order database like this and we're
going to delete the other connection all
right so like this so what we're having
here is that the order service is going
to talk to the order database but
whenever we're dealing with a lot of
data it's actually a good practice to
separate your view database meaning
whenever you're doing select compared
to your update so what we're going to do
is we're going to create another
database like this so we're going to say
order view database and the second one
is going to be order write or rather not
view but read all right so we have now
two databases and what we can do is
probably delete this and we're going to
say order service reads from order read
database and it's also going to connect
to order write database oops so order
service is connected to the order write
database so whenever we're reading the
information we're going to read it from
here and whenever we write we're going
to write it here and as reads happen
more often than writes we can separately
scale this database many times so we can
literally take this database and scale
it one time and two times all right now
we have more throughput whenever we try
to read the information from our API
which is cool and this is called CQRS
so command query responsibility
segregation because we're dividing the
responsibility between our reads and
writes and now you might ask but how do
we actually ensure consistency well
there are separate ways all right so
whenever you were writing to the order
database what we can do is at the same
time write to all of our read databases
as you can see this is probably not
efficient because we need to make sure
that we write to all the replicas
so this is a bit of a overhead but we
can also kind of make sure that there's
an automatic replication from write to
read database it's going to happen
within a millisecond probably so there's
an automatic script that can do every
time you write to the write database
it's going to update the read databases
as well but as you notice this is also
might not be perfect in some cases why
because if the updates of your database
are not that critical so if the fact
that as soon as the write database is
updated updating your read databases at
the same time within the same second is
not crucial maybe there's something else
that we can use because otherwise
updating all the read databases as soon
as the write database has been updated
also adds some load to your server okay
otherwise this is quite legit but if our
if the time is not so critical for
example imagine an e-commerce website
where users are updating their shopping
cart and it's not so critical to update
the shopping cart of a user as soon as
the user clicks on plus but you know one
or two seconds of delay is totally fine
maybe we shouldn't be updating our read
databases so quickly but what we
can do instead is use an event
sourcing pattern so what is an event
sourcing pattern event sourcing pattern
is especially good in some of the use
cases such as finances I'm not talking
about stock market because stock market
needs realtime updates but finances like
Banks where you want to see all of your
transactions in the past so we're going
to keep literally the log of every
transaction or some other governmental
organizations there can be many use
cases so what is the difference so let's
create a
diagram and this one is going to be an
entity relationship so let's say create
a a user table with three different
users as rows okay I changed it to ID
name and surname and let's generate it
now
and we're going to get a database with
three different uh rows all right so
whenever we update our user let's say we
want to update its surname or name the
update is in place meaning we're losing
the track of the old name but what if
for compliance reasons we wanted to see
their older name as well maybe we are
some kind of a governmental crucial
organization that needs to track all the
changes within the system well that's
why you can use an event sourcing system
all right so event sourcing is something
that's going to basically append events
to the database so one event which
was create user and there's some
Associated data with this another one
was update user so we are already going
to be able to find the information when
the user was created so maybe the name
was Tim and then later it got updated to
John so if we go back into the events
and see that whenever the user was
created it was Tim can easily track this
back so it's not that easy to do it with
traditional databases that's why it's
good to have an event log so how is this
going to look within the architecture of
your database so let's say you have a
service backend service and it's going
to or let's say it's actually a user
service so like this it's going to
update or connect to an event log and
event log can be actually a Kafka stream and
we're going to write all the logs in
here such as oops let's remove that such
as update user and then again update
user then maybe we want to
delete the user so if our
business requires to save all these
changes within our database we can save
them within the Kafka stream and they can
be stored there as long as possible all
right so here's our first connection but
what happens if we want to read that
data all right this is very very tricky
So reading the data will be from an
actual database all right so we're going
to have another DB here and this is
going to be a read DB so what happens is
we can have changes on this kavka stream
and every time a change happens we kind
of stream it to our read database so
whenever there's another service maybe
it can be a user service or some other
service we want to read the data we're
going to read it from a materialized
view so first of all the question is how
do we update the read database at the
same time well you can actually listen
to different events all right not only
in a MongoDB collection you can listen to
different events triggers in MySQL all
right you have triggers meaning it kind
of listens for events but you also have
the same similar thing in Kafka so as
soon as the event happens we can stream
it to a database all right we're going
to look into the code and then actual
implementation in the future video but
for now just keep in mind this concept
okay so we have a read database and the
read database is basically going to take
all the snapshots that happened in
between so it's not going to store all
the events because otherwise it would be
replication but it's going to save all
the snapshots meaning the states of the
application at a certain point of time
okay one more last thing that I want to
show you is actually something that has
to do with documentation and Eraser
actually we have this tab called
document and eraser recently they
actually let me know that we can now use
their AI features fully because it's so
great as you saw we can actually even
generate documentation here so as you
know we talked about microservices and
databases so I can click on generate
outline and I can kind of create
documentation for our video so I'm going
to say create a document that talks
about microservices and databases with their
different best practices and patterns
also include the anti-patterns and let's
see what it generates so the technology
stack I'm going to say SQL I'm going to
say Kafka and so on so let's click
generate what is the desired level of
detail for now it's just an overview
what's the primary goal of the document
to provide a comprehensive
overview what's the intended audience
let's say developers and Architects what
aspects of SQL in relation to microservices
should be covered best practices
basically it's going to ask you
different questions that you can specify
and then it's going to generate this
document so let's click on generate and
now our documentation is going to be
generated basically I even could have
used this to go as an outline for our
video how cool is that so guys give it a
try eraser is amazing and they're
actually my friends that I'm also in
contact with and I've been enjoying this
application since then so this was it if
you enjoyed the video as always
give us a thumbs up and I'm going to see
you guys in the next video goodbye