How to Build a Streaming Database in Three Challenging Steps | Materialize
Summary
TLDR: The video discusses the concept of a streaming database, specifically Materialize, which operates as a database that anticipates and reacts to data changes autonomously. It contrasts traditional databases, which require user commands to perform actions, with streaming databases that can proactively work on behalf of the user. The talk delves into the scalability and cloud-native aspects of streaming databases, highlighting the importance of decoupling layers for performance and consistency. It also introduces the concept of virtual time to coordinate operations across different components of the database system, ensuring accurate and efficient data processing. The demonstration showcases the system's ability to handle concurrent queries, maintain low latency, and scale up or down without interrupting services, providing a seamless experience for users.
Takeaways
- 📚 Materialize is a streaming database that allows for real-time data processing and interaction.
- 🔄 It supports standard SQL operations, enabling users to create tables, insert data, and run queries.
- 🚀 Streaming databases proactively work on behalf of users, anticipating needs and preparing data ahead of time.
- 📊 The concept of 'create view' in SQL is used to establish a long-term relationship with a query, allowing the database to anticipate and prepare for future requests.
- 🔧 Work done in response to 'create view' is an ongoing cost, but it serves as a prepayment for potential future work, improving efficiency.
- 📈 Streaming databases offer a new dimension in deciding when work gets done, either as data is ingested or when a query is made.
- 🌐 Scalable cloud-native streaming databases allow for the addition of more resources without disrupting existing use cases.
- 🔗 Virtual time is a crucial concept that decouples the storage, compute, and adapter layers, allowing for independent scaling and coordination.
- 🔄 The storage layer ensures durability by recording data changes with timestamps, providing a consistent view of updates.
- 🧠 The compute layer processes data flows, maintaining views and indexes for low-latency access to data.
- 🔌 The adapter layer coordinates SQL commands, providing a facade of a single, consistent streaming database experience.
Q & A
What is a streaming database and how does it differ from a traditional database?
-A streaming database is a modification of a traditional database that allows it to take action on its own, anticipating user needs based on data changes. Unlike traditional databases that respond to user commands, streaming databases can proactively perform tasks, such as updating views or indexes, in response to data changes without explicit user queries.
What is the significance of the 'create view' and 'select' commands in SQL in the context of a streaming database?
-In a streaming database, the 'create view' command is used to establish a long-lived relationship with a query, hinting to the database that the query will be accessed repeatedly. This allows the database to anticipate and prepare for these queries, potentially improving performance. The 'select' command, on the other hand, is used to request immediate results from the database, which can be faster due to the pre-emptive work done by 'create view'.
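For concreteness, a minimal sketch of the contrast (table and view names are hypothetical, not from the talk):

```sql
-- A hypothetical events table and a long-lived view over it.
CREATE TABLE events (region text);

CREATE VIEW regional_counts AS
SELECT region, count(*) AS cnt
FROM events
GROUP BY region;
-- Work to keep regional_counts fresh is paid as data are ingested.

SELECT * FROM regional_counts;
-- Work here is small if results were prepared ahead of time.
```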
How does a streaming database optimize the trade-off between work done at data ingestion time and when a user asks a query?
-A streaming database optimizes this trade-off by performing work at data ingestion time, which is an ongoing cost as data changes. This prepayment of work can lead to faster response times when a 'select' query is issued, as the database has already done some of the necessary processing in anticipation of the query.
What is the role of virtual time in a scalable, cloud-native streaming database?
-Virtual time assigns a timestamp to every event that enters the system, both data updates and SQL commands, so that the storage, compute, and adapter layers can be designed, scaled, and executed independently while still agreeing on what happened when. The layers coordinate through these timestamps rather than synchronizing directly, which preserves consistency without forcing any component to wait on another.
How does the storage layer in a streaming database ensure durability?
-The storage layer ensures durability by recording data updates with timestamps and maintaining a log of these changes. It surfaces these updates consistently and durably, allowing for the reconstruction of the data at any point in time, thus providing a reliable foundation for the rest of the system.
What are the challenges faced by the compute layer in a streaming database?
-The compute layer's main challenge is to process and maintain views as data changes, transforming time-varying collections into output while ensuring low latency access to data. It must also manage the trade-off between maintaining indexes for fast access and writing results back to the storage layer for other compute instances to use.
What is the function of the adapter layer in a streaming database?
-The adapter layer is responsible for providing the appearance of a single, consistent streaming database. It sequences SQL commands with timestamps, ensuring that the system behaves as if all events occur in a total order. This layer also manages the consistency of the system, allowing for multiple independent operations to occur without interfering with each other.
How does a streaming database handle scaling and the addition of new use cases?
-A streaming database handles scaling by letting users add compute resources and new use cases without affecting existing ones. This is achieved through the use of virtual time, which decouples the execution of the different layers, allowing them to operate independently and scale as needed without cross-contamination of work.
What are the benefits of using a streaming database for low-latency applications?
-Streaming databases provide real-time data updates and allow for the materialization of views, which can significantly reduce query response times. This makes them ideal for low-latency applications where users require immediate and interactive access to the most current data.
How does a streaming database ensure data consistency across multiple users or teams working independently?
-By using virtual time, a streaming database ensures that all users or teams working with the database see a consistent view of the data, as if all operations were executed simultaneously. This eliminates the need for each team to manage data synchronization manually, simplifying the development and maintenance of complex applications.
Outlines
📚 Introduction to Streaming Databases
The speaker introduces the concept of a streaming database, emphasizing its ability to anticipate user needs and perform actions autonomously. The Materialize database is highlighted as a streaming database that supports standard SQL operations, allowing users to create tables, insert data, and run queries. The key difference is that a streaming database can initiate work based on data changes, rather than waiting for user commands. The speaker also discusses the trade-offs between work done at data ingestion time versus when a query is issued, and the benefits of pre-emptive work in terms of performance and user experience.
🔄 Auction Data Use Case
The speaker presents a hypothetical auction data use case to illustrate the streaming database's capabilities. The scenario involves tracking active bids and outbids in real-time. The speaker explains how Materialize can create views to collect and update this information, providing a live feed of auction dynamics. The use case demonstrates the streaming database's ability to maintain low latency and high interactivity, as well as its potential for scalability and economic efficiency in data processing.
🛠️ Scalable Cloud Native Streaming Databases
The speaker delves into the architecture of a scalable cloud-native streaming database, highlighting the importance of coordination and consistency across multiple layers. The three layers discussed are the storage layer for durability, the compute layer for processing, and the adapter layer for user interaction. The concept of virtual time is introduced as a mechanism to coordinate these layers, allowing for decoupling of execution while maintaining a consistent view of data changes. The speaker emphasizes the benefits of this approach, including the ability to scale and handle multiple use cases without interference.
🔄 Data Flow and Processing
The speaker explains the data flow and processing within the streaming database, focusing on the compute layer's role in transforming time-varying collections. The layer's ability to handle CDC (Change Data Capture) streams and maintain state in memory for rapid access is discussed. The speaker also touches on the potential for writing results back to the storage layer for other compute instances to use, emphasizing the importance of determinism in ensuring correctness and performance.
🔄 Scaling and Consistency
The speaker demonstrates the scaling capabilities of the streaming database by showing how additional compute resources can be added or removed without interrupting the interactive experience. The ability to handle failures and maintain data integrity is highlighted, as well as the system's resilience to changes in compute resources. The speaker also shows how the streaming database ensures consistency across different compute instances, even when they are performing the same task, by using virtual time to synchronize data updates.
Keywords
💡Streaming Database
💡Materialize
💡SQL
💡Distributed Database
💡Create View
💡Virtual Time
💡Decoupling
💡Consistency
💡Scalability
💡Cloud Native
Highlights
A streaming database is a modification to a traditional database that allows it to anticipate and perform work on behalf of the user based on data changes.
In a streaming database, the work can be done either as data is ingested or when a user asks a query, giving users control over when the work occurs.
The concept of 'create view' in SQL hints to the database that the user will frequently access the data, allowing the database to prepare and maintain the data efficiently.
Materialized views in a streaming database can be updated in real-time, providing a more interactive experience for users.
The trade-off in streaming databases is between pre-emptive work done during data ingestion (ongoing cost) and work done in response to queries (potentially faster response times).
A streaming database can materialize a lot of data with 'create view' statements, allowing users to continually monitor data changes without incurring additional costs.
The same SQL can be used for both querying and materializing data in a streaming database, provided by the same underlying execution engine.
A streaming database can maintain data up-to-date in proportion to the rate of change in the data, rather than the number of times it's viewed.
The concept of 'virtual time' is used in streaming databases to coordinate events and ensure consistency across different layers of the system without forcing synchronization.
The storage layer in a streaming database ensures durability by recording data changes with timestamps and maintaining a consistent view of the data.
The compute layer processes data flows, transforming input streams into output streams, and can maintain data in memory for fast access or write back to storage for broader consumption.
The adapter layer sequences SQL commands with timestamps, providing the appearance of a sequential total order and ensuring serializable isolation.
Scalable cloud-native streaming databases allow for the addition of more resources without disrupting existing use cases, providing a purely additive expansion.
Decoupling of layers in a streaming database allows for independent scaling and maintenance, improving performance and simplifying the system's complexity.
Virtual time enables the system to behave like a simulator, computing correct answers at each layer based on the timestamped events.
The storage layer's main challenge is ensuring durability, which involves writing down updates and being able to reproduce them exactly as recorded.
The compute layer's challenge is to process data flows efficiently, maintaining low latency and minimal resource usage while providing deterministic results.
The adapter layer's challenge is to provide consistency across the system by coordinating timestamps for SQL commands, ensuring the system presents as if it's a single, consistent entity.
Transcripts
what's a streaming database? I'm going to give you the very specific meaning that I'm going to use for this talk, the one we have in mind when we think of materialize as a streaming database. First and foremost it is a database; streaming modifies database. Materialize is a SQL database and you do standard SQL database things with it: you connect to it, you say I'd like to create a table, I'm going to insert some stuff into the table, I'm going to run a select statement to read stuff back out of the table, I'll create some views, I'll create some indexes, all those sorts of things.
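As a quick sketch, that first interaction might look like this (all names illustrative):

```sql
-- Standard SQL database things, via psql:
CREATE TABLE t (a int, b text);
INSERT INTO t VALUES (1, 'hello'), (2, 'world');
SELECT * FROM t;

CREATE VIEW v AS SELECT a, count(*) AS n FROM t GROUP BY a;
CREATE INDEX v_by_a ON v (a);
```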
but everything I've just described is very pull oriented: you ask the database to do a thing, it does that thing for you. It responds to your
commands and that's the basis on which
the database takes action
a streaming database is a modification to such a database where there's some sort of framework for the database to leap into action on its own, to do some work on your behalf, anticipating potentially what you need: some way that you're able to communicate to the database, hey, you should go and do some work, potentially when the data
change rather than when I ask about the data, to prepare that work for me. That's the direction we're going to be heading; what's going to be exciting about a streaming database is this new dimension of when does work get done.
so a bit more specifically, with some concrete examples, we'll build this up. The thing that I want you to think about with a streaming database is that users get to trade off when work happens. There are roughly two times that work might happen: either as data are ingested,
either as data are ingested
or when a user asks a query these are
two times that work could go on in the
database and a user gets to control this
they get to decide or guide the database
as to when this this work should occur
and the way to think about this uh that
I think is as easy as these two commands
that are pretty common in SQL people use
them a lot and they guide the databases
to when you would like the work to
happen. So we have create view and select. Probably a lot of you know select: here's a query, I want the answers right now, please give them to me, database; and the database will leap into action and do that work for you.
create view, probably a lot of you know what this is: this is telling a database, hello, I have a query that is so interesting I'm going to give it a name, and I'm going to introduce you, the database, to this query; we're going to have a nice long-lived relationship with this query. And this is a great hint to the database: I'm going to come back and I'm going to ask about this view over and over again, so please look at it, remember it, anticipate potentially that I'm interested in it.
the trade-off here, the intended sort of value proposition if you will, is that work that is done ahead of time, the work done in response to create view, the work done at data ingestion time, is an ongoing cost as the data change. That in some ways is a prepayment
for work that might need to happen when
you issue a select query so you know a
bit more potentially predictable uh done
in anticipation so when a select query
comes around surprisingly potentially
you have a head start on producing the
the results there it takes a lot less
time potentially to get those answers
back that might have otherwise taken I
don't know we'll see examples but you
know whole integer seconds uh to return
or or worse you've got the answer ready
to go and as a consequence you can give
a much more interactive experience
there's also a really interesting dynamic here in terms of, I guess economics is the right way to think about it: rather than redoing the work over and over again in select statements, you're keeping work up to date, so you're doing work in proportion to the rate of change in the data rather than the number of times
you look at it and in the limit this is
really exciting you can materialize a
lot of data with create view statements
and then select from them every second
like just continually watch the data as
it spills out, rather than trying to dial down your use of Snowflake or whatever from hours to days, you go the other direction:
you just look at it every second and uh
it's not any more expensive to do that
so it prompts, hopefully, a bunch of people to think: well, would I do something fascinatingly different if I were actually free to ask these questions more and more often?
it's important in this framework that we're talking about the same SQL in these two fragments. It's really not great if create view can only do some counts or fairly limited types of things;
it's a bit of an uncanny valley otherwise. Either it's the same SQL, so you can both select from it and take what you're going to select from and materialize it, or, if you can't consistently do that, it's just not as useful. It turns out in materialize at least these are actually the exact same SQL; the same underlying execution
queries the things that will produce the
answer for you in the first place but
also then keep it up to date is the same
streaming scale out dataflow
infrastructure
all right I'm going to show a demo
um just to sort of warm you up to what
is a streaming database I'm going to
work through an example of a hypothetical use case, totally made up, but I'm going to show off what it might look like to do all the work at query time, to maintain it all ahead of time, and why it might be good to do a little bit of both, actually.
all right, so in the demo I'm going to log into materialize. It just presents as, well, you just go through psql; it looks like postgres.
there's a source. I'm not going to create the source; I had the source running for a little while and I want all the data that it has produced. This is an auction data generator; it's got two relations of interest to us, auctions and bids. It's produced a lot of them; we'll just do some counts to see how much data is in there. But fundamentally, bids speak about auctions, and auctions have ending times.
let me just pause this for a second so that we can see, well, I guess we'll see the number there: we've got, I don't know, almost half a million auctions. But at any particular moment, if we restrict our attention to the auctions that have not yet expired, it's about a thousand or so; they turn over pretty quickly in this example, just to keep things moving.
and we're going to end up
with I don't know 2 million plus bids or
so
and the intent, the use case we're going to put together, if you read ahead on the right side over here you can sort of see where we're going: we're going to put together a view that collects active bids, so bids associated with auctions that are still in flight, that haven't expired yet. Let me just define a view for that called active bids:
standard SQL, just a join between auctions and bids. And then, pausing here, hopefully that makes sense: while something's still in auction things might change, so this is potentially very interesting; you get up-to-date information moment by moment.
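A sketch of what that view might look like; the schema (ids, end times) is a guess at the demo's data generator, and mz_now() is Materialize's function for the current virtual timestamp:

```sql
CREATE VIEW active_bids AS
SELECT b.id, b.buyer, b.auction_id, b.amount
FROM bids b
JOIN auctions a ON b.auction_id = a.id
WHERE a.end_time > mz_now();  -- only auctions that have not yet expired
```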
what are we going to do with that we're
going to put together a new view on top
of that it's meant to be useful to
people, and I've called it the outbids relation. Anyone who's bid in a currently active auction might be interested to see: what are the other bids that have outbid me? Who am I competing against? Is there just one other person with a relatively similar bid, or are there thousands of people that I'm competing with, and maybe I should just go find something else to bid on because it's not going to happen?
just another join, so this is a self-join between active bids and itself on the auction ID, where we restrict our attention down to pairs of bids where the second bid has a greater value and a different bidder. That's the
sort of interesting stuff that we're
gonna try to show back to people and say
like look look
uh you've been outbid by all of these
people
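Sketched in SQL, with column names carried over from the previous sketch:

```sql
CREATE VIEW outbids AS
SELECT mine.buyer, mine.auction_id,
       other.buyer AS outbid_by, other.amount AS outbid_amount
FROM active_bids AS mine
JOIN active_bids AS other ON mine.auction_id = other.auction_id
WHERE other.amount > mine.amount   -- a strictly greater bid...
  AND other.buyer <> mine.buyer;   -- ...from a different bidder
```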
so the plan is to try to offer this up to folks: a bidder shows up, bidder 500 it's going to be, and repeatedly says, show me what's going on with the auctions I'm participating in, who am I losing to. Sorry, my mastery of Google Slides is limited;
I'll just watch it here for a moment
we're going to do this three different
ways
so materialize has this concept of clusters; think of them maybe as workspaces, environments where you can set up views, or just pull data in through select commands. We're going to start with one that's the pull cluster, and this is just going to be running a select, running the entire join pipeline, a bunch of filtering, pulling in all those millions of rows, and you can see in this case it takes seven seconds or so. Not, I'm going to say, not a great number,
um or at least you know you should hope
for more with a streaming database up to
up to date information so that's really
good but uh I'm going to try to convince
you that you should you should want more
why does it take so long? If you look at the query plan for this, it's big and gross; it does a whole bunch of joins in response to issuing the query. Sort of understandable that a bunch of work just happened
on your behalf and then it takes some
time
so let's do it a different way
let's go to a different cluster, we'll call this one push, and this is going to essentially put together a dataflow materialization for the entirety of outbids. So we're going to create an index here, I'm just calling it outbids by buyer, an index on the entire outbids relation indexed by buyer. This will make it very easy for us to show up and say, for buyer 500, what are the results?
and if we watch, that should happen. So we'll run some queries here. You can see that instead of seven seconds it's taking a few hundred milliseconds, and this is actually an interesting consequence of materialize providing by default strict serializability as its isolation level, which, if you're familiar with databases, is pretty strong; it doesn't get much better than that. We can dial it down, though, in the interest of performance, so we set the transaction isolation down to serializable,
and these numbers will drop down to sub-20-millisecond response times to hand back up-to-date information about who's outbidding buyer 500 in various auctions. Why is it so fast? Here's the query plan: now it's just reading the data out of an index. It's not a very complicated query plan; not much work is happening when you ask the question, we just go and read the data out.
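The push setup amounts to roughly the following (index name illustrative; the isolation setting is a session variable in Materialize):

```sql
CREATE INDEX outbids_by_buyer ON outbids (buyer);

-- Trade strict serializability for latency:
SET transaction_isolation = serializable;

SELECT * FROM outbids WHERE buyer = 500;  -- now an index lookup
```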
well, that sounds great, why don't we just do that, why don't I just materialize everything all the way? The problem, if you have your thinking hat on, is that the size of outbids for each auction is quadratic in the number of
active bids in there every pair of
active bids one of them is visible to
the other one right one is an outbidding
of uh of the other which means if you've
got an auction with a thousand bidders
in it there's a million outbids uh
elements that are being maintained continually. That feels sort of bad if no one's actually looking at them, right? A lot of compute that needs to go on, a bunch of memory you need to keep this stuff all resident, a bit of a waste of resources if it's not the case that all bidders are looking at all moments.
so we'll do this a third way now, a push-pull way over here, where instead of an index on outbids we're going to build two indexes on active bids, two ways, sorry, the video goes faster than I can talk, both by buyer and by auction ID. And when we go and run queries from there, just give it a moment, it's actually still setting up the dataflows, which is why it takes a few hundred milliseconds at first, it's now on the order of 20 milliseconds to get the results back.
now why is that? All right, so a very different thing happens in this query plan, again thinking hat goes on: if someone shows up and says I'm buyer 500, amazing, we have an index on active bids by buyer, we can just leap in and directly get access to their active bids.
each of those active bids names an auction, and we can use, again, the active bids indexed by auction ID to leap directly to the relevant auctions. We're just leaping around in indexes; nothing is scanning data or replaying anything, all the data is right in memory, indexed exactly the way we need, and it's just looking up information. However, the data we're maintaining, active bids, is linear in the number of input bids as opposed to potentially quadratic, so we can do this with a much, much lower resource envelope and just do the expansion of the data when someone actually asks.
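The push-pull variant, sketched:

```sql
-- Maintain the linear-sized active_bids two ways; expand the
-- quadratic outbids only at query time.
CREATE INDEX active_bids_by_buyer   ON active_bids (buyer);
CREATE INDEX active_bids_by_auction ON active_bids (auction_id);

SELECT * FROM outbids WHERE buyer = 500;  -- planned as lookups in both indexes
```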
all right, and as a final trick for streaming databases: these are all select queries, they show you answers in response to you typing things and pressing enter. In materialize, and in good streaming databases generally, you should be able to subscribe to these results. You'll get an answer that comes out, as well as a continually arriving changelog: in this case every second, how have the data changed? And in particular, if nothing has changed, clearly communicate that back out. So this is
data streaming in changing query results
and streaming all the way back out to in
this case buyer 500 who's just curious
to see maybe their UI is curious to see
what are the auctions they need to be
paying attention to and the new buyers
who've entered the uh the auction
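In Materialize this is the SUBSCRIBE command; a sketch, with PROGRESS messages affirming even the seconds in which nothing changed:

```sql
SUBSCRIBE (
  SELECT * FROM outbids WHERE buyer = 500
) WITH (PROGRESS);
```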
so that's the streaming database. That took some amount of time to explain, but hopefully you're thinking, oh, that's fascinating, I'd love to hear more about streaming databases. In particular you might be, there we go, interested in a few more adjectives in front of the streaming database.
so we'll talk a bit now about scalable cloud-native streaming databases, and there's two words I want to call out on here. One is scalable; I just want to take a moment to say what I mean by scalable. For a lot of folks scalable means more computers, more stuff. That's true, more computers will be involved in the story here, but what we're really interested in is more streaming database. So people have potentially had success with their streaming database, like, this was great, I've got more things I'd like to do with it, I would like to have more streaming database,
but in a way that doesn't screw up the streaming database I already have, right? So you want to give people more of the streaming database in a way that's purely additive. If I give you some more computers and then you turn on a second use case and it tanks the first use case, right, if your query latencies went from 20 milliseconds to 100 milliseconds, that's not good. Like, thanks for the additional computers and stuff, but my first use case is now broken because of this sort of cross-contamination of all this work. So when I say scalable I actually really mean figuring out how to give people who want to use streaming databases more and more and more of the streaming database, rather than just computers and bytes and stuff like that.
the other word on here that's important is 'a': you want a streaming database, you don't want 27 of them. Like, if I just told you, here's a different one for each of your use cases, enjoy, that defeats the purpose
of having a database the database exists
so that lots of different use cases
users can use the same information
produce results that can be integrated
together and more value derived from
that a bunch of different independent
silos of streaming databases are not
going to solve the problems that
organizations have
that's where the tension comes in of
course if these things have to work
together but shouldn't interfere with
one another well you know the easy
solutions go out the window and we have
to we have to start to be a bit smarter
we have to start to have multiple
multiple steps
so here's uh three steps
and looking at this uh it's sort of
painfully obvious this is like a fairly
standard picture actually for what a
cloud data warehouse looks like
there's going to be some differences
though and maybe we're going to pay
attention the differences as we go
so three layers here these these will
correspond to the three steps that I
want you to be thinking about at the
bottom there's a storage layer data
arrive into the storage layer the
streaming database records the data
there in particular updates to the data
constant arrival of changes to the data
going on second by second
recorded data are then provided up to a
compute layer and this is where we both
compute and maintain these views as the
data change a little different than a
conventional Data Warehouse in that this
layer is much less ephemeral right a lot
of traditional data warehouse queries come, they get answered and retired, and who knows, you could just go to a totally different instance next time. The value
of this layer is actually in maintaining
these these views and maintaining data
index so that it's readily accessible at
very low latencies
um and sometimes it's the state held at
these these compute nodes that's
important it's
um soft State not hard state so there
isn't a complicated consistency question
going on here but the the value is that
they're holding on to something in index
representation that's ready to go this
very moment
and up top there's what we call the adapter layer, which is what interfaces with SQL. You
know the users show up and say I would
like the experience of interacting with
a single streaming database and it
provides the facade of that it
coordinates all the work underneath and
coordinates the interaction with each of
these SQL connections to make sure that
the system presents as if it's just one
streaming database that magically
everyone seems to be able to use in a
consistent and serializable or strictly
serializable way
we're going to break this down with a
bit more detail now this is the exciting
slide coming up it's not very different
but
ta-da! So this is, we're going to say, really the punchline: virtual time is a concept that got introduced back in the 1980s, in a 1985 paper by Jefferson.
many of you may not know what it is, and that's totally fine; not all of us were doing computer science back in the 80s. You may know of it as event time; it's very analogous to event time, let me explain.
so back in '85 it was actually proposed as a database concurrency control mechanism, and the idea was that all interactions with events outside the system, so in this case data updates and SQL commands, should have virtual timestamps attached to them.
and then the system's job, in essence, is to behave like a simulator: to say, well, let's imagine that those events did happen at exactly those moments, what should the right answer be? If we had some updates at various times, and then someone showed up and said I'd like to see the answer to my query at a very particular time, there's a right answer. Can the system go and compute this now, as quickly as possible, and just produce that right answer for us?
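Materialize exposes a flavor of this through AS OF, which pins a query to a particular virtual timestamp; a sketch, with an illustrative literal:

```sql
-- "What was the right answer at virtual time T?"
SELECT count(*) FROM active_bids AS OF 1700000000000;
```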
and what this does, which is really nice, is that taking this approach of, great, let's timestamp everything, let's compute correct answers at each of these layers, provides a really nice boundary between each of the layers. Storage assigns virtual times to all these updates, and then its job is to surface them upwards and consistently, repeatedly, durably say: here are timestamped updates for you, the compute layer and potentially the adapter layer.
the compute layer consumes timestamped updates, and its job is to pretend as if all of these views updated instantaneously: to produce output updates whose timestamps correspond exactly to the timestamps on the inputs to the views, maintained in perfect correspondence, again as if in virtual time they update instantaneously.
and then finally the adapter layer sequences, essentially puts timestamps onto, all of the SQL commands that come through, to provide the apparent experience of a sequential total order; the timestamps are the things that provide the total order. And with a little bit of finesse, the adapter layer is actually able to provide even more advanced properties than just a total order on all of these events.
all right why do this I mean there's
lots of different ways that you could go
and try to build concurrency control
throughout a complicated system the
decoupling is really helpful. The fact that we're able to design and implement each of these three layers differently is really valuable; the techniques that you use in each of these layers are very different, and the basis on which each of them is correct or performant is very different, and allowing each of them to be implemented separately is just very powerful. You can take experts in each of
these domains throw them at the problem
and the storage folks will use very
different techniques to ensure
durability than the compute folks will
use to get performance than the adapter
folks will use to get the appearance of
of consistency
what's especially nice about virtual
times though is that while they
coordinate the results while they make
sure that things happen at the same time
they don't actually synchronize them so
they don't force the execution to occur
at exactly the same moments the
execution of all these components are
decoupled they happen as fast as they
can happen which is great you know
everyone the three different sources can
all operate at whatever rate they're
getting data no one has to wait for
anyone else
the compute layer if several views are
being updated concurrently they update
as fast as they can update and
it's the availability of data of updates
at these virtual times that move the
system forward if you're up at the
adapter layer and you ask about
something that's good to go you get your
answer right away and if you ask about a
view that is on fire because uh you did
some horrible cross join you don't get
an answer but you haven't interfered
with the rest of the system either
so it's a really powerful decoupler both
from an architecture point of view and
also from a performance point of view
at the same time at the end of the day
you get the experience of a single
timeline where all of the events in the
system happen on that timeline and as if
there was just one thread running the
entire system for you but it's all of
course it's all big lie but but that's
the experience that you get
so really, we'll have this slide up a few more times because it's such an important slide. I'm going to talk a little bit about each of the layers, to give you a taste of what goes on in each of them.
they're complicated; I said there would be three challenging steps, and they are indeed challenging, and smart people are hard at work at each of these layers as we speak. But I hope to convince you that they're tractable: you can work on them and continue to make progress on them, fortunately without inheriting any of the complexity of the other layers.
so the storage layer the main challenge
here is durability folks show up with
updates from outside the system and the
storage layer needs to make sure to
write them down and be able to reproduce
them exactly as recorded uh arbitrarily
far into the future. Data arrive as change data capture streams; you can think of this as potentially a Postgres replication log, or Kafka representations. Basically folks show up and explain how their data changed, in a way that is meant to be unambiguous, and we write it down as we see it. When
we do that though we have some very
opinionated things that we do we put a
timestamp on each of these updates the
timestamps have to be carefully chosen
in that if data, for example, updated in a transaction, all those updates should seem to have exactly the same virtual timestamp. Very important: atomicity basically says I should not be able to see half of a transaction, so they happen at exactly the same virtual timestamp. There are some other ordering constraints too.
but moreover, we've drawn a picture down here of one of these timelines. A time-varying collection is the name that we use for them; you can see a bunch of update events that go on on this timeline, and time goes from left to right.
but there's some other important moments here. The boundaries of this green box are important; you're also able to get two frontiers out of this information. There's a write frontier, which is sort of the leading edge of where we're currently writing data down. Updates at the write frontier are not yet known; the data may still change at that time and beyond, in the future. So it's important, for example if you're thinking about what the right answer is, to know that I couldn't quite tell you just yet if it's at that time or above.
similarly there's a read frontier that's sort of trailing behind, collecting up all these updates, essentially rolling them up into a maintained snapshot, for reasons of bounded memory footprints and bounded storage and efficiency. But this is preventing access into the far history of a collection; it's sort of saying, any time inside the green region, you can roll up all of the updates there and get the current contents of your particular collection.
with both those constraints in mind, durability is what the folks here work on, using exciting tools like cockroachdb and S3 and various other considerations to make sure the data are durable.
then there are the other layers. The compute layer, this is the one that I'm personally most familiar with, is essentially the dataflow processing layer. It takes time-varying collections as inputs, does various bits of transformation to them, and produces the corresponding time-varying collections as output. So CDC streams in, CDC streams out, that exactly correspond to the input, as if all of these updates happened instantaneously.
I've just got some pictures here; there's a rich variety of dataflow operators that allow us to turn any SQL query into a nice streaming dataflow. Having done that, there's a few different things you can do with the results. I've put here in purple: you could have some of these things be an index; this is state that stays in memory in the compute layer and is randomly accessible from the adapter layer up above. It's what gives you the millisecond-timescale access to data. But you could also take
the results and write them back out as a materialized view; this create materialized view is a command that we'll see in just a moment. It puts the data back in the storage plane, which allows people in other compute instances to pull the data in. So for
example if you're the first step in the
data pipeline you're cleaning things up
you're doing some denormalization who
knows what
you might have a whole bunch of downstream consumers that you don't want in your address space; like, they're doing crazy stuff, and you don't want them to crash or slow down your work, which is important to lots of people.
so it makes sense to write the results of your views out to the storage layer, and have other people bring that data back into their compute environments, isolated away from yours, wherever that isolation would be relevant. But other than that, the charge here is: go as fast as possible, be as efficient as possible, use as little memory as possible, go, go fast.
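A sketch of the write-back path described above (view name and body are hypothetical):

```sql
-- Results land in the storage layer, so other clusters can read them
-- without sharing our compute resources.
CREATE MATERIALIZED VIEW auction_bid_counts AS
SELECT auction_id, count(*) AS n
FROM active_bids
GROUP BY auction_id;
```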
durability is not a problem here, because of the deterministic nature of all these operators; determinism is what provides the correctness guarantees that we need here.
all right last layer and this is a bit
of a thinky one uh the adapter layer and
fundamentally what happens here is those
SQL commands that came in need
timestamps right we've actually got 100
people connected all saying select this
insert that create whatever
we need to put timestamps on these
commands and the timestamps that we
choose for those commands are what
determine the apparent behavior of the
entire system
the people observing the system are the
folks connected through these through
these sessions
so if you just put timestamps on commands, actually a good thing happens already: you get serializable isolation. The timestamps themselves, and our correct behavior in response to them, give an order on all of the events. It's not so bad, though it turns out, if you're familiar, serializable isolation has a bunch of really weird properties that make you say no, that can't be true;
the order doesn't need to correspond to real time, and you would really like that in a lot of cases. You'd really like that if you go and insert some data into a table and then read from the table, you should definitely be seeing what you just inserted. And to get this property you need something slightly stronger than serializability, something that's defined as strict serializability: the timestamps need to increase as real time moves forward.
so there's a few constraints that this layer has in terms of installing timestamps. Strict serializability is one of them; there are some others though, just to throw out. If your query, let's say, involves a table from the storage layer and an index from the compute layer, there are these read and write frontiers there, and for example, for a timestamp to be valid it needs to be at least as big as all of the read frontiers; if that's not the case we're not sure we'll actually give you correct data out. So you have to pick a sufficiently large timestamp to get correct results. But you might also want to be prompt, right, you want the result to come back right away, so you want to pick a timestamp that is not greater than or equal to any of these write frontiers; if you do that, you're in a position to get your result back pretty much right away, and certainly you don't have to wait for things to happen.
you might want it to be at the right end of this timeline to get the freshest possible results, but that's a little bit
in tension with strict serializability
which says things only have to go to the
right so if you go all the way to the
right you've ruled out a bunch of
options for the next person who asks a
question it's it's you know complicated
and challenging there's some fun
trade-offs here to provide a great
experience to everyone that uh totally
conceals the fact that that actually uh
there's 57 different computers doing
different things all at once
so the challenge here is consistency: providing the appearance of consistency across a whole bunch of fundamentally independent things, coordinated only through virtual time.
all right, this slide, like I said, here it is again, super important. Conventional three layers of a cloud data warehouse, I think, but coupled through virtual time we're able to have things continually change, continually update, without losing our heads, without utterly losing track of what's going on in the system.
so what I'd like to do now is show off a little bit of the scaling aspect of the scalable cloud-native streaming database. There are a few things I said you're going to want out of such a thing, and I'm going to show off three of them that I think are pretty cool, that you wouldn't get if you just put materialize onto one computer and then sort of said, good luck, enjoy.
so we're going to look at the same setup as before. The left pane over here is going to be that same stream of updates on the push-pull cluster, showing us all the changes to outbids for buyer 500. It's really not going to change throughout the course of the demo, so the only thing to notice on the left side of the screen is that stuff's happening; as long as it continues to happen you should be pretty happy. And I'm going to do most of the action over here on the right side.
the first thing we're going to do is just try to make a mess out of everything. So we're going to head on over to the pull cluster; you might remember that was the place that queries go slow, right, because we have no built indexes, we're just re-running stuff from scratch. And we're just going to start to ask a bunch of questions over there, queries that take many integer seconds to answer, and observe that that does not
contaminate the interactive experience
that's going on over on the left-hand
side the left-hand side still continues
to tick at the same rate even though the
right hand side has decided I have 10
seconds worth of work to go and actually
do
you can isolate the performance, maintain low-latency properties on one, and allow analytic work to occur concurrently without, you know, jumping the queue or getting in the way of the interactive experiences. Yeah, demonstration number one, ta-da. All right,
great. No, you know, we'll do a pause in the question section later. The other thing that we might do, and this requires just a bit of explanation: these clusters, these workspaces, are backed by compute resources of course, but they can be backed by multiple replicas of the same work, essentially different compute resources.
so what we're going to do here is take the push-pull cluster, the thing that's powering the left-hand side, and give it another replica. We're going to scale it up: it has a small replica now, and we're going to give it a medium replica. We've gone and created that, and it just stitches itself in; there's a notice that comes back, but otherwise it just stitches itself in and is supporting the first replica, which, sorry, I apologize, I should have warned you to watch for this, we're about to drop. So there's only one replica running now, it's a medium replica, and there was no interruption over on the left-hand side. Yeah, so we just rescaled from a small cluster to a medium cluster while maintaining this interactive experience with no interruption.
right, this is one of the really powerful parts of compute being decoupled but still coordinated through virtual time. These two compute instances were doing exactly the same thing, so we could deduplicate the results, and as long as one of them is live and running we're getting fresh results spilling out, even as we fiddle around with which one is backing it, what's the scale, all those sorts of things.
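The rescaling dance, roughly, in Materialize's replica syntax (cluster and replica names hypothetical):

```sql
-- Add a bigger replica alongside the existing one...
CREATE CLUSTER REPLICA pushpull.r_medium SIZE = 'medium';
-- ...and once it is hydrated, retire the small one. Subscribers never
-- notice: both replicas compute identical results, deduplicated via
-- virtual time.
DROP CLUSTER REPLICA pushpull.r_small;
```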
now what we're about to do is drop the medium replica, so there are no replicas behind this cluster anymore; it's not running, it's stopped. And you might say, oh, that sucks, but actually a good thing is happening over there, I mean, as good as it can be when you don't have any computers to do your work.
but the stream that's coming out has paused. It knows that it does not know the answer to how the data change in the next second, so it's holding its fire. It's not going to tell you, yeah, I don't know, probably nothing happened; no, it's just waiting until it actually gets the information that it needs,
we've reprovisioned a a small replica
behind it now just to cut back over and
as soon as that comes online and gets
hydrated it'll pick up exactly where it
left off and actually backfill all of
the changes in the sort of 10 seconds or
so
uh for which we had no compute resources
so even in the case of failure or de-provisioning, you actually get the right information out. It's a behavioral hiccup, like, there's some time that passes, but you don't get wrong data coming back out of this. So you can, without having to change your application, just rely on getting correct information slower or faster based on the machines you have.
all right, that was trick number two.
trick number three that we're going to do is hop onto both the push cluster and the pull cluster, so the two non-push-pull clusters, and create materialized views that select out of the outbids relation. So we've got output one and output two: output one is from the pull cluster, output two is from the push cluster. These are the same view, computed two different ways on two different computers. The pull way has no indexes, so it's pulling in all the work and doing it over and over; the push one is reading out of an index and writing it back into storage.
and what I've said, hopefully you can believe me, is that these are the same. So what we're going to do now is actually subscribe to the query that says show me what's in the first one but not the second one. That's what's happening right now: you are seeing all of those changes, which is to say none, which isn't necessarily convincing until I go and put those progress statements back in.
so materialize not only doesn't show you anything, it confirms second by second that there is nothing in there. There is no millisecond at which there's a single record in one of these relations and not in the other one, and that will be true forever; barring bugs on our part, these things are exactly equal.
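The consistency check, sketched (view names hypothetical):

```sql
-- Stream the difference between the two materializations; with PROGRESS,
-- every tick affirms that the difference is empty.
SUBSCRIBE (
  SELECT * FROM outbids_pull
  EXCEPT ALL
  SELECT * FROM outbids_push
) WITH (PROGRESS);
```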
virtual time is keeping them perfectly coupled. If these were two separate use
cases for example like you had one team
that was figuring out who are the
auction winners how much money should we
collect from them and another team that
was producing whose auctions have closed
we should give them some money those
numbers will always match perfectly the
uh the imbalance between those dollar
amounts will be zero for all time
and this is really powerful for teams
building low latency applications uh two
different people doing two different
things and then bringing their data
together and having the guarantee that
it is as if their work was executed
instantaneously simultaneously against
the same data
there's no more trying to figure out
whose data are slightly out of date and
and patching that up with various
bandages
great
it's the slide again, the last time for this slide, just to recap, because like I said it's super important. The three steps that we have here are storage, compute, and adapter, and they're coupled through the use of virtual time. This is, again, a really powerful, really not fundamentally complicated coordination primitive that allows us to reduce the complexity of the overall system into steps, challenging for sure, but ones that are tractable by folks who are expert in each of these domains, and that allows you to give this experience that, I don't know, increasingly I think is really cool:
just all of the really impressive stuff that you can do, spinning up whole bunches of compute, teams of people who don't even know about each other's existence using the results of each other's data products through the storage layer, and not having to reason about all the complicated stuff you'd normally reason about in a system with so many moving parts. Yeah, this is what I have for you, so we're going to end here.