DynamoDB: Under the hood, managing throughput, advanced design patterns | Jason Hunter | AWS Events
Summary
TLDR: This talk delves into the inner workings of Amazon DynamoDB, exploring its partitioning scheme, data replication across availability zones, and the mechanics of read and write operations. It discusses consistency models, Global Secondary Indexes, and throughput capacity, offering practical use cases and design patterns for optimizing NoSQL database performance. The talk also covers advanced features like DAX, transactions, parallel scans, and the new Standard-Infrequent Access table class for cost-effective data storage.
Takeaways
- 🔍 **Deep Dive into DynamoDB**: The talk focuses on understanding the inner workings of Amazon DynamoDB, exploring how it delivers value and operates under the hood.
- 📦 **Data Partitioning**: DynamoDB uses a partitioning system where data is distributed across multiple partitions based on the hash of the partition key, ensuring efficient data retrieval and storage.
- 🔑 **Consistent Hashing**: The hashing of partition keys determines the allocation of items to specific partitions, which is crucial for data distribution and load balancing.
- 💾 **Physical Storage**: Behind the logical partitions, DynamoDB replicates data across multiple servers and availability zones to ensure high availability and durability.
- 🌐 **DynamoDB's Backend Infrastructure**: The service utilizes a large number of request routers and storage nodes that communicate via heartbeats to maintain data consistency and handle leader election.
- 🔍 **GetItem Operations**: Retrieving items can be done through either strong consistency, which always goes to the leader node, or eventual consistency, which can use any available node.
- 🔄 **Global Secondary Indexes (GSIs)**: GSIs are implemented as separate tables that require their own provisioned capacity and can be used to efficiently query data based on non-primary key attributes.
- ⚡ **Performance Considerations**: The design of the table schema, such as the use of sort keys and GSIs, can significantly impact performance and cost, especially for large-scale databases.
- 💡 **Optimizing Data Access**: The script suggests using hierarchical sort keys and sparse GSIs to optimize access patterns and reduce costs, catering to different query requirements.
- 🛠️ **Handling High Traffic**: For extremely high-traffic scenarios, such as during Amazon Prime Day, DynamoDB Accelerator (DAX) can be used to cache and serve popular items quickly.
- 🔄 **Auto Scaling and Partition Splitting**: DynamoDB automatically scales and splits partitions to handle increased traffic and data size, ensuring consistent performance without manual intervention.
Q & A
What is the main focus of the second part of the DynamoDB talk?
-The main focus of the second part of the DynamoDB talk is to explore how DynamoDB operates under the hood, providing insights into the database's internal workings and explaining the value delivered in the first part.
How does DynamoDB handle partitioning of data?
-DynamoDB uses a partition key to hash data items and distribute them across different physical partitions. Each partition is responsible for a section of the key space and can store an arbitrary number of items.
What is the purpose of hashing in DynamoDB's partitioning scheme?
-Hashing is used to determine the partition in which a data item should be stored. It takes the partition key, runs it through a mathematical process, and produces a fixed-length output that falls within a specific partition's range.
How does DynamoDB ensure high availability and durability of data?
-DynamoDB replicates each partition across multiple availability zones and hosts. This replication ensures that if an issue occurs with an availability zone or a host, there are still working copies of the data, making DynamoDB resilient to failures.
What is the difference between strong consistency and eventual consistency in DynamoDB reads?
-Strong consistency ensures that the most recent data is always read by directing the request to the leader node. Eventual consistency, on the other hand, can direct the request to any of the nodes, which may result in slightly stale data but at a lower cost.
How are Global Secondary Indexes (GSIs) implemented in DynamoDB?
-GSIs are implemented as a separate table that is automatically maintained by DynamoDB. A log propagator moves data from the base table to the GSI, and GSIs have their own provisioned capacity or can be in on-demand mode.
What are Read Capacity Units (RCUs) and Write Capacity Units (WCUs) in DynamoDB?
-RCUs and WCUs are the units of measure for the throughput of DynamoDB tables. RCUs represent the capacity to read data, while WCUs represent the capacity to write data. They can be provisioned or on-demand, with different pricing and performance implications.
What is the maximum size of an item that can be stored in a DynamoDB table?
-The maximum size of an item in a DynamoDB table is 400 KB. If more data needs to be stored, it should be split into multiple items.
How does DynamoDB handle scaling and partition splitting?
-DynamoDB scales by partitioning. Each physical partition supports up to 1,000 WCUs or 3,000 RCUs per second. If a partition's size grows beyond 10 gigabytes or experiences high traffic, it will automatically split to maintain performance.
What is the role of the burst bucket in DynamoDB's provisioned capacity mode?
-The burst bucket in DynamoDB's provisioned capacity mode allows for temporary spikes in traffic above the provisioned capacity by storing unused capacity tokens from previous time intervals, enabling higher throughput for short durations without incurring additional costs.
What is the significance of having a good dispersion of partition keys in DynamoDB?
-A good dispersion of partition keys ensures that the read and write load is evenly distributed across different storage nodes, preventing any single node from becoming a bottleneck and maintaining overall performance and scalability.
How can DynamoDB's auto scaling feature be beneficial for handling varying workloads?
-Auto scaling in DynamoDB adjusts the provisioned capacity up and down based on actual traffic, within specified minimum and maximum limits. This ensures that the performance is maintained without the need for manual intervention and can handle varying workloads efficiently.
What is the purpose of using a hierarchical sort key in DynamoDB?
-A hierarchical sort key in DynamoDB allows for querying data at different granularities. It enables efficient retrieval of related items, such as all offices in a specific country or city, by structuring the sort key to reflect the hierarchy of the data.
Can you provide an example of how to optimize data storage for a shopping cart in DynamoDB?
-Instead of storing the entire shopping cart as a single large item, it's more optimal to store each attribute, such as cart items, address, and order history, as separate items within the same partition key. This approach allows for more efficient retrieval, update, and cost management.
What is a sparse Global Secondary Index (GSI) in DynamoDB?
-A sparse GSI in DynamoDB is a design pattern where the GSI is not populated with every item from the base table. It's used for specific access patterns where only a subset of items are relevant, reducing the cost and storage overhead of the GSI.
How can DynamoDB Streams be utilized to update real-time aggregations?
-DynamoDB Streams can trigger a Lambda function upon a mutation event, such as an insert or update. The Lambda can then perform actions like incrementing a count attribute, providing real-time aggregations that can be queried efficiently.
What is the significance of the Time-To-Live (TTL) feature in DynamoDB?
-The TTL feature in DynamoDB allows items to be automatically deleted after a specified duration, without incurring any write capacity unit (WCU) charges. This is useful for data that is meant to be temporary, such as session data, reducing storage costs for expired data.
What is the DynamoDB Standard-Infrequent Access (Standard-IA) table class, and how does it differ from the standard table class?
-The Standard-IA table class is a cost-effective option for storing data that is infrequently accessed. It offers 60% lower storage costs compared to the standard table class but with roughly 25% higher throughput (read and write) costs. There is no performance trade-off, making it a suitable choice for large tables where storage cost reduction is beneficial.
How can the new DynamoDB feature introduced in November 2021 help in managing costs for large tables?
-The introduction of the Standard-Infrequent Access (Standard-IA) table class in November 2021 allows for significant cost savings on storage for large tables that are infrequently accessed. It provides a 60% reduction in storage costs while maintaining the same performance levels as the standard table class.
Outlines
🚀 Introduction to DynamoDB's Internals
The speaker welcomes the audience to the second part of the DynamoDB discussion, expressing enthusiasm for exploring the database's operations. The session aims to clarify how the value discussed in the first part is delivered. The talk will include a deep dive into DynamoDB's functioning, addressing partitioning, hashing of partition keys, and data storage in physical partitions. It will also cover server replication for data resilience, request routing, and the application of learned techniques to solve complex problems presented as puzzlers.
🔑 Understanding DynamoDB's Partitioning and Replication
This paragraph delves into the mechanics of DynamoDB's partitioning system, explaining how data is distributed across different partitions based on hashed partition keys. It details the replication process across multiple availability zones for fault tolerance and data integrity. The explanation includes how write and read operations are handled, the concept of leaders and followers in data replication, and the use of the Paxos algorithm for leader election. The paragraph also discusses different read consistency models—strong consistency and eventual consistency—and their impact on performance and cost.
📚 Deep Dive into Storage Nodes and Request Handling
The speaker provides an in-depth look at the backend processes of DynamoDB, focusing on storage nodes and how they communicate through heartbeats to maintain health checks and leader election. The paragraph explains the write process to the leader node and the subsequent propagation to follower nodes. It also covers the retrieval of items through the load balancer and request router, highlighting the distinction between strong and eventual consistency reads. The implementation of Global Secondary Indexes (GSIs) as separate tables with their own capacity is also discussed, along with the impact of GSIs on throughput during data operations.
🔄 Throughput Management and Auto-Scaling in DynamoDB
This section discusses the management of throughput in DynamoDB, explaining the difference between on-demand and provisioned capacity modes. It describes how Read Capacity Units (RCUs) and Write Capacity Units (WCUs) function in both modes, and the importance of a good partition key dispersion to avoid throttling. The paragraph also covers the auto-scaling feature, which adjusts capacity based on live traffic, and the use of burst buckets to handle temporary traffic spikes without exceeding provisioned capacity.
🌐 Auto-Administration and Partition Splitting in DynamoDB
The speaker explains the auto-administration feature of DynamoDB, which automatically splits partitions when they receive excessive traffic or grow beyond a certain size. This process is designed to maintain performance and is carried out without user intervention. The paragraph also introduces contributor insights, a tool that helps monitor hot keys and throttled items, providing visibility into the database's partition access patterns and load distribution.
🛍️ Optimizing Data Storage for a Shopping Cart Use Case
The paragraph presents a real-life example of a shopping cart stored in DynamoDB, discussing the limitations of storing a shopping cart as a single data item due to the 400 KB size limit and the inability to perform index-driven retrievals on nested attributes. It suggests an optimized approach where different attributes of a user are stored as separate items within the same collection, each with a unique sort key. This method improves performance, reduces costs, and allows for more granular updates and retrievals.
🔍 Efficient Data Retrieval Using Indexes and GSIs
The speaker discusses the importance of using indexes and Global Secondary Indexes (GSIs) for efficient data retrieval. The paragraph explains how to optimize queries by using the sort key and how GSIs can be used to retrieve data based on different attributes. It also covers the concept of a sparse GSI, which is useful for specific access patterns where the attribute may not always be present, and the benefits of hierarchical sort keys for querying different granularities of data.
🎉 Handling High Traffic Scenarios with Partitioning and DAX
This section addresses strategies for handling high traffic and large-scale data scenarios in DynamoDB. It introduces the concept of partitioning to distribute write load across multiple partitions and the use of DynamoDB Accelerator (DAX) for managing read-heavy loads with in-memory caching. The paragraph also discusses the use of multiple GSIs to handle large volumes of events and the importance of designing data models that avoid hot partition keys.
🔐 Transactions, Parallel Scans, and Stream Processing
The paragraph covers advanced features of DynamoDB, including transactions that ensure atomic operations across multiple items, parallel scans that allow for efficient and rapid scanning of the table using multiple threads, and stream processing that enables real-time monitoring of table changes. It also discusses the use of Lambda functions to perform actions based on stream events, such as updating aggregations or triggering notifications.
⏱️ Time-To-Live and Cost-Effective Data Management
The speaker introduces the Time-To-Live (TTL) feature, which allows for the automatic deletion of expired items in the background without incurring write costs. This feature is beneficial for data that has a natural expiration, such as session data. The paragraph also highlights the cost savings achieved by using TTL, especially for large tables where delete operations were a significant portion of the workload.
🌐 Introducing DynamoDB Standard-Infrequent Access
The final paragraph introduces a new feature, DynamoDB Standard-Infrequent Access (Standard-IA), designed to reduce storage costs for data that is infrequently accessed. It explains the cost benefits of Standard-IA, which offers 60% lower storage costs compared to the standard table class, without compromising performance or availability. The speaker also suggests strategies for determining whether Standard-IA is suitable for a particular use case, such as analyzing cost structures and considering table partitioning based on access patterns.
Keywords
💡DynamoDB
💡Partition Key
💡Hash Function
💡Global Secondary Indexes (GSIs)
💡Provisioned Capacity
💡On-Demand Capacity
💡Read Capacity Units (RCUs)
💡Write Capacity Units (WCUs)
💡Auto Scaling
💡Strong Consistency
💡Eventual Consistency
💡DynamoDB Accelerator (DAX)
💡Transactions
💡Parallel Scan
💡Time-To-Live (TTL)
💡Standard-Infrequent Access (Standard-IA)
Highlights
Introduction to the second part of a DynamoDB deep dive, focusing on its internal operations and practical use case scenarios.
Explanation of how DynamoDB uses partition keys and hash functions to distribute data across physical partitions for efficient storage and retrieval.
The concept of data replication across multiple availability zones in DynamoDB for high availability and fault tolerance.
DynamoDB's architecture involving load balancers, request routers, and storage nodes, and how they interact to handle data requests.
The difference between strong consistency and eventual consistency in DynamoDB reads, and their impact on performance and cost.
How Global Secondary Indexes (GSIs) are implemented as separate tables with their own provisioned capacity in DynamoDB.
The distinction between provisioned and on-demand capacity modes in DynamoDB, and their implications for performance scaling and cost management.
The importance of partition key design for achieving even data distribution and avoiding hotspots in DynamoDB.
Auto-scaling feature in DynamoDB that adjusts provisioned capacity based on real-time traffic without manual intervention.
DynamoDB's burst bucket mechanism that allows temporary spikes in traffic above the provisioned capacity by reusing unused capacity.
The use of transactions in DynamoDB to ensure atomicity across multiple items, with the consideration of increased write costs.
Optimization strategies for handling high-traffic scenarios, such as distributing writes across multiple partitions to avoid throttling.
The introduction of DynamoDB Accelerator (DAX) as an in-memory cache to offload traffic for read-heavy workloads and improve performance.
Practical examples of DynamoDB use cases, including shopping cart storage optimization and device log management.
Design patterns for efficient querying in DynamoDB, such as using hierarchical sort keys and sparse Global Secondary Indexes.
Advanced topics like DynamoDB Streams for real-time event watching and triggering actions, and Time-To-Live (TTL) feature for automatic data expiration.
Announcement and explanation of the new DynamoDB Standard-Infrequent Access (IA) table class for cost-effective storage of less frequently accessed data.
Recommendations for choosing between Standard and Standard-IA table classes based on storage and throughput costs analysis.
Transcripts
(bright music)
- Hello, and welcome to part two of our DynamoDB talk.
Thanks for joining me back.
Hopefully, you either liked the first part,
and wanna join in, or you're an expert,
who already knew everything in the first part
and wanna dig a little deeper with me.
In this talk, we're going to look under the hood
of how DynamoDB actually operates,
and get a sense of when I explain the value
in the first part, how is that value actually delivered?
So this is my favorite kind of stuff,
how does a database really work underneath,
so hope you enjoy the ride with me.
We'll end with some kind of puzzlers.
Here's a use case.
What would you do about that use case?
We get to apply what we learned,
and learn some new techniques
for being able to handle problems
that maybe aren't obvious at the beginning
what the solution is.
All right, let's dig under the hood.
In this case, we have three items
that we need to put inside of a DynamoDB table.
They have an OrderId.
That's their partition key.
They have other attributes,
which are in a sense just payload
when it comes to the partitioning aspects of DynamoDB.
On the right-hand side, we have a DynamoDB table,
a logical representation of our orders.
Inside of a table, there are partitions.
I'll oftentimes call 'em physical partitions.
Sometimes, you hear virtual partitions.
This is the actual bucket in which the data goes,
and each one is responsible for a section of the key space.
So Partition A here is responsible for 00 to 55.
So it's kind of like housing addresses between 00
and 55 go to this partition.
Partition B is 55 to AA,
and Partition C is AA to FF.
In reality, these are longer than two digits,
but let's, for simplicity, we'll just do two digits here.
So now let's look at an OrderId 1.
Where should we put it?
What you do is you take the partition key,
which is sometimes you see it called a hash key,
and this is why.
We hash it.
Remember a hash, you take an arbitrary input value.
You run it through a mathematical process,
and you get out a fixed length string, or value,
MD5, for example, SHA-256, things like that.
So you run a hash on this,
and if we assume in our hash function a 1 input produces
a 7B output every time, then we can say, all right,
well, where should this item go?
Well, it should go into Partition B,
because 7B is between 55 and AA in hex.
All right, OrderId 2, we're going to hash the 2.
Produces a 48.
It goes into Partition A,
because 48 is between 00 and 55.
And then we'll hash 3, which equals CD in this case,
and it goes into Partition C.
Okay, so this is how it works.
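To make the routing idea concrete, here is a minimal sketch, not DynamoDB's actual internals, of hashing a partition key and bucketing the digest into the two-hex-digit key-space ranges from the slide. The partition ranges are taken from the slide; a real MD5 won't reproduce the made-up 7B/48/CD values shown there, but the bucketing idea is the same.

```python
import hashlib

# Two-hex-digit key-space ranges from the slide:
# Partition A owns 00-55, B owns 55-AA, C owns AA-FF.
PARTITIONS = [("A", 0x00, 0x55), ("B", 0x55, 0xAA), ("C", 0xAA, 0x100)]

def assign_partition(partition_key: str) -> str:
    """Hash the partition key and map the first byte of the digest onto
    a key-space range (a toy stand-in for DynamoDB's partition map)."""
    first_byte = hashlib.md5(partition_key.encode("utf-8")).digest()[0]
    for name, low, high in PARTITIONS:
        if low <= first_byte < high:
            return name
    return PARTITIONS[-1][0]

for order_id in ["1", "2", "3"]:
    print(order_id, "->", assign_partition(order_id))
```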
Now, each physical partition can store
an arbitrary number of items.
It doesn't just store, for example,
the first partition there doesn't only store OrderId 2.
It stores any OrderId that hashes between 00 and 55.
And as is common with NoSQL databases,
all the data items together,
all those attributes are stored contiguously,
so that they're easy to retrieve
as a singular unit, right?
So it's not scattering all the attributes out
into different locations that has to be joined back.
The items are stored contiguously.
So that's the logical representation.
Let's look a little bit more at what is really going on,
because on the backend there are servers.
And so, when we hash 1,
it goes to Partition B.
How's Partition B really represented?
We have multiple servers within the AWS region
that this table is a part of,
and each partition is going to be on a particular host,
and there are multiple availability zones,
three availability zones responsible for every table.
So in this case, Order 1 is hashed to Partition B,
which can go to three different availability zones,
each Partition B.
So it goes on Host 2, Host 5, and Host 8
replicated across all three.
And why do we replicate?
We replicate because should there be an issue
with an availability zone, or with Host 2, or 5, or 8,
we still have two working copies,
and you don't notice as the user of DynamoDB.
It's resilient to the failures.
So now, let's step back to about a 10,000 foot,
and understand what's really going on,
because behind the scenes, there aren't just three servers
responsible in each availability zone.
There's actually thousands.
So you as the user on the left make a request.
It goes to a load balancer.
That gets sent to a request router,
and there are thousands of request routers in each AZ.
That request router knows for each request coming in,
because of the partition key, sometimes called a hash key,
that it goes to a certain storage node.
There are thousands of storage nodes on the right,
and it will route that request to the right storage node,
and because there are three storage nodes involved
in every particular item,
it will go to the other storage nodes
in the other availability zones.
The storage nodes communicate with each other,
doing a heartbeat that goes on continuously,
saying, "Hey, are you still there?
"Are you still healthy?"
And also, "Who's the leader?"
Because when you do a write, you always write to the leader,
and the other nodes are followers.
When you do a request, do an insert,
it goes to the leader, the leader writes it,
it propagates that to the other storage nodes.
When one of them replies that they've got it,
then you can get the response back acknowledged.
You don't have to wait for the third one.
That's a performance improvement,
but you always have it written to at least two locations,
and if one of the machines should die,
we use a Paxos algorithm to elect a new leader,
or actually, periodically we elect new leaders.
Makes sense?
So how do you get an item?
When you do a GetItem, it goes, again,
through the load balancer to the request router,
and that request router knows which storage nodes
are responsible for that particular item
by looking at the partition key and knowing the metadata.
All right, so now, there are two different ways
to do a request.
Remember, you could do strong consistency,
or eventual consistency.
So here's what's really going on when you do that.
When you say strong consistency,
it always goes to the leader,
because the leader always has the very latest data.
If you say eventual consistency,
then it can pick any of the nodes to go to.
Statistically, 2/3 of the time,
you will be getting the very latest version of that item,
but potentially, you could be going to that third node
that is a millisecond behind the other two,
and that's why it's eventually consistent.
It is going to get that data item,
but you might ask so quickly that the data hadn't propagated
from the leader to the follower, and that's the,
I'd say downside of eventually consistent,
but the advantage is that it's half the cost, right?
You use half as many RCUs for an eventually consistent read
as a strongly consistent read.
What's nice is when you do a strongly consistent read,
you're not polling numerous machines.
You're just going to the leader,
so it's still very fast when you do
a strongly consistent read.
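In application code this choice surfaces as a single flag on the read call. A minimal boto3 sketch, with a hypothetical table and key name:

```python
import boto3

orders = boto3.resource("dynamodb").Table("Orders")  # hypothetical table

# Strongly consistent read: routed to the leader, always the latest data,
# costs a full RCU per 4 KB.
latest = orders.get_item(Key={"OrderId": "1"}, ConsistentRead=True)

# Eventually consistent read (the default): any replica may answer,
# may be a moment behind the leader, costs half the RCUs.
maybe_stale = orders.get_item(Key={"OrderId": "1"}, ConsistentRead=False)
```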
We talked in the first section about GSIs,
Global Secondary Indexes.
So now, let's look at how they're actually implemented.
They're essentially a second table
that is automatically maintained by the DynamoDB system
with a log propagator that knows how to move data
from the base table into the GSI table.
And this is why GSIs actually have
their own provisioned capacity,
and you can put them in on-demand mode,
and things like that.
So when you do an insert of some data item
in the base table, the log propagator says,
"All right, is that relevant for my GSI?"
Not every item will have the attributes
that are used in the GSI;
it might not have the GSI's partition key and sort key.
And if so, nothing gets propagated,
which is an interesting way for the GSI
to have less throughput load than the base table,
because if it's a sparse GSI,
and I'll give you some examples later,
maybe the partition key needed for the GSI doesn't exist
in the attribute, in the item as an attribute,
and therefore doesn't propagate.
There are also cases, though,
where the GSI might get more throughput,
because if you in the base table update the attribute
that is used as the partition key in the GSI,
then you have to delete from one partition,
and insert into another partition.
So this is an amplification effect,
where if you're continuously changing the partition key
used by the GSI in the base table,
even if it's not the same partition key
as in the base table, you're going to get a delete
and an insert in the GSI.
So be aware that the GSI in some cases will have
less throughput than the base table.
In some cases, it might have more throughput.
That's why it's independently provisioned.
Let's look next at the throughput.
I covered this in part one.
Remember provisioned capacity and on-demand capacity?
Let's look a little bit about what's going on
on the backend when you make these choices.
So throughput, it comes in Read Capacity Units
and Write Capacity Units,
shorthand, RCU and WCU.
In on-demand mode, a Read Capacity Unit is a 4 K request.
If you do a request for 7 K to retrieve, that's two RCUs.
If you do 100 K, that's 25 RCUs.
And Write Capacity Units, if you write a 10 KB item,
that's 10 Write Capacity Units in on-demand.
When you're in provisioned mode, it's a little different.
They're called the same thing, but the units are different.
It's that amount per second.
So if you provision at 1,000 RCUs per second,
what you're basically saying is every second,
I expect to be consuming about 1,000 RCUs,
and over time, so long as you stay around there,
that's what the table's going to give you,
similar with Write Capacity Units.
So it can be confusing for people to think,
"Well, isn't WCU a singular request, or is it a rate?"
And it depends on your mode.
In on-demand, it's a singular request.
With provisioned, it's a rate,
and they are independent.
So a table can have really high read, low write,
or the reverse, or high, or low of both.
Now, remember, eventually consistent reads are
at half the rate.
So if you do a 100 KB item retrieval eventually consistent,
then instead of 25 RCUs, it would be half that.
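The arithmetic the talk walks through can be written down directly. A small sketch of the per-request cost rules as described here: reads round up to 4 KB units, writes to 1 KB units, and eventual consistency halves the read cost.

```python
import math

def read_units(item_size_kb: float, eventually_consistent: bool = False) -> float:
    """One read unit covers up to 4 KB strongly consistent;
    eventual consistency halves the cost."""
    units = math.ceil(item_size_kb / 4)
    return units / 2 if eventually_consistent else units

def write_units(item_size_kb: float) -> int:
    """One write unit covers up to 1 KB written."""
    return math.ceil(item_size_kb)

print(read_units(7))           # 2 RCUs
print(read_units(100))         # 25 RCUs
print(read_units(100, True))   # 12.5 RCUs, eventually consistent
print(write_units(10))         # 10 WCUs
```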
There are some other things to think about.
The max size of an item in a table is 400 KB.
That is somewhat intentional,
to keep you using DynamoDB
the way it was intended to be used.
It's not meant to store megabytes of data
as a singular item.
So 400 KB real limit.
If you wanna store more than that,
you wanna split your item into multiple pieces,
and store them as separate items.
And on the backend, scaling is achieved with partitioning.
We just reviewed the physical partitioning.
Each one of those virtual partitions,
or physical partitions, supports 1,000 WCUs per second,
or 3,000 RCUs per second, or a mix.
So if you're using 500 WCUs,
then you'd get 1,500 RCUs, for example.
It's one, or the other, or a mix of the two,
and you know, when you start getting hot,
it starts splitting for you.
So if your traffic grows, it can split.
When your size gets large,
specifically when a partition under whatever partition keys
sort into that physical partition,
when it gets bigger than 10 gigabytes,
it'll split that into two different partitions.
You get to pick your capacity mode
when you create the table.
On-demand, really straightforward.
With provisioned, I would recommend you turn on auto-scaling
in almost all cases.
What this says is instead of saying
I want a particular level steady,
you say please notice the live traffic
that is going on with the table right now,
and adjust up and down accordingly.
So you can specify the minimum, don't go below this,
the maximum, don't go above this, and a target utilization.
And so, this is a representation
of how the provisioned capacity is enforcing those limits.
So step one, you make a request over the network
to a load balancer, which gets sent to a request router.
The request router authenticates.
That's number three on the slide.
It looks at the partition metadata system to understand
the topology of where that partition key
would actually be assigned to a storage node.
But then before sending to the storage node,
it keeps track of how much traffic you've been sending,
and it has basically a bucket of tokens.
So every time you do a 20 RCU request, or a 20 WCU request,
it subtracts that from the table bucket,
and that bucket fills every second
with whatever your provisioned capacity is.
So if you've provisioned 1,000, it fills with 1,000 tokens,
and every second you can be pulling out.
So you pull out 20, you pull out 50, you pull out one,
you pull out 50, you pull out 100,
so long as you have tokens, it lets the traffic go forward
onto the storage node.
All right, what happens if you ask
for more than the provisioned capacity?
So you've provisioned 1,000, and for a little while,
you'd like to drive 2,000.
That's actually okay for short durations,
thanks to a burst bucket.
The burst bucket takes any tokens that weren't used
during a particular second, and keeps those for you,
kind of like roll-over minutes on old style cellphone plans,
and it is able to keep track of all the unused capacity
over the last five minutes.
So if you are running at exactly your provisioned capacity,
the burst bucket would be empty,
but typically, you're running below capacity,
especially if you've set utilization at 70%.
Then you've got a certain amount of unused tokens
from the table bucket that can go into the burst bucket.
That's what allows temporary spikes
above the provisioned capacity, and that's just fine.
It's still allowed forward to the storage node.
There is another set of buckets.
Each particular storage node has its own bucket,
which is enforcing that 3,000 RCU, 1,000 WCU limit.
All right?
So just because the table has capacity,
if you've only used one partition key for the table,
and all traffic is going to the same item,
it's still going to be throttled on the storage node side,
because of the buckets that are associated
with that particular storage node.
So that's a good explanation, I think,
of why you want to have a pretty good dispersion
of partition keys.
You want to have a good number of storage nodes involved
with your table, not too few.
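As a rough mental model only, here is a toy sketch of the table bucket plus burst bucket behavior described above. The real admission control is far more involved; the names and the five-minute burst window follow the talk, everything else is illustrative.

```python
class ProvisionedBucket:
    """Toy token bucket: refills at the provisioned rate each second and
    banks unused tokens (up to about five minutes' worth) in a burst bucket."""

    def __init__(self, provisioned_per_second: int):
        self.rate = provisioned_per_second
        self.table_tokens = provisioned_per_second
        self.burst_tokens = 0
        self.burst_cap = provisioned_per_second * 300  # ~5 minutes of unused capacity

    def tick(self):
        # Unused tokens from the last interval roll over into the burst bucket.
        self.burst_tokens = min(self.burst_cap, self.burst_tokens + self.table_tokens)
        self.table_tokens = self.rate

    def try_consume(self, units: int) -> bool:
        if units <= self.table_tokens:
            self.table_tokens -= units
            return True
        if units <= self.table_tokens + self.burst_tokens:
            self.burst_tokens -= units - self.table_tokens
            self.table_tokens = 0
            return True
        return False  # throttled: request exceeds table + burst capacity
```

A separate bucket of this kind sits on each storage node, which is why a single hot partition key can still throttle even when the table as a whole has capacity.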
In on-demand capacity mode,
this is what you set up when you say,
"You know, my traffic isn't really smooth.
"I might go down to zero for a while.
"I might spike up rapidly.
"I might be doing a bulk load without any advance notice,
"and I just want the table to not enforce
"the table bucket and the burst bucket."
And so, with on-demand, those buckets don't exist.
There's no limit at that level saying that,
"Oh, this request shouldn't go forward."
There is, however, still the limit
on each particular storage node.
So that's, again, why it's important to have
good dispersion of your partition keys.
But in this case, it's really easy, set and forget.
You just say this table,
I don't know what the traffic will be.
If a request comes in,
I want you to do your best to process it,
and so long as you have good partition key dispersion,
it'll work great.
When you create a new table in DynamoDB
and put it into on-demand mode,
it has to have a certain expectation of how much capacity
it should provision on the backend,
and this is what it provisions.
It expects up to 4,000 write request units,
which would be 4,000 writes per second,
or up to 12,000 read request units,
which, if they're eventually consistent,
would be 24,000 eventually consistent reads per second,
or any combination of the two.
That's just the default throughput.
If you send traffic, and your traffic is greater than that,
it will adjust the table up,
and in fact, there's no maximum throughput.
You can send, you know, and people do,
a million write request units per second,
multi-million request units per second.
If you remember in the first section, amazon.com was sending
over 80 million requests per second to DynamoDB,
and an on-demand table could definitely support that.
It won't support it out-of-the-box necessarily,
but there is a way to do it.
If you provision the table to a certain level above this,
it will have the table able to support
that level of traffic, and then if you switch to on-demand,
it will keep that ability.
If you don't do that, it will always keep track
of what the request rate is,
and here we see a synthetic amount of requests per second,
and each time there's a new peak,
the table on the backend will auto-adjust.
So, oh, there's a lot of traffic.
Let me grow the table to twice that.
Oh, new peak, let me grow the table to twice that.
Auto admin on the backend is noticing the traffic,
and continuously adjusting the table's capabilities,
so that you're not close to hitting the max
that the table can support,
always up to twice the previous peak.
And in fact, these on-demand tables don't scale down.
So if you hit a peak,
and then you go silent over the weekend,
when you come back Monday morning with a lot of traffic,
it'll still be capable of supporting that level of traffic
that had been the previous peak.
And in fact, that red line should probably be
double the green line, right?
Because we just learned twice the previous peak.
So which one do you wanna pick?
You know, you can actually pick one, and then change,
but in advance, provisioned mode is when you have
steady workloads, you have an expectation
that I have kind of a sine wave throughout the day.
It doesn't grow from 10 to 1,000 in a second.
I have events maybe where I know the traffic is coming,
and I wanna provision at that certain level.
On-demand is when you don't know in advance what's coming.
Maybe you open up a new region,
and you're not sure what the traffic will be.
Go in on-demand mode, DynamoDB will handle the traffic
that's coming at it.
When it's idle, if it's completely idle,
there is no throughput cost.
It's great when you don't know what's coming.
You just set it and forget it.
What we see on this slide is a certain partition,
our favorite Partition A,
and A is interesting, because it has two different items
that are getting quite a lot of traffic,
item foo and bar.
If it gets too much traffic, either write traffic,
or read traffic, we know that there's a certain amount
of capacity that a certain partition can support.
So what should we do?
What should auto admin do?
What auto admin does is it splits the partition,
sometimes in half, not always perfectly in half,
and will try to separate the two hot items,
so that item foo is in a different partition than item bar,
and this happens in the background.
You don't see it.
You're not aware of it,
except it's happening to make sure
that your DynamoDB performance is always maintained.
And in fact, that foo item is still really red.
And so, eventually, auto admin might separate foo
into its own partition.
One partition might, at the end of the day,
be one singular item if it gets quite a lot of traffic.
This is always done automatically in the background,
detecting what's the traffic, should there be more splits?
If you're curious what's going on with hotkeys,
you can turn on contributor insights,
just a push button in the console,
and you get a screen like this.
It shows on the left the partition keys
that are being most accessed.
Top left is partition keys most accessed.
Top right is the partition key, sort key combination
that is most accessed.
On the bottom left, you can see the most throttled items,
so this is the partition keys that are most often throttled.
Throttling is what you would get
when a partition is getting more traffic
than its limits of 3,000 RCUs, 1,000 WCUs.
But you see how the throttling very quickly goes away?
That's because auto admin's doing the job on the backend,
saying, "Whoa, this is very hot, let me split it.
"This is still hot, let me split it,"
and then very quickly, no more throttling.
And on the bottom right you see
the most throttled particular items
of the partition key, sort key combination.
So this one item was throttled,
and then it was no more throttling.
Why? Because it was the foo in this case.
It was the one that was a particularly hot individual item,
probably went into its own physical partition.
At this point, you've seen what's going on underneath
the covers of DynamoDB, and let's put it into practice.
Let's look at some real-life situations,
and think how could this be improved?
What's the best approach?
Here's a real-life example.
This is a shopping cart that was being stored
inside DynamoDB, and it was stored as a singular data item,
basically just a JSON document held in DynamoDB.
And that's okay, you can do that,
but it's not the most optimal way to store the data,
and why not?
Well, you only get 400 kilobytes per item.
And so, putting everything into the same item
starts to eventually grow beyond the 400 KB limit.
You can't query on any of these nested attributes.
You can filter on them,
but you can't put them into the sort key
to be able to do index-driven retrievals.
So it starts to be just a very simple key value store,
as opposed to the full power of DynamoDB,
where you can query and retrieve based on attributes
that are held.
But I think maybe even more so is that anytime you change
an item, anytime you update an item,
the cost of that is the larger of the before, or after size.
So if you update a 300 KB item and make it a 350 KB item,
you're charged, in this case, 350 write units,
Write Capacity Units.
That's quite a lot if you're only updating
a fraction of the item.
So what's a better way to do this?
Look at this. This is a screenshot from NoSQL Workbench.
I have as a partition key the certain user ID,
and I've separated out all the different attributes
about this user into different individual items
in the same item collection of the user,
and in the sort key, the category of the data is prepended
to the particular unique value.
So I have their address.
I have items in their cart.
I have their order history,
their profile name, the store that they prefer.
What's nice about this, instead of one singular large item,
I can retrieve all their cart items
by saying give me this user ID,
all the sort key that begins with Cart#.
I can retrieve their address with a singular
give me the sort key that starts Address#.
I can update the address, and only update the address item,
not update their cart, and their order history,
and their profile name, and all the rest.
So this is better performance,
better selectivity, lower cost.
That's why you see this whole general pattern
of a customer oftentimes as a partition key
and different aspects of that customer
that are remembered are in the sort key,
oftentimes prepended with the category of the data.
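With that layout, fetching or updating one slice of the customer becomes a narrow, index-driven operation. A boto3 sketch, where the table name, key names, and sort-key prefixes are assumptions mirroring the NoSQL Workbench screenshot:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("Customers")  # hypothetical table

# All cart items for one user: same partition key, sort key prefix "Cart#".
cart = table.query(
    KeyConditionExpression=Key("UserId").eq("user#123")
    & Key("SK").begins_with("Cart#")
)

# Update only the address item; the cart, order history, and profile items
# are untouched, so you pay WCUs only for the small address item.
table.update_item(
    Key={"UserId": "user#123", "SK": "Address#Home"},
    UpdateExpression="SET City = :c",
    ExpressionAttributeValues={":c": "Seattle"},
)
```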
All right, here, we have a device log.
The DeviceID is the partition key.
And for the sort key, we have the Date
of some sort of warning, or state adjustment here.
This is pretty good.
We like the high cardinality of the DeviceID.
We can scale to an unlimited number of devices.
All right, so what do we wanna retrieve?
Let's say we want to retrieve all warnings.
Here's pseudocode of what it would look like.
Select everything from DeviceLog,
WHERE the device is equal to whatever,
and FILTER ON warning, WARNING1.
The bottom left is the command line invocation
that makes this happen.
So you say aws dynamodb query.
You specify the table-name as DeviceLog,
and you do a key-condition-expression,
where the dID is equal to dID.
Now, this is the first time I've shown this hash and colon.
It's for substitution,
because the IDs might have special characters.
You do the substitution where you say #dID,
and expression-attribute-names #dID is equal to DeviceID.
:dID is substituted as d#12345
under expression-attribute-values, okay?
So we're saying DeviceID is equal to d#12345.
filter-expression #s, which is State,
is equal to :s, which is WARNING1.
And no-scan-index-forward is a way of saying scan backwards.
So this is a way to specify
that what the pseudocode SQL on the top left says,
and say go to this, find me where the dID is equal
to DeviceID of 12345,
and the State is WARNING1.
All right, sounds pretty good.
I'm filtering for the state, however,
which means that when I do the request, on the backend,
the state retrieval is not index optimized.
Is that an issue?
Well, a little bit,
because if there's 99% of the time normal,
1% of the time warning, then I'm going to be filtering away
99% of the items I'd like to look at,
which will mean that I'm basically expending 100X more RCUs
than I would otherwise want to do.
So what I'd really like is for that state
to be index-driven.
If we wanted index-driven, we'd put it into the sort key,
the common pattern.
So what we've done here is instead of a sort key
of simply the date,
the sort key is the State# and then the Date.
Now, we've adjusted the command line
to say look at the DeviceLog backwards.
Find me under key expression
where the DeviceID is 12345,
and the State#Date, as the name, begins with WARNING1.
This is completely index optimized.
If there was one warning out of 10,000 normal,
it will retrieve just that singular item,
because it's index optimized to find sort keys
that begin with a certain value.
So this is a good design pattern,
because it reduces cost and improves performance.
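Here is roughly what the index-optimized version of that query looks like in boto3 rather than the CLI. The table and attribute names follow the slide; the exact prefix format is an assumption.

```python
import boto3
from boto3.dynamodb.conditions import Key

device_log = boto3.resource("dynamodb").Table("DeviceLog")

# The sort key is now "State#Date", so warnings are found index-first
# instead of being filtered out after the read.
warnings = device_log.query(
    KeyConditionExpression=Key("DeviceID").eq("d#12345")
    & Key("State#Date").begins_with("WARNING1#"),
    ScanIndexForward=False,  # newest first, like --no-scan-index-forward
)
```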
All right, let's continue with the puzzlers.
We have the same data here, now with the new sort key.
What we want is to fetch all the device logs
for a given operator between two dates.
And we see that the Operator is an attribute.
We have Liz and Sue.
Hmm.
Well, what's the right way to, for a given table,
where you want to retrieve the data using
different attributes than the base table has
in the partition key and the sort key?
We use a GSI.
All right, so all device logs for given operator
between two dates.
We want to have a GSI,
where Operator would be the partition key,
and the Date would be the sort key,
so that I can retrieve by dates.
So here's my partition key.
Here's my GSI representation inside NoSQL Workbench,
and the sort key here is Date,
and this is very straightforward then to find
all device logs for a certain operator between two dates,
because I can do a sort key index-driven between two dates.
All right, good use of a GSI.
Here's what the CLI would look like.
You notice that we specify the table-name, DeviceLog,
and the index-name as well, because the index is associated
with the table, but has its own name.
So indexes in DynamoDB aren't implicitly used.
They're explicitly used.
You say, "Get this from this index."
And so, against that index,
you look for the key-condition-expression
where the Operator is Liz,
and the Date is between March 20th and 25th.
Index-driven, very efficient.
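The same query in boto3; the GSI name and the date format are assumptions:

```python
import boto3
from boto3.dynamodb.conditions import Key

device_log = boto3.resource("dynamodb").Table("DeviceLog")

# Indexes in DynamoDB are queried explicitly, by name.
logs = device_log.query(
    IndexName="Operator-Date-index",  # hypothetical GSI name
    KeyConditionExpression=Key("Operator").eq("Liz")
    & Key("Date").between("2022-03-20", "2022-03-25"),
)
```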
All right, new access pattern,
fetch all escalated device logs for a given supervisor.
All right, what we see here, supervisor is Sarah.
But only sometimes.
It's not always present.
Not every one is escalated.
Does that matter?
Can I still do a GSI if the attributes don't exist?
Yes, you can, and in fact, it's called a sparse GSI.
It's a great pattern when you have
a needle in a haystack type query.
It's, when you create a sparse GSI,
if the item in the base table doesn't match the needs
for that GSI, it's not propagated.
Therefore, there's no cost of the reads,
or writes for the GSI.
Well, no writes, no write cost for the GSI,
and there's no storage cost on the GSI.
So this is a very low cost GSI to maintain,
and yet it's really much better than doing,
say, a scan against the whole table to try to find
this needle in the haystack type query.
So good thing to remember, sparse GSI.
There's no formal definition of a sparse GSI.
It's just something we think of.
It's a GSI where I don't expect most items to match,
and then we can consider it a sparse GSI
just as nomenclature to share with each other,
the general design pattern.
And then here's an example from Amazon's Phone Tool.
Phone Tool is how you can look up another employee,
and see where their office is, what their time zone is,
what their phone number is, and things like that,
and it shows a common pattern of making
a hierarchical sort key,
where the partition key in this case is Country,
and the sort key is the state hash the airport code hash
the office location.
So this would be how you'd look up metadata about
every office inside of Amazon.
And why would you do this?
Because it's very easy, therefore, to do a query that says,
"Give me all the offices in the USA,"
'cause you just do a query where you don't specify
a sort key condition, and you specify the Country.
You can say, "Give me all the New York locations,"
by saying the sort key has to start with NY#,
all the New York City locations,
where the sort key starts with NY#NYC#,
or just a particular item by adding in the airport code
that we use for, for example, JFK14,
which is a certain office building.
So this hierarchical data in the sort key
is a common pattern.
It lets you query different granularities.
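Each granularity is just a shorter or longer sort-key prefix. A boto3 sketch against a hypothetical Offices table, with assumed attribute names:

```python
import boto3
from boto3.dynamodb.conditions import Key

offices = boto3.resource("dynamodb").Table("Offices")  # hypothetical
pk = Key("Country").eq("USA")

# Whole country: no sort-key condition at all.
all_usa = offices.query(KeyConditionExpression=pk)

# Narrow by prefix: state, then city, then a single building.
new_york = offices.query(KeyConditionExpression=pk & Key("Location").begins_with("NY#"))
nyc_only = offices.query(KeyConditionExpression=pk & Key("Location").begins_with("NY#NYC#"))
jfk14 = offices.query(KeyConditionExpression=pk & Key("Location").begins_with("NY#NYC#JFK14"))
```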
Would this be a good design for public-facing access?
Actually not, and the reason is that
that partition key of Country,
I expect that is going to be a very hot partition key,
and you generally want to create designs
where you don't have too much going on
under a certain partition key
relative to the whole table access.
So you might wanna do something where the Country is
USA-NY, USA-WA as the partition key.
The pros and cons of that,
you have more dispersion of your partition key,
but if you did want every USA office,
then you would have to do a BatchGetItem, or a BatchQuery,
which goes and retrieves all the items
for every particular state,
and aggregate it yourself in your code.
All right, at this point,
we've covered sort of the common scenarios.
Now, let's get into some
that are a little bit more interesting and unique,
a little bit more likely to be stumpers,
even for people who have used DynamoDB for a little while.
The first one I'll call "American Idol."
In this case, we want to keep track of votes
for certain numbers of candidates.
The natural way to represent that is as shown here
in the screenshot that we have a certain candidate
with a certain number of votes,
and the top two candidates get most of the votes.
So Candidate A gets a bunch of votes,
Candidate B gets a bunch of votes.
The challenge with this design is
that there's that single item limit of 1,000 WCUs,
which means we would get
1,000 writes per second per candidate.
I think during "American Idol" final week
we were getting more votes than that per second.
So what do you do about it?
Hmm.
I'll let you think for a second.
All right, that's long enough.
Let's assume that we have 20,000 votes going to Candidate A
and 10,000 votes going to Candidate B.
Maybe that's a good hint.
What will we do?
What we will do is create different partitions
for each candidate, so that instead of having
one giant ballot box per candidate,
we will separate, and give maybe 20,
or 50 ballot boxes per candidate,
and people can go to that particular ballot box randomly,
and place their vote in that ballot box.
Maybe another analogy is a buffet line at a conference.
I hate it when they have one buffet line.
Let's do 10 buffet lines, much shorter lines.
I get my food faster.
That's kind of what we're doing here.
So when you update an item,
you just randomly pick the partition,
Candidate A 1, 2, 3, 4,
and you add your vote there.
Each one of those can get 1,000 WCUs,
and therefore with 20, you could have 20,000 WCUs.
With 100, you would have 100,000 WCUs, right?
So one vote goes here.
A vote goes there.
Vote, vote, vote, vote, vote, vote.
At some point, maybe periodically, every second, at the end,
you do an aggregation.
So you do a parallel collection.
You go to each of these partitions.
You go a GetItem on the votes,
and you insert as Candidate A total
the total number of votes.
That's what your user interface,
that's what your application goes against.
So you keep track of how many people
went through the buffet line by keeping a separate counter
per buffet line, and keeping one vote in one,
one metric in one location for the number of people
that have enjoyed the beautiful food at the conference,
and similar with Candidate B.
Okay, so in the application, just retrieve from the total.
Your application doesn't have to be aware of the fact
that there are actually multiple ballot boxes
on the backend.
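A minimal sketch of that write-sharding idea; the shard count, table name, and attribute names are all assumptions:

```python
import random
import boto3

SHARDS = 20  # number of "ballot boxes" per candidate
votes = boto3.resource("dynamodb").Table("Votes")  # hypothetical table

def cast_vote(candidate: str):
    """Spread writes across N shard items so no single item
    hits the per-partition WCU ceiling."""
    shard = random.randrange(SHARDS)
    votes.update_item(
        Key={"Candidate": f"{candidate}-{shard}"},
        UpdateExpression="ADD VoteCount :one",
        ExpressionAttributeValues={":one": 1},
    )

def total_votes(candidate: str) -> int:
    """Periodic aggregation: read every shard and sum. The result could be
    written back as a single '<candidate>-total' item for the application."""
    total = 0
    for shard in range(SHARDS):
        resp = votes.get_item(Key={"Candidate": f"{candidate}-{shard}"})
        total += resp.get("Item", {}).get("VoteCount", 0)
    return total
```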
Here's a puzzler.
All right, I've got different events.
Each event is the partition key
and each event gets a timestamp.
These are possibly events that need to be handled,
and I need to query for find me all the events
that are still in the database
that are older than four hours.
All right, well, this partition key is good,
because it's great dispersion.
Every event gets its own partition key.
That's fabulous.
The timestamp is underneath, good,
but I need to fetch all the events older than four hours.
So what do I do with that?
Well, I think your first instinct is create a GSI,
and that's good, but the natural way to create the GSI
would be to have some partition key,
which aggregates all of the timestamps underneath it,
to make it easy to fetch events
that are older than four hours, and that'll work.
It'll only work up to a certain limit,
because, again, the GSI has the same 1,000 WCU limits.
And so, that would limit how many events.
Now, you know, a small scale database,
in fact, even a normal scale database,
this wouldn't be an issue.
But we're thinking massive scale here.
So how do you do this when the number of events
is potentially, you know, multi-million, or a billion?
All right, what do you do with that?
What you do is the same idea I just discussed,
where you have a GSI,
and you put all the timestamps
under that same partition key,
except in this case, we create more than one partition key,
as many as you need, N many.
So GSI-PartitionKey 1, 2, and 3.
What you're doing here is creating three, or N, or 50,
or 100, however many different separate GSIs,
separate item collections,
each of which will keep its own sorted list of timestamps,
and then you can go to each of those individual items,
and get the ones that are older than four hours,
and aggregate it together, right?
So you insert the event with a timestamp,
and all you have to do is pick a random number
between zero and N, and say,
"That's gonna be my GSI-PartitionKey."
And then this is the GSI point of view.
You will have a bunch of different partitions,
as many as you picked,
each of which has its own set of timestamps,
and then you can do a direct query against that, and say,
"Find me those which have a timestamp
"older than four hours."
Do that in parallel.
When it finds one, it alerts, and says,
"I found it, here you go," okay?
So again, this is what you do when you have
scale above typical.
It's a good technique.
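A sketch of the same scatter/gather shape applied to the GSI; N, the table and index names, and the timestamp format are assumptions:

```python
import random
import time
import boto3
from boto3.dynamodb.conditions import Key

N = 10  # number of GSI partition keys to scatter across
events = boto3.resource("dynamodb").Table("Events")  # hypothetical table

def put_event(event_id: str):
    # Scatter: each event lands in one of N GSI item collections.
    events.put_item(Item={
        "EventId": event_id,
        "Timestamp": int(time.time()),
        "GsiPk": f"bucket-{random.randrange(N)}",
    })

def events_older_than(hours: int):
    # Gather: query each bucket's sorted timestamps and combine the results.
    cutoff = int(time.time()) - hours * 3600
    old = []
    for i in range(N):
        resp = events.query(
            IndexName="GsiPk-Timestamp-index",  # hypothetical GSI
            KeyConditionExpression=Key("GsiPk").eq(f"bucket-{i}")
            & Key("Timestamp").lt(cutoff),
        )
        old.extend(resp["Items"])
    return old
```

In practice the N queries would be issued in parallel, exactly as the talk describes.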
Those were all write examples,
where we were thinking about WCU limits.
Here's a read example.
This is, say, Amazon Prime Day,
and different items have different popularity
in the product catalog.
The Instant Pot, that is the hit of the year,
70,000 requests per second,
need to retrieve the metadata about the Instant Pot.
"Childhood's End," science fiction on the right,
a lot less traffic.
What do you do if you need to get the same item
70,000 times per second?
Is your instinct to do a read shard?
It's almost a trick question.
That's not what you have to do.
Here's a typical access pattern of almost anything.
What is this, the Pareto curve?
Some items get most of the traffic.
DAX is the solution.
Remember DAX, the DynamoDB accelerator?
This is an in-memory cache that can go in front of DynamoDB,
and when you do a GetItem against it,
it can do millions per second.
So it's the same API as DynamoDB itself.
If you do a GetItem, and it doesn't have it in its cache,
it will retrieve it from DynamoDB, and send it on,
and remember it for the next request.
If you do a PutItem, you write an item,
it'll write it, and remember for the next request as well.
So it's a great way to offload traffic for hot items,
and you just put as many servers.
This is not serverless.
This is, there are servers.
You can pick as many servers up to 10 read replicas
in front of DynamoDB.
So that's something to think about when you think,
"You know what, this item,
"even if it's in its own physical partition,
"I'm going to need to do more than 3,000 RCUs
"against the same individual item."
In-memory cache to the rescue.
All right, that's enough puzzlers.
Let me give you a couple just more advanced topics
I wanted to cover today.
So transactions we did mention before.
The way that you implement that is you issue
a TransactWriteItems command.
It batches up a set of items,
and says, "I want these done atomically."
And you can, you know, do a put in there,
an update in there, a delete in there,
any number of those together
up to 25 items within a transaction.
And you can do it across tables,
and you can do it in some conditional checks as well
that would say if this condition fails,
I want this transaction to end.
Why not do everything with a TransactWriteItems?
Well, it consumes 2X the WCU
as if it weren't a TransactWriteItems, okay?
So that's basically because there's a two-phase commit,
so it's literally twice as much work,
therefore twice as much cost.
So this is good when you have to make changes across items,
or things like that.
You say this needs to happen either together,
or not at all.
Side note, individual item updates are always transactional.
This is only when you need to do updates across items.
So if you have multiple different clients competing
to update the same item, it is always serialized.
This is for cross-item transactions.
And here's an example.
This is a game state, where Hammer57 is going to buy
health with coins, and we need to make sure
that if we deduct the coins, we add the health.
You don't wanna do this halfway.
So this is what would be sent over the wire essentially.
You say update the Gamers table,
where the GamerID is Hammer57.
I wanna set the health to 100.
All right, but also update the Gamers table
where the partition key is Hammer57,
and I want to say set the coins equal to coins minus 400.
So subtract 400 coins, add 100 health,
or set 100 health, not add,
but there is a condition expression.
Only do it if they have enough coins,
which is a good way, good technique to remember.
So if they have at least 400 coins,
and that probably should've been
a greater than, or equal, shouldn't it?
If they have at least 400 coins,
subtract the coins, and add the health.
And now, you've seen the bug.
If they have exactly 400 coins, they can't buy health.
They'd need at least 401, due to the condition expression
being a greater than instead of greater than equals.
This is a good use case for transactions.
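In boto3 the same purchase looks roughly like this. The slide updates one gamer item twice, but a real TransactWriteItems can't target the same item with two operations, so this sketch assumes the coins live on a "wallet" item and the health on a "character" item under the same partition key; the condition also uses >= so exactly 400 coins is enough, fixing the off-by-one noted above. All names are illustrative.

```python
import boto3

client = boto3.client("dynamodb")

# Buy 100 health for 400 coins: both updates commit together or not at all.
client.transact_write_items(TransactItems=[
    {
        "Update": {
            "TableName": "Gamers",
            "Key": {"GamerID": {"S": "Hammer57"}, "SK": {"S": "wallet"}},
            "UpdateExpression": "SET coins = coins - :cost",
            # >= so that exactly 400 coins is enough to buy.
            "ConditionExpression": "coins >= :cost",
            "ExpressionAttributeValues": {":cost": {"N": "400"}},
        }
    },
    {
        "Update": {
            "TableName": "Gamers",
            "Key": {"GamerID": {"S": "Hammer57"}, "SK": {"S": "character"}},
            "UpdateExpression": "SET health = :h",
            "ExpressionAttributeValues": {":h": {"N": "100"}},
        }
    },
])
```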
Another thing to keep in mind,
let's say you have found a case
where you need to read all the items from the table
as quickly as possible.
You have a certain analytical style query
that you just wanna drive.
You wanna say, "Give me all the offices
"from that Phone Tool example,
"and I don't want it limited by country, or anything."
I just want a list of all the offices.
Just scan it.
One way to do it, you'd do a scan,
and you retrieve it serially,
but if you want it fast, you probably want a parallel scan.
The way a parallel scan works is you specify
on the scan request a total segment count that you want.
So 10 threads, 100 threads, you name it.
You say, "I want 100, and I am worker five,"
and I will get allocated as worker five that 1% of the table
and every other worker gets there 1% of the table.
So here's all the data items.
Remember, the 00 to FF in the key space.
I say four segments here, so it's 0 through 3.
My main thread creates four workers.
Every worker says, "Hey, please do this request.
"I want four segments, and I am Worker 0, 1, 2, 3,"
and they can in parallel process the key space.
They each get their own portion
of the key space to process through.
With this, you can scan a table
almost as quickly as you need, right?
So don't think you can only scan with one thread.
You can scan with N many threads.
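A minimal Python sketch of a parallel scan with four worker threads follows; the table name is assumed, and each worker pages through its own segment of the key space.

```python
import boto3
from concurrent.futures import ThreadPoolExecutor

TOTAL_SEGMENTS = 4  # one worker per segment

def scan_segment(segment):
    """Scan one segment (roughly 1/TOTAL_SEGMENTS of the key space)."""
    client = boto3.client("dynamodb")
    items, start_key = [], None
    while True:
        kwargs = {
            "TableName": "PhoneTool",        # assumed table name
            "Segment": segment,
            "TotalSegments": TOTAL_SEGMENTS,
        }
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key
        page = client.scan(**kwargs)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if not start_key:
            return items

# The main thread fans out one scan per segment and gathers the results.
with ThreadPoolExecutor(max_workers=TOTAL_SEGMENTS) as pool:
    all_items = [item for seg in pool.map(scan_segment, range(TOTAL_SEGMENTS))
                 for item in seg]
```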
Something else that's important to know about DynamoDB is
that you do have a stream capability.
This live-watches for events
that happen on the DynamoDB table,
and whenever there's a mutation event, the stream records it.
And with that, you can invoke a Lambda,
and make some choices.
What do you wanna do when you notice that the table has changed?
Well, one common usage pattern for this is
to update the table with some real-time aggregations.
Say you wanna keep a count of how many Xs a user has.
The code can just insert the item,
and then the Lambda notices the insert,
and increments by one the count of items
that are under there. So, for example,
how many bug reports are on this code base?
Just add the bug report,
and have the Lambda increment the count,
so that later on, you can ask,
"Well, which items have the most bug reports,"
and make that an efficient query.
That's a good example.
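As a sketch, here's a stream-triggered Lambda handler in Python that bumps a counter item on every insert; the Aggregates table, the ParentId grouping, and the attribute names are assumptions for illustration.

```python
import boto3

dynamodb = boto3.client("dynamodb")

def handler(event, context):
    """Invoked by a DynamoDB stream; increments a per-parent counter on INSERT."""
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue
        new_image = record["dynamodb"]["NewImage"]
        parent_id = new_image["ParentId"]["S"]      # assumed attribute name
        dynamodb.update_item(
            TableName="Aggregates",                 # assumed counter table
            Key={"ParentId": {"S": parent_id}},
            UpdateExpression="ADD itemCount :one",  # creates the counter if absent
            ExpressionAttributeValues={":one": {"N": "1"}},
        )
```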
You can also feed OpenSearch.
So hey, every time there's a change,
put that change into OpenSearch,
so that you can have a search,
text search oriented version of the data inside DynamoDB,
or you can, for example, feed it to Kinesis Firehose,
which can write to S3 in Parquet format,
and you can run Athena to do a deep analytical query
against the data, without actually having
to do an export in this case.
You can keep it real-time updated based
on the mutation events that are happening inside DynamoDB.
So something to keep in mind:
you can watch for the events in DynamoDB
and do anything else you wanna do.
Email somebody:
"Hey, we noticed that you just created an account, welcome,"
and you can fire that off, put it into a queue,
based on a Lambda that notices when new users are added.
And in fact, there's a feature added in November of 2021,
which lets you filter so that
only certain items invoke the Lambda.
It used to be the Lambda would be invoked,
and the Lambda could decide if it cared.
Now, you can have a pre-filter,
where the Lambda won't even be invoked
if the item doesn't match the constraints.
So you can watch for just inserts of new users,
and not any other kind of inserts,
and have that particular Lambda be the one
that fires off the welcome message.
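Here's a hedged Python sketch of wiring up that pre-filter on the Lambda event source mapping, so only inserts of user items trigger the function; the stream ARN, function name, and the Type attribute are placeholders for illustration.

```python
import json
import boto3

lambda_client = boto3.client("lambda")

# Only INSERT events whose new image has Type = "USER" will invoke the Lambda;
# everything else is filtered out before invocation.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:dynamodb:us-east-1:123456789012:table/Users/stream/TIMESTAMP",  # placeholder ARN
    FunctionName="send-welcome-email",            # assumed function name
    StartingPosition="LATEST",
    FilterCriteria={
        "Filters": [
            {
                "Pattern": json.dumps({
                    "eventName": ["INSERT"],
                    "dynamodb": {"NewImage": {"Type": {"S": ["USER"]}}},
                })
            }
        ]
    },
)
```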
Something else to keep in mind as a feature is TTL,
Time-To-Live.
If you create an item in DynamoDB,
you can create an attribute,
and tell the database this attribute is my TTL attribute,
and its value is seconds since the epoch.
Once the time has moved forward from that point,
it is eligible for deletion.
What's cool is that deletion happens in the background
without incurring any WCU charges.
So it's a way to freely delete the data in the database
if it's naturally going to expire,
like, say, it's a session.
You might say this session is good for three days,
and then after three days, just let the database delete it.
It's not guaranteed to delete that second.
It generally gets deleted as a background process
within 48 hours of that second.
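A minimal Python sketch of turning TTL on and writing a session that expires in three days; the Sessions table, key schema, and expiresAt attribute name are assumptions for illustration.

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# Tell the table which attribute holds the expiry time (seconds since epoch).
dynamodb.update_time_to_live(
    TableName="Sessions",                               # assumed table name
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expiresAt"},
)

# Write a session that becomes eligible for free background deletion in 3 days.
dynamodb.put_item(
    TableName="Sessions",
    Item={
        "SessionId": {"S": "abc123"},
        "expiresAt": {"N": str(int(time.time()) + 3 * 24 * 3600)},
    },
)
```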
And I mention it here with streams,
because something that's a common pattern is
to have streams watch for those deletes,
and possibly export that data,
say, to S3 for posterity.
All right, it's in the database.
Three days later, I'm gonna let it delete.
When it deletes, I'll put a copy in S3
just to remember that it once existed.
And because it doesn't incur WCUs,
here's a real-life example,
where TTL was enabled on a very large table
where, as you can see from the WCUs, a lot of the writes were deletes.
So as soon as the TTL went on,
the deletes stopped consuming WCUs,
and we were pleased that the customer could save
quite a lot of cost using the TTL feature.
All right, so here's what we've covered.
We've come a long way.
High SLA, DAX, global tables.
I won't read it all.
If you have questions, I'll stick around for questions
at the end of this,
but there is one more thing I wanted to cover,
and that is a new feature introduced in November of 2021,
the DynamoDB Standard-Infrequent Access table class,
and this is a great way to reduce costs.
So the reason this was introduced is that,
as you add more data to a DynamoDB table,
the older data stays,
but it maybe isn't as valuable to you,
because you don't access it as much.
Yet the monthly per-gigabyte storage cost remains.
So people were saying, you know,
"Is there a way to keep the new data readily available,
"but the old data available as well,
"but maybe reduce my cost, so that I don't have to pay
"as much on a per gigabyte basis for the data
"that's a little bit more archive-oriented?"
And you see this in a bunch of different situations.
Social media: you posted last year,
and you wanna keep it.
What did people do before?
They would maybe export the old data to S3,
but then the application would need a separate path
where the new data is served from DynamoDB,
and the old data is pulled out of S3 with higher latency.
Data analytics is similar.
The recent data is commonly used,
but do you wanna pull the old data out of S3,
or would you rather keep the old data in DynamoDB,
readily available at a lower cost?
Or retail, you know,
Amazon remembers all your old orders,
and it's kind of a fun game to see
who made their first amazon.com order, and what was it.
And so, you can go into your own order history,
and see back, for me, I think it was about '97, or '96,
and like what was the first thing I bought?
It was a book, of course.
So what's the right way on the backend to keep this data?
Do you want to export it?
Most customers have told us, you know,
they'd like to keep the data in DynamoDB,
make it as quick to access old data as new data,
but come up with a way where that old data
could be more cost-effectively stored.
And so, for that, this feature was introduced,
which is the Standard-Infrequent Access table class,
same name as you see in S3,
where a similar design pattern is available
for objects that are infrequently accessed.
It is 60% lower cost on a per-gigabyte basis
than the Standard table class.
It's a good savings for large tables,
and there's not a performance trade-off.
It is just as fast to retrieve any particular item
or to do a query.
It's just as durable, just as available.
You won't notice any difference at the application level.
It's only a billing-oriented change,
and you as a developer
don't have to make any code changes.
You click in the console, issue a CLI command, or make an SDK call.
You just say, "Please change this table
"from one class to the other," and it will do it.
The trade-off is that the storage costs are lower,
but as with S3 Infrequent Access,
the retrieval cost is a little bit elevated,
25% more on throughput.
So it becomes a mathematical game
of whether this table is large enough
that the 60% storage savings is going to benefit me.
And so, in the majority of cases,
we think the Standard table class is going to be
the right choice.
So when throughput is the dominant cost, keep Standard.
But when the data is large,
you may wanna look at the Standard-IA table class,
because then you get a 60% reduction in the storage costs,
and that can be a significant reduction
in the overall cost
of this DynamoDB table.
It is not on a per item basis.
It is on a per table basis.
This is a situation where you also might wanna think about
what if I had maybe a table per month,
and I converted the older months into Standard-IA,
and kept the latest month,
or the last three months, or something, in Standard?
And so, if you have a model where you can create
maybe a table per month,
and your application can handle that,
this is a way to save 60% on the storage cost
of those older tables that are rarely accessed.
To pick which one's right for your use case,
we do have an open source application
that you can run that looks at your history,
and will give you some advice.
But basically, what you need to do is this.
You log in to the Cost and Usage Reports,
or the AWS Cost Explorer,
and look at your table cost structure.
It's good to tag each table with its name,
which lets you isolate to that particular table.
And if the storage cost exceeds 50% of the throughput cost,
Standard-IA will probably be cost-effective for you,
and that's just mathematical.
So it's not 50% of the overall cost.
It's actually 50% of the throughput cost.
So if the throughput cost is a dollar,
and your storage cost is 50 cents,
it's probably good to switch to Standard-IA.
And that's just the math: the storage is reduced by 60%,
while the throughput carries a 25% uplift.
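As a back-of-the-envelope check, here's a tiny Python sketch of that break-even math using the dollar-and-fifty-cents example; the prices are illustrative, not a quote.

```python
# Illustrative monthly costs for one table on the Standard class.
throughput_cost = 1.00   # dollars spent on reads and writes
storage_cost    = 0.50   # dollars spent on per-gigabyte storage

standard_total = throughput_cost + storage_cost            # 1.50
# Standard-IA: throughput costs 25% more, storage costs 60% less.
ia_total = throughput_cost * 1.25 + storage_cost * 0.40    # 1.25 + 0.20 = 1.45

print("Switch to Standard-IA" if ia_total < standard_total else "Stay on Standard")
```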
You can get started today;
it's just as easy as a button click.
You can also turn it on for a day, and see what the cost is,
and compare to the previous day.
So if your traffic is similar on both days,
experimentation is one way
to directly compare the cost between the two.
And you can switch back and forth as well as needed.
If you say, "You know what, I wanna switch back,"
you can do that, too.
And so, with that, I'll end this.
Thank you so much for coming,
and I will stick around for any questions.