Types of Databases: Relational vs. Columnar vs. Document vs. Graph vs. Vector vs. Key-value & more

Anton Putra
28 Dec 202318:23

Summary

TLDRThis script offers an insightful overview of various database types, their use cases, and examples of both open-source and cloud-managed options. It explains relational databases for structured data, columnar databases for big data analytics, document databases for flexible schema, graph databases for complex relationships, time-series databases for chronological data, vector databases for AI and ML, and key-value stores for rapid data retrieval. The video script emphasizes the importance of selecting the right database for optimal performance and cost-effectiveness.

Takeaways

  • 🗃️ Relational databases organize data in tables with rows and columns, using primary and foreign keys to link data across tables.
  • 🔍 SQL is the query language used for relational databases, allowing for complex joins and data retrieval.
  • 🔄 ACID transactions in relational databases ensure that all changes are treated as a single operation, maintaining data integrity.
  • 📊 Columnar databases are optimized for big data scenarios, reading data by column rather than row to improve query performance.
  • 📚 Document databases, like MongoDB, store data in JSON-like documents, offering flexible schemas and ease of development.
  • 📈 Graph databases excel at representing and querying complex relationships between data points, using nodes and edges.
  • 📈 Vector databases store data as high-dimensional vectors, enabling similarity searches in AI and ML applications.
  • 🔑 Key-value databases, such as etcd, store data in key-value pairs, providing fast data access and horizontal scalability.
  • ⏱ Time series databases are specialized for handling time-stamped data, with efficient storage and retrieval for analytics.
  • 🛠️ Different database types serve different purposes, and choosing the right one can significantly impact application performance and cost.
  • 🔬 The script provides a comprehensive overview of various database types, their use cases, and examples of open-source and cloud-managed databases.

Q & A

  • What is a relational database and how was it originally developed?

    -A relational database is a collection of spreadsheet files that help businesses organize, manage, and relate data. It was originally developed by IBM in the 1970s.

  • How are data organized in a relational database model?

    -In the relational database model, data is organized into tables that store information as columns (attributes) and rows (records or tuples). Columns specify data types, and each record contains values corresponding to those data types.

  • What is a primary key in a relational database?

    -A primary key in a relational database is a unique identifier for each row, ensuring that each record can be distinctly recognized within the table.

  • Can you explain the concept of 'foreign key' in relational databases?

    -A foreign key in a relational database refers to the primary key in another table, allowing rows from different tables to be linked together.

  • What is an ACID transaction in the context of relational databases?

    -ACID transactions in relational databases mean that all changes to data are performed as if they are a single operation. If at least one task fails, the whole transaction is rolled back to maintain data integrity.

  • What are some examples of relational databases?

    -Examples of relational databases include MySQL, PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle Database.

  • Why were columnar databases developed and how do they differ from traditional databases?

    -Columnar databases were developed to improve query performance with big data. They differ from traditional databases by reading data from top to bottom and only reading the columns that are needed for a query, which can be more efficient with large datasets.

  • What is a document database and how does it differ from a relational database?

    -A document database is used to store and query data in JSON-like documents. It differs from a relational database by allowing for flexible, semi-structured, and hierarchical data storage without a predefined schema.

  • How does a graph database represent data and relationships?

    -A graph database represents data as nodes, which are records, and relationships as edges connecting these nodes. Relationships can have a direction and properties associated with them, allowing for complex queries about connections between data points.

  • What is a vector database and how does it enable unique search experiences?

    -A vector database stores data as high-dimensional vectors, where each vector represents an asset with a certain number of dimensions. It enables unique search experiences by finding similar assets through vector search methods, such as image or document similarity searches.

  • What are some advantages of key-value databases?

    -Key-value databases offer advantages such as scalability, as they can distribute data across servers; ease of use, as they follow the object-oriented paradigm; and performance, as they do not require resource-intensive table joins like relational databases.

  • What is a time series database and what type of data is it optimized for?

    -A time series database is optimized for time-stamped or time series data, which includes measurements or events tracked and monitored over time, such as server metrics, sensor data, or financial trades.

Outlines

00:00

🗃️ Database Selection and Relational Databases

This paragraph introduces the importance of selecting the right database for an application's performance and outlines the use cases for different database types. It focuses on relational databases, which are structured collections of data in table format, with each table having a primary key and the ability to link tables through foreign keys. The paragraph explains how relational databases use SQL for querying and support ACID transactions for reliable data operations. Examples of popular relational databases like MySQL, PostgreSQL, and Oracle are provided, along with cloud-managed options such as Amazon Aurora.

05:03

📊 Columnar Databases for Big Data Performance

The second paragraph delves into columnar databases, which are optimized for handling large datasets by storing data column-wise, allowing for faster querying of specific columns. It contrasts the traditional row-wise data retrieval with the columnar approach, demonstrating how the latter can significantly speed up data retrieval for large datasets. The paragraph also touches on the use of algorithms in columnar databases to simplify data retrieval and mentions Apache Cassandra as a widely adopted open-source columnar database.

10:06

📄 Document Databases for Flexible Data Storage

This section discusses document databases, which are ideal for storing and managing semi-structured data with varying attributes, such as e-commerce products. Document databases allow for the storage of each product with its attributes in a single document, enabling easy updates and flexible schema design. The paragraph explains the JSON-like structure of document databases and their operations, such as create, read, update, and delete. It highlights the ease of development with document databases, their flexible schema, and performance at scale, mentioning MongoDB as a popular open-source document database.

15:09

🔍 Exploring Graph and Vector Databases for Complex Relationships and AI

The fourth paragraph introduces graph databases, which excel at representing and querying complex relationships between data points, such as social connections or academic lineage. It contrasts graph databases with relational databases, emphasizing the former's ability to handle complex queries involving distant relationships. The paragraph also introduces vector databases, which store data as high-dimensional vectors for applications like image or document similarity searches. It mentions the use of vector search methods in AI and machine learning scenarios and suggests ArangoDB as an open-source graph database and milvus for vector databases.

🔑 Key-Value Databases for Fast Data Access

This section covers key-value databases, which store data as pairs where keys are unique identifiers and values can be of any type. It highlights the scalability, ease of use, and performance advantages of key-value databases, which do not require complex joins like relational databases. The paragraph discusses use cases such as session management, shopping cart storage, metadata storage, and caching, and recommends etcd as a reliable key-value database used by Kubernetes for cluster data storage.

🕒 Time Series Databases for Sequential Data Tracking

The final paragraph focuses on time series databases, designed for handling time-stamped data such as server metrics, sensor readings, and market trades. It explains how these databases optimize data retrieval and provide a cost-effective storage solution for sequential data. The paragraph mentions InfluxDB as a general-purpose metrics database and Prometheus for collecting time series data from infrastructure, emphasizing their importance in analytics and monitoring.

Mindmap

Keywords

💡Relational Database

A relational database is a type of database that stores data in a structured format, using rows and columns in tables. It is foundational to the video's theme as it introduces the concept of how businesses can organize and relate data through attributes and records. The script provides an example of a customer table and an event table, where a foreign key in the event table links to the primary key in the customer table, illustrating the relational aspect.

💡Primary Key

A primary key is a unique identifier for each row in a relational database table. It is essential for ensuring data integrity and is a fundamental concept in the video's explanation of how relational databases work. The script mentions that each table has a primary key, which is used to uniquely identify each record.

💡Foreign Key

A foreign key is a field in a database table that is the primary key in another table, creating a link between the two tables. It is a key concept in the video as it demonstrates how relational databases can establish relationships between different sets of data. The script uses the example of linking an event to a customer through a foreign key.

💡SQL

SQL, or Structured Query Language, is used for managing and querying data in relational databases. It is central to the video's narrative as it shows how developers can interact with databases. The script includes an example of an SQL query to retrieve all customers and another to get all active customers using a join operation.

💡ACID Transactions

ACID transactions refer to a set of properties that ensure that database transactions are processed reliably. The acronym stands for Atomicity, Consistency, Isolation, and Durability. It is a critical feature highlighted in the video for maintaining data integrity, especially in operations like fund transfers where a rollback may be necessary if any part of the transaction fails.

💡Columnar Database

A columnar database is an alternative to traditional row-oriented databases, optimized for analytical queries on large datasets by storing data column by column. The video script explains how this type of database can improve query performance with big data by reading only the necessary columns, providing an example of retrieving paying customers more efficiently.

💡Document Database

A document database is a type of database designed to store, retrieve, and manage document-oriented information, such as JSON documents. The video emphasizes its utility for storing unstructured data with varying attributes, such as product information in an e-commerce application, where each product can be stored as a separate document with its unique set of attributes.

💡JSON

JSON, or JavaScript Object Notation, is a lightweight data-interchange format that is easy for humans to read and write and for machines to parse and generate. In the context of the video, JSON is used to illustrate the flexible, hierarchical structure of documents in document databases, allowing for a variety of data types to be stored within a single document.

💡Graph Database

A graph database is a type of database that uses graph structures with nodes, edges, and properties to store and manage data. The video script introduces this concept by contrasting it with relational databases, highlighting its strength in handling complex relationships and querying distant connections, such as social networks or organizational hierarchies.

💡Key-Value Database

A key-value database, as discussed in the video, stores data as a collection of key-value pairs, where the key is a unique identifier and the value can be any data type. The video mentions its advantages in terms of scalability, ease of use, and performance, and provides an example of its use in session management for web applications.

💡Time Series Database

A time series database is optimized for storing and retrieving time-stamped data, which can include metrics like server performance, sensor readings, or market trades. The video script positions this type of database as a cost-effective solution for handling time series data, offering fast data retrieval and aggregation capabilities.

Highlights

Performance of applications often depends on the selected database, emphasizing the importance of choosing the right database for specific tasks.

Relational databases, developed by IBM in the 1970s, are structured as tables with columns and rows, and use primary and foreign keys for data linkage.

SQL is used for querying relational databases, and ACID transactions ensure data integrity by rolling back changes if any part of a transaction fails.

Examples of relational databases include MySQL, PostgreSQL, and Oracle Database, with cloud-managed options like Amazon Aurora and Azure SQL.

Columnar databases improve query performance with big data by reading data column-wise rather than row-wise, reducing the amount of data processed.

Apache Cassandra is a popular open-source columnar database used for highly aggregated analysis in large enterprises.

Document databases, such as MongoDB, are ideal for e-commerce applications with varying product attributes, offering flexibility and ease of development.

JSON documents in document databases allow for flexible, semi-structured data storage with key-value pairs, arrays, and nested objects.

Graph databases excel at querying complex relationships and are useful for data science communities to test inferences, with ArangoDB as a suggested open-source option.

Vector databases store data as high-dimensional vectors, enabling the search for similar assets based on encoded data, with Milvus as an example of an open-source vector database.

Key-value databases offer scalability, ease of use, and high performance, with etcd being a reliable option used by Kubernetes for storing cluster data.

Time series databases are optimized for time-stamped data, providing fast retrieval and cost-effective storage, with InfluxDB and Prometheus as notable examples.

Each database type has specific use cases and advantages, from relational databases for structured data to key-value databases for fast data access.

The choice of database can significantly impact application performance, with the wrong choice potentially leading to slow queries and high costs.

The video provides an educational resource for learning about different database types and their practical applications in various scenarios.

Understanding the unique features and use cases of each database type is crucial for developers to make informed decisions for their applications.

The video concludes with a call to action for viewers to explore further resources on relational databases and SQL joins for a deeper understanding.

Transcripts

play00:00

Performance of your application frequently depends  

play00:02

on the database you select for  a specific task. As a developer,  

play00:06

you may need to choose the appropriate database  based on your use case and requirements.

play00:12

In this video, we'll go through the  most common database types used in  

play00:16

enterprises. I'll provide use cases  when to choose a specific database,  

play00:20

as well as examples of open-source  and cloud-managed databases.

play00:25

Let's start with the relational database.  Originally, it was developed by IBM in the  

play00:31

1970s. You can think of a relational database  as a collection of spreadsheet files that help  

play00:37

businesses organize, manage, and relate data. In  the relational database model, each 'spreadsheet'  

play00:43

is a table that stores information, represented  as columns (attributes) and rows (records or  

play00:50

tuples). Now, columns specify data types, and each  record (or row) contains values corresponding to  

play00:58

those specific data types. For example, in the  customer table, you may have an integer type  

play01:03

for the ID column and varchar for the name  and address. Also, in a relational database,  

play01:09

each table has a unique identifier for each row  called the "primary key." Rows from different  

play01:15

tables can be linked using a "foreign key," which  refers to the primary key in another table. Let’s  

play01:21

take a look at how the relational database  model works in practice. Let's say we have  

play01:26

a website and track every single action that  a user may do. A user can click on a button,  

play01:32

add an item to the shopping cart, and things like  that. Now, in a very simplified version, we would  

play01:38

have a customer table and an event table. Each  time a customer does something on the website,  

play01:44

we would store it in the event table and use a  foreign key to link this event to the customer.  

play01:50

Also, we can use SQL to query our database.  For example, you can use the following query  

play01:55

to get all the customers you have. Now, if  you want to get all customers who are active,  

play02:01

meaning they have performed at least one action,  you can use the following join. I have another  

play02:06

video explaining all the different joins as  well. Another important feature provided by  

play02:11

most relational databases is ACID transactions.  This means all changes to data are performed as  

play02:19

if they are a single operation. If at least one  task fails, the whole transaction is rolled back.  

play02:25

For example, consider a transfer of funds from one  bank account to another. First, you need to check  

play02:32

if there is enough money to transfer, then you  would debit one account and credit another. So,  

play02:38

if at least one step fails, you want to roll  back the entire transaction. Now, you would  

play02:44

use a relational database for structured data  and ACID compliance. Structured data means that  

play02:51

you have rows and columns that clearly define data  attributes. Some examples of relational databases  

play02:58

are MySQL, PostgreSQL, MariaDB, Microsoft SQL  Server, and Oracle Database. There are also  

play03:05

some cloud-managed databases available, such  as Amazon Aurora, Azure SQL, and many others.

play03:12

Now, traditional databases do not scale very  well. If you have a lot of data, like billions  

play03:17

of records, traditional databases can be slow.  To improve query performance with big data,  

play03:23

columnar databases have been developed. Let’s  go through the simple exercise, which will help  

play03:28

you see the power of using a columnar database  for certain tasks. To illustrate this exercise,  

play03:34

I’m going to use a spreadsheet populated  with some customer data. First of all,  

play03:38

let's see how a traditional database works. Before  we start, there is a rule that we need to follow:  

play03:44

we have to read the data from left to right,  starting from the beginning of each row,  

play03:50

and proceed from row to row, kind of like reading  a book. So, let’s start. Let's ask the traditional  

play03:56

database to return all paying customers. Here it  goes: as you read the data, notice that you're  

play04:02

retrieving the Name and Tier on each pass of every  row, and eventually, you'll acquire all the Names  

play04:09

and their associated Tiers. Now, servers do this  very quickly; however, what if my request involves  

play04:16

several billion rows? You can start to see that  reading each row just to grab the Name and Tier  

play04:22

columns could take a while. And this is where  a columnar database comes in handy. We’re going  

play04:28

to redo the exercise one more time, but instead  of using traditional database rules, I’m going  

play04:33

to give you columnar database rules. This means  you’re going to read the data from top to bottom,  

play04:40

and you’re only going to read the columns that I  ask you to. Now, let's ask the columnar database  

play04:46

the same question: to return all paying customers. Let’s start. Notice that this method skips all  

play04:52

the data that isn’t related to what we’re looking  for. Once we’ve finished getting that information,  

play04:58

we start with the next column, skipping the  columns that we don’t need. But wait a second,  

play05:02

how does the computer know which Tier to  assign to each Customer name it’s retrieving?  

play05:08

The way it does this is by having the Columnar  Database assign a number to each row of data,  

play05:14

allowing it to quickly pair up the different  columns it retrieves. You can see how this really  

play05:19

comes in handy when you start reading the Name  column. All it needs to know is the number where  

play05:25

the Name values ended, in this case, row 10. This  means I just need the Tier rows from 1 to 10. Now,  

play05:32

you've probably noticed how the columnar  database was repeating 'basic' over and  

play05:37

over. This is actually another advantage of a  columnar database. By using the numbering system,  

play05:42

columnar databases can utilize algorithms  to simplify data retrieval. And you’ll find  

play05:48

that each columnar database system has highly  sophisticated methods for achieving further  

play05:54

performance enhancements. So, compared to  traditional databases, a columnar database  

play05:58

does not actually read billions of rows of data.  Instead, it just reads a few long columns. This  

play06:04

can be a very straightforward way of addressing  the large amounts of data that corporations have  

play06:10

to deal with.. They're not replacements for  traditional databases, but they are certainly  

play06:15

a powerful way of performing highly aggregated  analysis. I think Apache Cassandra is one of the  

play06:21

most popular open-source columnar databases that  is widely adopted by many Fortune 500 companies.

play06:28

Now, both relational and columnar databases  require you to define a schema before you can  

play06:34

store anything. If your requirements change over  time, you may need to add additional columns or  

play06:40

perhaps change the data type of one of them.  What if you have an e-commerce application  

play06:44

with thousands of products? Different products  usually have a different number of attributes.  

play06:50

Managing thousands of attributes in relational  databases is inefficient and can slow down your  

play06:55

database. This is an ideal use case for the  document database. You can store a product  

play07:00

with all its attributes in a single document. It  simplifies management and improves the reading  

play07:05

speed. And if you change the attribute of  one product, it won’t affect others. So,  

play07:11

what is a document database? Well, it’s a type  of database used to store and query data in  

play07:16

JSON-like documents. JSON is a data format that  is both human and machine-readable. Developers  

play07:23

can use JSON documents in their code and save them  directly into the document database. The flexible,  

play07:28

semi-structured, and hierarchical nature of  documents and document databases allows them  

play07:34

to evolve with the needs of applications.  Let's take a look at the structure of JSON  

play07:39

documents. JSON represents data in three  ways. The first is key-value pairs. These  

play07:44

pairs are recorded within curly braces. The key  is a string, and the value can be any data type,  

play07:50

such as an integer, decimal, or boolean. For  example, a simple key-value pair is {"year":  

play07:56

2024}. Next is the array. An array is an ordered  collection of values defined within left ([) and  

play08:03

right (]) brackets. Items in the array are  comma-separated. For example, {"fruit": ["apple",  

play08:08

“grapes”]}. And finally, objects. An object is  a collection of key-value pairs. Essentially,  

play08:13

JSON documents allow developers to embed objects  and create nested pairs. For example, {"address":  

play08:20

{"country": "USA", "state": “Texas”}}. Now,  here is an example of a JSON-like document that  

play08:26

describes a film dataset. You can see that the  JSON document holds simple values, arrays, and  

play08:32

objects quite flexibly. You can even have an array  with JSON objects within it. Document-oriented  

play08:38

databases allow you to create an unlimited-level  hierarchy of embedded JSON objects. It's entirely  

play08:45

up to you what schema you want to give to  your document store. Now, let's talk about  

play08:49

the operations that you can perform on a document  database. You can create, read, update, and delete  

play08:56

entire documents stored in the database. Document  databases provide a query language or an API that  

play09:02

allows developers to run these operations.  Okay, first, you can create documents in the  

play09:08

database. Each document has a unique identifier  that serves as a key. Then, you can use the API  

play09:14

or query language to read document data. You can  run queries using field values or keys. You can  

play09:20

also add indexes to the database to increase read  performance. And finally, you can update existing  

play09:26

documents flexibly. You can either rewrite the  entire document or update individual values. So,  

play09:33

what are the advantages of document databases? The first thing that comes to mind is the ease  

play09:38

of development. JSON documents map to objects—a  common data type in most programming languages.  

play09:43

When building applications, developers can  flexibly create and update documents directly  

play09:49

from the code. This means they spend less time  creating data models beforehand. As a result,  

play09:55

application development becomes more rapid and  efficient. Next, there's flexible schema. A  

play10:01

document-oriented database allows you to  create multiple documents with different  

play10:05

fields within the same collection. This can be  handy when storing unstructured data, like emails  

play10:12

or social media posts. However, some document  databases offer schema validation, allowing you  

play10:19

to impose some restrictions on the structure.  Another advantage is performance at scale.  

play10:24

Document databases offer built-in distribution  capabilities. You can scale them horizontally  

play10:29

across multiple servers without impacting  performance, which is also cost-efficient. Also,  

play10:35

document databases provide fault tolerance and  high availability through built-in replication.  

play10:41

If you want to try it out, MongoDB is a great  option. It's the most popular open-source document  

play10:46

database available, easy to start with, has SDKs  for most languages, and has large community.

play10:53

Have you ever seen a detective board in the  movies, with pictures, news articles, and notes  

play10:59

connected by thumbtacks and yarn? Immediately,  you can see the power of connecting the dots  

play11:04

in all of those relationships. Imagine taking that  detective board and applying a mathematical engine  

play11:10

that could query its data relationships. Well,  that is a graph database. I want to explain graph  

play11:16

databases by starting with relational databases.  One of the main traits of a relational database  

play11:22

is the constraining nature of its relationships,  which makes it ideal for processing transactions.  

play11:28

However, these strict constraints often make it  too complex to answer questions about distant  

play11:35

relationships. Imagine you had all the university  professor and student data ever collected. Then  

play11:41

imagine you wanted to know the relationship  between a group of 10 students originating from  

play11:46

completely different universities. Well, at first  glance, you would think that since the students  

play11:52

didn't go to the same universities, they really  don't have a connection. But if we look at their  

play11:57

professors, we can discover that they all shared  a common professor when they were students. So now  

play12:04

let's zoom in and describe some of the traits  of a graph database. First, you have nodes,  

play12:09

which are essentially records. Connected to these  nodes is a type of relationship, which can have a  

play12:15

direction and a property associated with it. So in  our case, the direction points from the original  

play12:21

professor. The relationship type is 'student_of.'  And the property is the year and semester where  

play12:27

they were taught. Now, querying this database  isn't like your typical SQL query. Graph database  

play12:33

vendors often have their own query language,  so this is something that the industry is still  

play12:38

working out. You need to be very careful with  graph databases because they can infer connections  

play12:44

that don't actually mean anything. For example,  imagine the inference you could make if all the  

play12:49

students in our previous example ended up dropping  out of school. Does that mean the original  

play12:55

professor had some kind of meaningful impact on  that bad outcome? Well, anything is possible,  

play13:01

but we have to be a little more skeptical  of such conspiratorial patterns. So, graph  

play13:06

databases are usually a mechanism for starting  questions but not necessarily answering them.

play13:12

So, what we see is data science communities  using graph databases to test inferences. The  

play13:18

discovery of these relationships, and  their relevance to the organization,  

play13:23

often is what gets bubbled up into  data warehouses. Thus closing the loop.

play13:28

One open source graph database  that I can suggest is ArangoDB.

play13:32

Now, information comes in many forms. Some  information is unstructured, like text documents,  

play13:38

pictures, videos, and audio, while some is  structured, like application logs, tables,  

play13:44

and graphs. On the other side, we have vector  databases that store data as high-dimensional  

play13:50

vectors. Each vector has a certain number of  dimensions, ranging from tens to thousands,  

play13:57

depending on the complexity of the data. Now, we  can apply some kind of transformations to the raw  

play14:02

data. We can encode all types of data into vectors  that capture the meaning and context of the asset.  

play14:09

This allows us to find similar assets by searching  for neighboring data points. Vector search methods  

play14:16

enable unique experiences, such as taking a  photograph with your smartphone and searching for  

play14:22

similar images. You can also find documents that  are similar to a given document based on their  

play14:27

topic and sentiment. And find products that are  similar to a given product based on their features  

play14:33

and ratings. It just a few examples. You can  try milvus which is open source vector database.

play14:40

Now, let's talk about a key-value database. It  stores data as a collection of key-value pairs,  

play14:47

where a key serves as a unique identifier.  Both keys and values can be anything,  

play14:53

ranging from simple objects to complex  compound objects. The document database  

play14:58

is a special type of key-value store  where keys can only be strings. Also,  

play15:03

when querying your document store, you can  read the entire value or a part of the value,  

play15:08

especially if the value is another JSON object.  For example, you can have {"book": {"id": 1,  

play15:14

"price": $10}}, then query book.price, and the  database will return the value 10. Key-value  

play15:20

databases always return the whole value,  including both ID and price information.  

play15:26

There are a few advantages of key-value databases. The first is scalability. Most key-value databases  

play15:33

can scale horizontally and automatically  distribute data across servers to reduce  

play15:38

bottlenecks at a single server. Then there's  ease of use. Key-value databases follow the  

play15:44

object-oriented paradigm, allowing developers  to map real-world objects directly to software  

play15:51

objects. And Performance. Unlike relational  databases, key-value databases don't have  

play15:55

to perform resource-intensive table  joins, which makes them much faster.

play16:00

Now let's talk about use cases.

play16:02

First is session management. A session-oriented  application, such as a web application, starts  

play16:07

a session when a user logs in and remains active  until the user logs out or the session times out.

play16:14

You can use it for a shopping cart. An  e-commerce website may receive billions  

play16:18

of orders per second during the holiday shopping  season, so you need a high-performance database.

play16:24

You can also use it for a metadata  storage engine. Your key-value  

play16:28

store can act as an underlying storage  layer for higher levels of data access.

play16:34

And caching: You can use a key-value database  to store data temporarily for faster retrieval.

play16:40

If you need a reliable key-value database,  you can take a look at etcd. Kubernetes  

play16:45

uses etcd to store all its cluster data,  such as pods, deployments, and services.

play16:50

And finally, let's talk about time series  databases. As the name suggests, it's optimized  

play16:56

for time-stamped or time series data. Time series  data are simply measurements or events that are  

play17:03

tracked, monitored, downsampled, and aggregated  over time. This could include server metrics,  

play17:08

application performance monitoring, network data,  sensor data, events, clicks, trades in a market,  

play17:15

and many other types of analytics data.  It's used over traditional database types  

play17:20

because it enables fast data retrieval and is a  cost-effective storage solution for that type of  

play17:26

data. One example of general-purpose metrics  database is InfluxDB. We also frequently use  

play17:33

Prometheus to collect time series data from  infrastructure. Prometheus is also a time  

play17:38

series database with additional features to  query targets such as VMs and Kubernetes pods.

play17:44

Each database type has its specialty: Relational  for structured data and ACID compliance,  

play17:50

Columnar for analytics, Document  for unstructured data flexibility,  

play17:55

Graph for complex relationships,  Time-Series for time-stamped data,  

play18:00

Vector for AI and ML scenarios, and  Key-Value for simple, fast data access.

play18:06

Using the right database type can be a  game-changer for performance, and the wrong one  

play18:11

can slow down your queries and be very expensive.  If you want to learn about relational databases  

play18:17

and SQL joins, you can watch this video. Thank you  for watching, and I’ll see you in the next lesson.

Rate This

5.0 / 5 (0 votes)

Etiquetas Relacionadas
Database TypesData ManagementRelational DBColumnar DBDocument DBGraph DBTime SeriesVector DBKey-Value DBACID ComplianceScalability
¿Necesitas un resumen en inglés?