Google SWE teaches systems design | EP22: HBase/BigTable Deep Dive

Jordan has no life
26 Apr 2022 · 10:55

Summary

TL;DR: In this video, the host celebrates the one-month anniversary of their channel and dives into a detailed exploration of HBase, a NoSQL database modeled after Google's Bigtable. They discuss HBase's architecture, including its master and region servers, and how it uses LSM trees and SSTables for efficient random read and write performance. The video compares HBase with Cassandra, highlighting differences in their use cases, and touches on HBase's strong consistency and integration with big data processing frameworks like MapReduce and Spark. The host concludes by differentiating HBase's suitability for data lakes and analytics from traditional data warehouses.

Takeaways

  • The speaker celebrated the one-month anniversary of their channel with 260 subscribers.
  • The video discusses HBase, a NoSQL database modeled after Google's Bigtable.
  • The speaker had to research HBase deeply due to the lack of quality information online, with many blog posts plagiarizing each other.
  • HBase uses LSM trees and SSTables to achieve better random read and write performance with lower latency than raw HDFS.
  • HBase is a wide column store NoSQL database, similar to Cassandra but with significant differences.
  • HBase organizes data with a single row key and column families, where the row key can be composed of multiple parts.
  • The architecture of HBase includes a master server for metadata and operations, region servers for handling data, and ZooKeeper for coordination.
  • HBase is not ideal for time series data keyed by timestamp, due to hotspots caused by sequential writes landing on a single partition.
  • Writes in HBase first go to an in-memory LSM tree and are then flushed to SSTables, which are stored in a column-oriented format on HDFS.
  • HBase provides strong consistency for replication: all replica writes must succeed before a write is considered complete.
  • HBase integrates well with analytics engines like MapReduce, Spark, and Tez due to its column-oriented storage on HDFS.
  • HBase is suitable for use cases involving large-scale batch jobs and stream processing, especially in a data lake scenario.

Q & A

  • What is the significance of the one-month anniversary mentioned in the script?

    -The one-month anniversary marks a milestone for the speaker's channel, which has grown to 260 subscribers, indicating growth and engagement with the audience.

  • What is HBase and why is it discussed in the script?

    -HBase is a wide column NoSQL storage database discussed for its relevance to the topic of NoSQL databases and its use of LSM trees and SSTables, which give it better random-access performance than raw HDFS.

  • What are the key differences between HBase and Cassandra mentioned in the script?

    -The script mentions that while both HBase and Cassandra are wide column NoSQL databases, they differ significantly in terms of their architecture, replication strategy, and suitability for different use cases such as real-time transactions and analytics processing.

  • How does HBase achieve better random read and write performance compared to HDFS?

    -HBase uses LSM trees and SSTables to achieve better random read and write performance. This allows for lower-latency reads and writes, as opposed to the high latency and sequential-only appends/truncates of HDFS.
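The write/read path described above can be sketched in a few lines of Python. This is a toy model of the memtable-plus-SSTable idea only, not HBase's actual MemStore/HFile implementation; the class and parameter names are made up for illustration.

```python
class ToyLSMStore:
    """Toy LSM-style store: writes buffer in memory, then flush to sorted runs."""

    def __init__(self, memtable_limit=3):
        self.memtable = {}        # in-memory write buffer (a dict for brevity)
        self.sstables = []        # newest-first list of immutable sorted runs
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes go to memory first, which is why they are low-latency.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Flush the memtable as an immutable, key-sorted run (an "SSTable").
        self.sstables.insert(0, sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Reads check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sstables:
            for k, v in run:  # a real store would binary-search or Bloom-filter here
                if k == key:
                    return v
        return None
```

A real LSM store adds write-ahead logging, compaction of old runs, and the Bloom-filter read optimizations the video mentions; the newest-to-oldest read order above is what makes the latest write win.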

  • What is the role of the master server in HBase?

    -The master server in HBase is responsible for storing file metadata, handling operations on the metadata, and managing the location of files. It also performs range-based partitioning based on the row key and can split partitions if they become too large or have too much load.
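The range-based partitioning and splitting described here can be illustrated with a small sketch. The data structure and split rule are simplified stand-ins, not the HMaster's real region-assignment logic.

```python
import bisect

class RangePartitioner:
    """Sketch of range partitioning over row keys: regions are key ranges."""

    def __init__(self):
        self.split_points = []  # sorted row-key boundaries between regions

    def region_for(self, row_key):
        # Region i holds keys between split_points[i-1] and split_points[i].
        return bisect.bisect_right(self.split_points, row_key)

    def split(self, split_key):
        # Splitting a region that is too big or too hot just adds a boundary.
        bisect.insort(self.split_points, split_key)
```

Because partitioning is by key range rather than by hash, rows stay sorted and range scans are cheap, but monotonically increasing keys all land in the last region.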

  • What is the purpose of the region server in HBase?

    -The region server in HBase handles the actual storage and retrieval of data. It runs on HDFS data node servers and maintains an in-memory LSM tree data structure for efficient writes. It also communicates with HDFS for data replication and storage of SS tables.

  • How does HBase integrate with HDFS for data replication?

    -HBase integrates with HDFS by using the HDFS data node as the region server and leveraging the replication pipeline of HDFS to maintain the required replication factor for SS tables, ensuring data redundancy and fault tolerance.

  • What is the significance of column-oriented storage in HBase?

    -Column-oriented storage in HBase allows for high read throughput when reading values over a column or partition, which is beneficial for analytics processing and batch jobs that require scanning large amounts of data.
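A minimal illustration of why the column-oriented layout helps column scans: summing one column touches only that column's values, instead of walking every full row. The sample table and field names are invented for the example.

```python
rows = [
    {"user": "a", "clicks": 3, "country": "US"},
    {"user": "b", "clicks": 7, "country": "DE"},
    {"user": "c", "clicks": 1, "country": "US"},
]

# Row-oriented layout: each column's values are interleaved with everything else.
row_store = rows

# Column-oriented layout: each column's values are stored contiguously.
col_store = {
    "user": [r["user"] for r in rows],
    "clicks": [r["clicks"] for r in rows],
    "country": [r["country"] for r in rows],
}

# Scanning "clicks" in the columnar layout reads only that column's list...
total_columnar = sum(col_store["clicks"])
# ...while the row store has to visit every full row to extract the same field.
total_row = sum(r["clicks"] for r in row_store)
```

On disk the contiguous column also compresses better, which is part of why analytics engines favor columnar formats.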

  • How does HBase differ from a traditional SQL database in terms of data handling?

    -HBase, being a NoSQL database, does not enforce a structured schema and does not support native joins like a SQL database. Instead, it requires batch processes or other methods to perform data associations and joins.
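The batch-join idea this answer alludes to (map both datasets by join key, then combine the matches per key in a reduce step) can be sketched in a few lines. The table contents and names are hypothetical.

```python
from collections import defaultdict

users = [("u1", "alice"), ("u2", "bob")]
orders = [("u1", "book"), ("u1", "pen"), ("u2", "mug")]

def join(left, right):
    """Toy reduce-side join: tag records by join key, then combine per key."""
    grouped = defaultdict(lambda: ([], []))
    for key, value in left:       # "mapper" pass over the first dataset
        grouped[key][0].append(value)
    for key, value in right:      # "mapper" pass over the second dataset
        grouped[key][1].append(value)
    # "reducer": emit the cross product of matching records for each key
    return [(key, l, r)
            for key, (ls, rs) in grouped.items()
            for l in ls for r in rs]
```

In a real MapReduce job the grouping is done by the shuffle phase rather than an in-memory dict, but the per-key combine step is the same.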

  • What is the advantage of using HBase for a data lake scenario?

    -Using HBase for a data lake scenario allows for the dumping of unstructured data into a format that can later be used for analytics. HBase provides the advantage of good read and write performance for random access, which is useful for transaction processing in addition to analytics.

  • What are the limitations of HBase when it comes to real-time transaction processing?

    -While HBase uses LSM trees for efficient writes, it may not handle writes as fast as Cassandra, which is designed for real-time transaction processing. HBase is more suited for scenarios where high read throughput for analytics is required.

Outlines

00:00

One-Month Anniversary and Introduction to HBase

The speaker celebrates the one-month anniversary of their channel, expressing excitement over the 260 subscribers and encouraging continued growth. They introduce the topic of HBase, a NoSQL database, expressing frustration with the lack of in-depth information online and the prevalence of plagiarized content. The speaker decides to delve deeper into HBase, comparing it to Cassandra, another wide-column NoSQL database, and highlighting the differences despite their similar functionalities. They also mention that HBase uses LSM trees and SS tables to achieve better performance in random read and write operations, contrasting with the high latency of HDFS.

05:01

HBase Data Model and Architecture Overview

This paragraph delves into the data model of HBase, which is a wide column store with a single row key and any number of column values grouped into column families. The speaker discusses the importance of designing an effective row key to ensure proper data sorting. They provide an architectural overview of HBase, including the roles of the master server, region server, and the use of ZooKeeper for coordination. The master server manages file metadata and operations, while the region server handles data storage and retrieval, leveraging LSM trees for efficient writes and SS tables for sorted, column-oriented storage. The paragraph also touches on the replication process and the strong consistency provided by HBase due to its reliance on HDFS.

10:02

HBase Replication and Analytics Processing

The speaker explains the replication process in HBase, where writes are first stored in an in-memory LSM tree and then flushed to HFiles, which are SS tables. These HFiles are propagated to HDFS data nodes and replicated to meet the required replication factor, ensuring data durability and availability. The paragraph also discusses the advantages of storing SS tables on HDFS for high read throughput and integration with big data processing engines like MapReduce, Spark, and Tez. The speaker notes that HBase is not suitable for SQL queries or joins due to its non-SQL nature, but is ideal for batch processing and analytics on large datasets, making it a good fit for data lakes where unstructured data is stored for later analysis.

Keywords

HBase

HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable. It is designed to provide real-time read/write access to large amounts of structured data. In the video, HBase is discussed as a wide column NoSQL storage database that uses LSM trees and SS tables for better performance compared to HDFS, and is compared with Cassandra in terms of functionality and use cases.

Wide Column NoSQL Database

A wide column NoSQL database is a type of non-relational database that stores data in tables with a dynamic number of columns. The script mentions that both HBase and Cassandra are wide column NoSQL databases, but they differ significantly in their architecture and use cases, which is a central theme of the video.

LSM Trees

LSM Trees, or Log-Structured Merge-Trees, are a type of data structure used in database systems to optimize write operations. The video explains that HBase uses LSM trees to achieve better random read and write performance with lower latency compared to the high latency writes and reads provided by HDFS.

SS Tables

SS Tables, or Sorted String Tables, are the immutable, sorted on-disk files produced when an LSM tree's in-memory data is flushed. The script discusses how HBase uses SS Tables in conjunction with LSM trees to handle data that has been flushed from memory, allowing for efficient read operations.

Row Key

In the context of HBase, the row key is the unique identifier for each row in a table. The video script explains that the row key can be composed of multiple parts and is crucial for determining the sorting and partitioning of data within the database.
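A small sketch of what such a multi-part row key might look like. Rows sort lexicographically by the raw key, so field order and zero-padding control the physical sort order; the `customer_id#timestamp` layout here is an illustrative design choice, not an HBase API.

```python
def make_row_key(customer_id: str, ts: int) -> str:
    # Zero-pad the timestamp so lexicographic order matches numeric order,
    # and put customer_id first so one customer's rows are stored contiguously.
    return f"{customer_id}#{ts:012d}"

# Lexicographic sort groups each customer's rows, ordered by time within each.
keys = sorted([make_row_key("cust42", 1650000000),
               make_row_key("cust07", 1650000500),
               make_row_key("cust42", 1649999999)])
```

Without the zero-padding, timestamp 99 would sort after timestamp 100 as a string, which is the kind of trap the video's "be pretty clever when developing this row key" warning refers to.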

Column Family

A column family is a group of columns in a wide column store that are often accessed together. The script mentions column families in the context of HBase's data model, where each row has a row key and can have any number of column values grouped into families.

Master Server

The master server in HBase is responsible for managing the metadata of the file system and coordinating operations such as file renaming. The video script describes the role of the master server in HBase's architecture, comparing it to the NameNode in HDFS.

Region Server

A region server in HBase is responsible for serving a particular range of data. The script explains that region servers are usually run on HDFS data node servers and handle the in-memory LSM tree data structure for efficient read and write operations.

Zookeeper

Zookeeper is a coordination service used for maintaining configuration information, naming, providing distributed synchronization, and providing group services. In the video, Zookeeper is mentioned as being used by HBase for maintaining an eventually consistent write-ahead log and for region server heartbeats.

Replication

Replication in the context of HBase refers to the process of copying data to multiple nodes to ensure data availability and fault tolerance. The script discusses how HBase leverages HDFS's replication mechanism to maintain the required replication factor for SS tables.

Data Lake

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. The video script mentions data lakes in the context of using HBase for storing unstructured data and running analytics on it later, contrasting it with the structured approach of data warehouses.

Highlights

The channel reached its one-month anniversary with 260 subscribers.

HBase is a NoSQL storage database modeled after Google's Bigtable.

HBase uses LSM trees and SS tables for better random read and write performance.

HBase is a wide column database with a row key and column families.

Row keys in HBase can be composed of multiple parts for sorting data.

HBase architecture includes a master server, region server, and replication with a Zookeeper instance.

The master server in HBase is similar to the HDFS NameNode, handling file metadata and operations.

HBase is not suitable for time series data with a timestamp as the row key due to potential hotspots.
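One common way to avoid such hotspots, hinted at in the video ("something like a hash of a key"), is to prefix the key with a hash-derived bucket so consecutive timestamps spread across regions. The bucket count and key layout in this sketch are made up for illustration.

```python
import hashlib

NUM_BUCKETS = 4  # illustrative; in practice, roughly the number of regions

def salted_key(ts: int) -> str:
    # Derive a stable bucket from the timestamp itself, so the same
    # timestamp always maps to the same key.
    bucket = int(hashlib.md5(str(ts).encode()).hexdigest(), 16) % NUM_BUCKETS
    # The bucket prefix breaks the monotonic key order, so sequential
    # writes no longer all land in one region.
    return f"{bucket:02d}#{ts:012d}"
```

The trade-off is that a time-range scan now has to issue one scan per bucket instead of a single contiguous scan.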

Region servers in HBase run on HDFS data node servers and handle reads and writes.

Writes in HBase first go to an in-memory LSM tree structure before being flushed to SS tables.

SS tables in HBase are stored in a column-oriented format for high read throughput.

HBase integrates well with MapReduce and dataflow engines like Spark and Tez for analytics.

HBase does not support SQL queries or native joins over structured data.

HBase is suitable for data lakes, allowing transaction processing with the ability to run batch jobs.

Cassandra is better for real-time transaction processing with user-facing applications.

HBase is master-oriented with a focus on read throughput for analytics processing.

HBase provides advantages over HDFS for transaction processing with improved read and write performance.

The speaker plans to discuss key-value stores, such as Riak, Redis, Memcached, and caching in general, in upcoming videos.

Transcripts

[00:00] All right, I'm back again. Welcome, everyone. I should just say yesterday was the one-month anniversary of this channel, so I'm pretty overwhelmed and excited that we already have 260 people subscribed, so let's keep those numbers up. But today we're going to talk about HBase. It's kind of wild, actually, because I was doing research for this topic in order to make a video about it, and there are literally probably ten blog posts which are just blatant plagiarisms of one another and just copy the same bullcrap information that goes into absolutely zero depth. So I actually had to dive in a little bit here, and then right as all that happens, it occurs to me that HBase is modeled after Bigtable, which there's literally a ton of information on. So I'm an idiot, but anyways, let's get into this video.

[00:48] All right, so HBase: what is it? Well, the reason I'm talking about HBase in the order that I'm talking about it is a couple of things. First of all, it allowed me to touch on HDFS in the prior video, but more importantly, HBase is another wide column NoSQL storage database, and the reason I want to talk about this now is that the last database I talked about, Cassandra, was also a wide column NoSQL database. However, there are some pretty serious differences between the two, even though on a surface level they look like they serve similar functionalities. So basically we're going to compare those two. But just to give an overview: even though Hadoop, or the file system, provides pretty high latency writes and reads, since they're all straight from disk, and only allows basically sequential appends and truncates, we'll see how HBase uses LSM trees and SSTables in order to achieve better random read and write performance, in addition to lower latency reads and writes.

[01:48] Okay, so in terms of the data model, HBase is a wide column database, so it's not identical to Cassandra, but it looks pretty damn similar, where each row has a single row key and then basically any column values that they want, and there's this concept of a column family as well, which is kind of like combining two values to create a column. The row key also can, and probably should, be comprised of multiple parts. So as opposed to having those clustering keys in Cassandra, where that allows an internal sort order, instead row keys just have maybe multiple delineations inside of them, and you have to be pretty clever when developing this row key to make sure that your data is sorted the way that you want it.

[02:32] Okay, so just as an image to kind of give you a sense of the architectural overview: we have this master server, a region server, and then the replication that's going on in the background. There's also a ZooKeeper instance, which I'll mention a little bit as well. But keep in mind that if you're ever talking about HBase, it's very much modeled after Google's Bigtable, which basically does the same exact thing as HBase; however, it's built on the Google File System, or now actually the updated version of the Google File System, called Colossus.

[03:04] So what is the master server? Well, if you remember the HDFS NameNode, it's actually very similar to that, but it's its own thing in HBase. It's going to store all the file metadata and all the operations that occur on the metadata, such as renaming files or anything like that, and also the corresponding chunks of where all the files are located. And so this is basically how HBase does its partitioning: it's a range-based partitioning on that original row key, and if a given partition gets too big or has too much load, you can go ahead and split that up. The one thing to note here is that this means HBase is not good for usage patterns, say, with time series data where the timestamp is the row key, because then every single write is going to be going to one partition at a time and it's going to create hotspots. So in that case you might want to use something like a hash of a key.

[03:59] Okay, in terms of reads and writes, the client is going to reach out to that HMaster server in order to figure out the location of the file, and then it's going to lead it to a region server, which I'll touch upon next. And then, presumably, in order to ensure high availability, the master server, like the master server in HDFS, can use that ZooKeeper instance in order to keep an eventually consistent write-ahead log, and that way you can have a backup master server reading from ZooKeeper in order to eventually pull that write-ahead log and stay consistent with the master server.

[04:32] Okay, what is a region server? Well, before you guys kind of get into the wrong mindset here, a region server tends to be run on HDFS DataNode servers. So as opposed to being its own dedicated server, usually it's just a second program that's being run on those DataNodes, in conjunction with whatever program it is that actually makes a computer a DataNode. What the region server does is it occasionally sends heartbeats to ZooKeeper, so it basically says, okay, I'm alive, you can still send writes or reads to me. Additionally, on that actual region server we're holding an in-memory LSM tree style data structure, and that's where writes are going to first be sent. So if you remember from literally my first video on this channel, we send writes to the LSM tree. This is a pretty efficient write structure because it means that they first go to memory. And then reads, once you want them, first go to the LSM tree, and if the LSM tree gets too big it gets flushed to SSTables. And then if whatever key it is that you want to read is not in the LSM tree, you look towards the SSTables and start sorting through those, and there are a bunch of optimizations on that that you can do in order to speed up that read process, like Bloom filters and other types of caching.

[05:47] So SSTables are actually going to be stored in column-oriented format. That means that, as opposed to just having the key and then the entire value of the row, what you have are these sorted tables where it's all the values of the column in turn. That column-oriented storage is something that I've mentioned in my data warehousing video, but basically what it means is that you can achieve really high throughput over an entire table if, say, you just want one column.

[06:17] Okay, so now let's talk about replication, because this is kind of the whole point of being built on HDFS, in addition to the analytical processing that we'll talk about next. So like I said, region nodes generally speaking are going to be putting writes in this MemStore, and then once that MemStore gets too big, they're going to flush them to an HFile, which is just an SSTable. And since the region node is generally speaking on or near an HDFS DataNode, all it's going to do is, once that HFile is propagated to that DataNode, the DataNode is going to use the replication pipeline of HDFS that we talked about in the previous video on this channel in order to go ahead and reach the required replication factor of that SSTable. So even though the region server is effectively going to be handling all writes and reads for a given partition of data, it may technically be communicating with one or many of the replicas that are going to be holding that SSTable. For example, if a given DataNode goes down, the region server is probably going to have to be talking to a different DataNode. Oh, and furthermore, we should just keep in mind the fact that technically that is strong consistency, because everything has to succeed in HDFS for the replication to be considered successful and the write to be considered successful in the first place.

[07:45] But okay, in terms of the analytics perspective, what good does it do us storing all of these SSTables on HDFS, as opposed to just having them on any normal database? Well, for starters, the fact that they are column-oriented storage means that we can have really high read throughput when trying to read over all the values of one partition and a column. And additionally, being built on HDFS means really good integration with MapReduce and dataflow engines like Spark and Tez. One thing to note is that since HBase is not a SQL database, it doesn't have structured data; you can't actually perform native joins. What you have to do instead is use something like a batch process to perform joins, and I talked about those in the batch processing video. Basically you are trying to find data associations, perhaps even using two sets of mappers and then combining those together into one set of reducers.

[08:39] Okay, so in conclusion, even though HBase looks really similar to Cassandra on a surface level, they're very, very different in terms of what you want to be doing with them. Cassandra is really nice for real-time transaction processing, so that means basically anything user-facing where you just want a really quick write so you can get them a really quick success message, and then have that in the database with that eventual consistency between all of these Cassandra instances. That's kind of this leaderless replication strategy; it's very much based off Dynamo. HBase, on the other hand, is a master-oriented database in terms of the replication strategy, and so even though they both use LSM trees, HBase isn't going to be able to handle those writes as fast. However, in terms of reads and analytics processing, you can achieve very high read throughput over a column, which is super useful if you want to run a huge batch job or a stream job.

[09:34] This is great for something called being a data lake, which is basically saying: I'm going to dump a ton of unstructured data, in kind of like a database format, into one place, and then eventually run analytics later. It's just that the advantage of doing this with HBase over HDFS itself is that you can get nice read and write performance for both random reads and writes, in the event that you do have to do some general transaction processing. So it's basically a data lake that allows you to do some transaction processing. But keep in mind that data lakes also are not exactly the same as data warehouses, which are specifically for structured data with a ton of data associations. If you recall, data warehouses use that whole star and snowflake schema, where you have a huge fact table, which is very structured, and then a ton of dimension and sub-dimension tables; the fact table is going to reference those dimension tables. But the point is, if you just want to be able to run SQL queries here, HBase is not the thing, but if you want to be able to dump in a ton of data and then eventually run a ton of expensive batch jobs on it, then HBase is really nice for you.

[10:43] So yeah, all right guys, I think next I'm going to start moving into key-value stores so I can talk about things like Riak, Redis, Memcached, and just caching in general. All right, have a good…

