98 Percent Cloud Cost Saved By Writing Our Own Database

ThePrimeTime
20 Apr 2024 · 21:45

Summary

TL;DR: The video discusses the decision-making process and the technical challenges behind building a custom database for a cloud platform that tracks tens of thousands of people and vehicles simultaneously. The company, jokingly likened to the 'NSA' despite its European roots, faced escalating Amazon Aurora costs and on-premise database clusters that were easily overwhelmed. To address this, they developed a bespoke in-process storage engine that cut their cloud costs by 98%. The engine writes a minimal delta-based binary format (referred to as 'aen') that prioritizes write performance and low storage footprint, pairing each entry with a unique 4-byte identifier and a separate index file for fast retrieval. Despite skepticism about the engineering cost and the lack of a version field in the binary encoding, the company claims the approach provides exactly the functionality they need without losing any features. The video also covers their use of AWS EBS and Glacier for storage, and closes with a plea to always include version fields in binary protocols for future adaptability.

Takeaways

  • 🚀 **Innovative Database Development**: The company saved 98% on cloud costs by developing their own database, tailored to their specific needs for tracking and storing location data.
  • 🔍 **High Performance Requirements**: They needed a database that could handle up to 30,000 location updates per second per node, with the ability to buffer these updates.
  • 📊 **Data Compression**: The custom database uses a minimal delta-based binary format, which significantly reduces the storage space needed, allowing for about 30 million location updates per gigabyte (a minimal sketch of this write scheme follows this list).
  • 💾 **Storage Efficiency**: They replaced a costly Aurora instance with a much cheaper elastic block storage volume, which, combined with their custom storage engine, led to the massive cost reduction.
  • ⏱️ **Speed Improvements**: Queries and data retrieval have become much faster, with one example going from 2 seconds to 13 milliseconds for a specific operation.
  • 🔢 **Binary Data Format**: The database stores data in a binary format, which is more space-efficient but also requires careful design to accommodate future changes.
  • 📈 **Scalability**: The system is designed to allow for unlimited parallelism, with multiple nodes able to write data simultaneously without an upper limit.
  • 🌐 **Global Use Case**: The database serves a cloud platform that tracks a large number of people and vehicles, with use cases ranging from car rentals to precise location tracking for various industries.
  • 🔒 **Loss Tolerance over Strict Durability**: The company accepts potentially losing about one second of buffered updates on a server failure, a deliberate trade-off of durability for write throughput.
  • 📝 **Lack of Versioning**: There is a noted absence of a version field in the binary format, which could be crucial for future compatibility and upgrades.
  • 📘 **Archiving Strategy**: Data older than 10 days is moved to AWS Glacier, which further reduces costs and is aligned with customer query habits.
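
To make the delta scheme above concrete: the engine persists a full snapshot of an object every 200 writes and only the changed fields in between, much like keyframes and delta frames in video encoding. A minimal Go sketch of that write rule, with hypothetical `encodeFull`/`encodeDelta` helpers standing in for the real binary layout:

```go
package storage

// ObjectState is a simplified stand-in for the fields the video lists:
// an ID plus location data and arbitrary key-value attributes.
type ObjectState struct {
	ID       string
	Lat, Lng float64
}

// Writer persists a full snapshot every keyframeInterval writes and only
// deltas in between -- the "full state every 200 writes" rule.
type Writer struct {
	keyframeInterval int                    // 200, per the video
	writeCount       map[string]int         // writes seen per object ID
	last             map[string]ObjectState // last persisted state per object
	sink             func([]byte)           // wherever encoded entries go
}

func (w *Writer) Write(s ObjectState) {
	if w.writeCount[s.ID]%w.keyframeInterval == 0 {
		w.sink(encodeFull(s)) // self-contained snapshot of every field
	} else {
		w.sink(encodeDelta(w.last[s.ID], s)) // only the fields that changed
	}
	w.last[s.ID] = s
	w.writeCount[s.ID]++
}

// Hypothetical helpers: the real format packs flags, ID, type index,
// timestamp, latitude and longitude into ~34 bytes per delta entry.
func encodeFull(s ObjectState) []byte          { return nil /* ... */ }
func encodeDelta(prev, cur ObjectState) []byte { return nil /* ... */ }
```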

Q & A

  • What is the main reason the company decided to build their own database?

    -The company decided to build their own database to save on cloud costs, which were upwards of $10,000 a month, and to handle the high volume of location updates efficiently.

  • What are the key performance requirements for the new database system?

    -The key performance requirements include handling up to 30,000 location updates per second per node, having unlimited parallelism for simultaneous writes across multiple nodes, and maintaining a small disk footprint due to the large volume of data.

  • How does the company's database system differ from a general-purpose database like PostgreSQL?

    -The company's database system is a purpose-built, in-process storage engine with a limited set of functionality that is bespoke to their specific needs, as opposed to a general-purpose database like PostgreSQL which offers an expressive query language and broader functionality.

  • What is the significance of the binary format used in the company's database system?

    -The binary format is significant because it allows for a minimal delta-based binary storage, which is highly space-efficient. This format enables the storage of about 30 million location updates in a gigabyte of space.
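
    (As a quick sanity check on that figure: at 34 bytes per delta entry, a gigabyte holds roughly 1,000,000,000 / 34 ≈ 29 million of them; with the larger full-state snapshots written every 200 entries, "about 30 million per gigabyte" is the right order of magnitude.)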

  • How does the company manage data consistency and durability concerns?

    -The company acknowledges that they are okay with losing some data due to buffering and server failures. They maintain low consistency guarantees and are comfortable with the potential loss of one second's worth of updates in the current buffer.

  • What is RTK and how does it relate to GPS accuracy?

    -RTK stands for Real-Time Kinematics, a technique used to enhance the accuracy of position data from GPS signals. It can improve the accuracy to as low as 10 centimeters, which is significantly better than the traditional 6-meter accuracy.

  • Why did the company choose to move older data to AWS Glacier?

    -The company moved older data to AWS Glacier to reduce costs. Since their customers rarely query entries older than 10 days, archiving data that exceeds 30 gigabytes to Glacier is a strategic decision to optimize their EBS costs.

  • What is the role of the separate index file in the company's database system?

    -The separate index file translates the static string ID for each entry and its type to a unique 4-byte identifier, which allows for extremely fast retrieval of the history for a specific object.
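
A minimal sketch of how such an index file might work, with hypothetical names: identifiers are handed out sequentially, and each string-to-uint32 mapping is appended to a side file so it can be rebuilt on startup.

```go
package storage

import (
	"encoding/binary"
	"fmt"
	"os"
)

// Index maps each (type, string ID) pair to a compact 4-byte identifier,
// so log entries can carry a fixed-size uint32 instead of a variable-length
// string. The mapping itself is appended to a separate index file.
type Index struct {
	next uint32
	ids  map[string]uint32
	file *os.File
}

// Lookup returns the existing identifier for stringID, or assigns the next
// free one and persists the new mapping.
func (ix *Index) Lookup(stringID string, typ byte) (uint32, error) {
	key := fmt.Sprintf("%d:%s", typ, stringID)
	if id, ok := ix.ids[key]; ok {
		return id, nil
	}
	id := ix.next
	ix.next++
	ix.ids[key] = id

	// Record "uint32 id | type | string length | string id" in the side file.
	buf := binary.LittleEndian.AppendUint32(nil, id)
	buf = append(buf, typ, byte(len(stringID)))
	buf = append(buf, stringID...)
	_, err := ix.file.Write(buf)
	return id, err
}
```

Because every entry then stores the identifier at a fixed byte offset, fetching one object's history is a scan that compares four bytes per entry and skips everything else without parsing.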

  • How does the company ensure high uptime and reliability for their database system?

    -They use a provisioned IOPS SSD (io2) EBS volume with 3,000 IOPS and batch updates to one write per second per node. EBS's built-in automated backups and recovery give them uptime guarantees comparable to what Aurora offered.

  • What is the primary storage format used by the company's database system?

    -The primary storage format is a minimal delta-based binary format, which includes flags, ID, type index, timestamp, and latitude and longitude data, with full state storage occurring every 200 writes.
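
A hedged sketch of the read side, in Go. The bit assignments, field widths and offsets here are assumptions (the video only names the fields, notes an entry-length prefix, and says the two flag bytes are yes/no switches such as "has altitude"):

```go
package storage

import (
	"encoding/binary"
	"math"
)

// Illustrative flag bits; the real assignments aren't published.
const (
	flagHasAltitude uint16 = 1 << 0
	flagHasData     uint16 = 1 << 1
)

// Entry mirrors the described field order: flags, 4-byte ID, type index,
// timestamp, then latitude/longitude. Widths and offsets are guesses, and
// the entry-length prefix is omitted here.
type Entry struct {
	Flags     uint16
	ID        uint32
	Type      byte
	Timestamp uint32
	Lat, Lng  float64
	Altitude  float64 // meaningful only when flagHasAltitude is set
}

// decodeEntry parses one entry, reading optional fields only when the
// corresponding flag bit says they are present.
func decodeEntry(buf []byte) Entry {
	e := Entry{
		Flags:     binary.LittleEndian.Uint16(buf[0:2]),
		ID:        binary.LittleEndian.Uint32(buf[2:6]),
		Type:      buf[6],
		Timestamp: binary.LittleEndian.Uint32(buf[7:11]),
		Lat:       math.Float64frombits(binary.LittleEndian.Uint64(buf[11:19])),
		Lng:       math.Float64frombits(binary.LittleEndian.Uint64(buf[19:27])),
	}
	if e.Flags&flagHasAltitude != 0 {
		e.Altitude = math.Float64frombits(binary.LittleEndian.Uint64(buf[27:35]))
	}
	return e
}
```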

  • What is the impact of the custom database system on query performance?

    -The custom database system has significantly improved query performance. For example, recreating a particular point in time in a realm's history went from around 2 seconds to 13 milliseconds.

Outlines

00:00

🤔 The Risks and Rationale of Building a Custom Database

The paragraph discusses the decision to build a custom database to save on cloud costs, despite it being a common first rule of programming to avoid such endeavors. The speaker expresses skepticism about writing one's own database, referencing Jonathan Blow's opinions on the matter and the complexities involved in database creation, such as ACID properties, sharding, and fault recovery. They also mention the existence of well-established databases that are free to use. The context of the decision is related to a cloud platform tracking numerous people and vehicles, with the need to handle a vast number of location updates efficiently.

05:01

💰 Cost Analysis and the Search for a Solution

This section delves into the cost implications of using AWS Aurora for the database, which amounts to over $10,000 per month. The company faces scalability issues as their customer base and data volume grow. The speaker contemplates whether a managed database service or a self-hosted database could have been a more cost-effective solution. They also discuss the requirements for their database system, emphasizing the need for high performance, the ability to handle a large number of updates per second, and low storage footprint.

10:02

🔍 Custom Storage Engine Design and Its Impact

The speaker details the design of a custom in-process storage engine that is part of their server's executable. This engine uses a minimal delta-based binary format, which results in a highly space-efficient storage system. The system allows for high write performance and is designed to handle a large number of location updates. The trade-off is a relaxed consistency model, where the system can tolerate losing a small amount of data. The outcome is a significant reduction in cloud costs, moving from a high-cost Aurora instance to a much cheaper elastic block storage volume.

15:02

🛠️ Lessons Learned from Building a Bespoke Storage System

The paragraph highlights the importance of having a version field in binary encoding, which the speaker considers a critical oversight in the described storage system. They discuss the potential need for future changes to the data format and the importance of being able to accommodate these changes. The speaker also mentions the use of AWS Glacier for long-term storage of older data to further reduce costs. They emphasize the speed improvements in data retrieval and the overall performance benefits of their custom solution.

20:04

😂 A Call for a Version Field and Open Sourcing the Project

In a lighter tone, the speaker humorously suggests that the creators of the product should add a version field to their database system. They express admiration for the article and the work done by the creators. The speaker also playfully addresses the idea of being 'bribed' by the NSA for promoting their product. They end on a note that encourages the use of version fields in binary protocols, drawing from their own experience in building TCP packets.


Keywords

💡Cloud Cost

Cloud Cost refers to the expenses incurred from using cloud services, such as Amazon Web Services (AWS), for data storage and computation. In the video, the company saved 98% in cloud costs by developing their own database instead of relying on a managed service like AWS Aurora, which was costing them over $10,000 a month.

💡Database

A database is an organized collection of data, typically stored and accessed electronically. The video discusses the decision to write a custom database to handle specific data needs more efficiently, which is a departure from the common practice of using existing database management systems like PostgreSQL.

💡Geospatial Data

Geospatial data refers to information that is related to the geographic location and features of the Earth, such as coordinates and maps. The video mentions the use of geospatial data for tracking vehicles and people, which is crucial for the company's operations.

💡Amazon Aurora

Amazon Aurora is a fully managed relational database service provided by AWS. It is known for its high availability and automatic failover capabilities. The video discusses the high costs associated with using Aurora and the decision to migrate to a custom database solution.

💡PostgreSQL

PostgreSQL, often simply Postgres, is a powerful, open-source object-relational database system. It is mentioned in the video as an alternative to building a custom database, highlighting the common practice of using established systems instead of creating new ones from scratch.

💡Real-time Kinematics (RTK)

RTK is a technique used in GPS to achieve more precise location data down to the centimeter level. The video discusses RTK in the context of the high precision required for tracking vehicles and workers, emphasizing the need for accurate geospatial data.

💡Binary Format

Binary format refers to the way data is stored in a computer using only two symbols, typically 0 and 1. The video describes the custom database's use of a minimal delta-based binary format to store location updates efficiently, which significantly reduces storage space requirements.

💡In-memory Architecture

An in-memory architecture is a type of database management system where data is stored and managed in the main memory (RAM) rather than on disk. The video highlights the use of an in-memory architecture for fast query processing and real-time data streaming.
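
As a toy illustration of this split (names hypothetical): live updates land in a RAM map that real-time queries read, and the append-only disk log is consulted only for history requests or node warm-up.

```go
package storage

import "sync"

// Position is the in-memory view of an object's latest state.
type Position struct {
	Lat, Lng float64
	Unix     int64
}

// LiveStore answers real-time queries from RAM; the append-only disk log
// is read only when a client asks for history or a new node warms up.
type LiveStore struct {
	mu      sync.RWMutex
	current map[uint32]Position // latest state per 4-byte object ID
}

// Apply records an incoming update in memory; disk I/O happens elsewhere,
// on the batched write path.
func (s *LiveStore) Apply(id uint32, p Position) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.current[id] = p
}

// Latest serves a real-time query without touching the disk.
func (s *LiveStore) Latest(id uint32) (Position, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	p, ok := s.current[id]
	return p, ok
}
```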

💡Elastic Block Store (EBS)

EBS is a service provided by AWS for block-level storage that can be attached to Amazon EC2 instances. The video discusses the use of EBS with provisioned IOPS SSD io2 for the custom database, which offers high input/output rates and improved performance.

💡Data Compression

Data compression is the process of reducing the size of data to save storage space and improve efficiency. The video explains how the custom database uses data compression techniques to store only deltas between updates, rather than full states, leading to significant space savings.

💡Version Field

A version field in binary encoding is used to manage changes in data format over time. The video script suggests that the custom database lacks a version field, which could be a potential oversight for future compatibility and updates to the data format.

Highlights

The company saved 98% in cloud costs by building their own database instead of using a managed service like AWS Aurora.

Building your own database is generally not recommended, but this case shows how it can make sense in certain scenarios.

The company needed to handle 30,000 location updates per second per node, which is beyond the capabilities of many existing databases.

They created a custom in-process storage engine that writes a minimal delta-based binary format, which, combined with moving from Aurora to cheap block storage, produced the 98% reduction in cloud costs.

The custom database is part of the same executable as their core server, eliminating the need for a separate database server process.

They store the full state of an object every 200 writes, and in between store only the deltas, making the storage very efficient.

The database uses a separate index file to quickly retrieve the history for a specific object using a unique 4-byte identifier.

The custom database allows them to store location data in a highly compressed format, taking up minimal disk space.

They are okay with losing up to 1 second of location updates in the buffer in the rare event of a server failure.

The custom database provides faster query performance compared to their previous Aurora setup, with queries going from 2 seconds to 13 milliseconds.

They moved older data to AWS Glacier to further reduce costs, while still maintaining high uptime guarantees.

Writing to the custom database via the local file system is faster and has lower overhead compared to a managed database service.

The custom database is a binary file feed with limited functionality, but provides the exact features they need for their use case.

While creating a custom database is challenging and not recommended for most, this case shows the huge benefits it can provide for specific use cases.

The database's binary format is called 'aen'; the video closes by noting that Hivekit is a commercial product with public pricing, and wondering aloud whether the storage engine was open-sourced.

The article provides a fascinating look at the tradeoffs and considerations involved in building a custom database for a specific use case.

Transcripts

[00:00] For those that don't know, this article is clearly designed for me: "How we saved 98% in cloud cost by writing our own database." I don't see how that's ever a good idea. Like, yo dog, why not just host Postgres? How is building your own database the move? We'll find out, here we go. "What is the first rule of programming? Maybe something like 'do not repeat yourself,' or 'if it works, don't touch it,' or how about: do not write your own database?" This actually is a great first rule of programming. It should be: one, don't write your own language, which is why we have all the languages, and two should be: don't write your own database. Well, too late. I feel like databases are the only things growing faster than JavaScript frameworks right now, is that fair to say? So, Jonathan Blow disagrees. The thing is, when you write your own language, usually you start writing it after decades of experience, and I think at that point it's pretty okay to say, "hey, I think I have an objectively better way to do something." I actually do agree with that statement. Jonathan Blow is a walking LTE, the L meaning lovely. Perfect. He has a lot of good takes; I think he has a lot of bad takes, especially his ones about open source. I'm like half in, half out on those ones, I don't really know where I land. Anyways, that's a good one.

[01:21] "Databases are a nightmare to write. From atomicity, consistency, isolation and durability requirements, to sharding, to fault recovery, to administration, everything is hard beyond belief." If you haven't seen it, I have a video on my channel about TigerBeetle. The presentation for it starts off a little slow, but man, it gets so good, and then it turns into a video game they wrote in Zig to represent what's happening in the TigerBeetle database. It is wild. Writing a good database, and how they go about testing and everything, is just incredible. "Fortunately, there are amazing databases out there that have been polished over decades and don't cost a cent. So why on Earth would we be foolish enough to write one from scratch?" Well, here's the thing: this is actually how every bad meeting started for me.

[02:10] "All right, we are running a cloud platform that tracks tens of thousands of people and vehicles simultaneously." FBI? NSA? What's the name of your company? Oh, it just happens to be NSA, okay, cool. What does NSA stand for? "Every location update is stored and can be retrieved via a history API. The amount of simultaneously connected vehicles and the frequency of their location updates varies widely over time, but having around 13,000..." Oh, they're from the EU, okay. Hey, this isn't America, people. That's not us, that's you guys, that's on your side of the pond, that's not my problem. This must be Macron's creation. Okay: "...around 13,000 simultaneous connections, each sending around one update a second." Wow, that's a decent amount of updates coming flying in over just persistent connections, right?

[03:07] "Our customers use this data in very different ways. Some use cases are very coarse, e.g. when a car rental company wants to show an outline of the route a customer took that day. This sort of requirement can be handled with 30 to 100 location points for a one-hour trip, and allows us to heavily aggregate and compress the location data before storing it." Oh yeah, that makes sense. Okay, I'm starting to understand what they're doing, what they're tracking, and kind of what they're reporting. "But there are many other use cases where that's not an option: delivery companies that want to be able to replay the exact seconds leading up to an accident, or mines with very precise on-site location trackers that want to generate reports of which worker stepped into which restricted zone, by as little as half a meter." What's the accuracy of GPS? I thought it was like 6 meters, has that changed? Three meters these days? Okay, when I was doing stuff in college and post-college it was six. Well, I mean, 6 meters is super accurate. RTK can get down to 10 centimeters? What the hell is RTK? I don't know RTK, what's RTK? "RTK makes it even better." What is that? I've never heard of RTK. "You wouldn't use GPS for this," okay, okay. Yep, this is beyond my abilities; I haven't been in the hardware bits for over a decade. Real-time kinematics, oh, interesting.

[04:21] "So, given that we don't know upfront what level of granularity each customer will need, we store every single location update." Okay, this makes sense. In other words, you have like a table, then you have aggregate tables or some sort of post-processing tables. I wonder why that doesn't work. "At 13,000 vehicles, that's 3.5 billion updates per month, and that will only grow from here. So far we've been using AWS Aurora with the PostGIS extension for geospatial data storage, but Aurora costs us upwards of $10,000 a month already, just for the database alone, and that will only become more expensive in the future. But it's not just about Aurora pricing. While Aurora holds up quite well under load, many of our customers are using our on-premise version, and there they have to run their own database clusters, which are easily overwhelmed by this volume of updates." Okay, okay, this makes sense. I'm not going to lie, these costs are pretty tame as far as a tech company goes. I mean, they actually have customers here, literal customers, right? "So we burnt about 28,000 on our Aurora migration this week." Yeah, I figure a lot of people spend a lot more.

[05:33] This is such an interesting choice to make. I mean, I guess if you're trying to future-proof yourself, knowing it's going to go from $10,000 to, say, $100,000 over the course of the next two years, maybe it makes more sense to start preparing for this stuff. But I'm just curious whether not having a managed database, just hosting your own database, would have been a better choice, right? Maybe you could have reduced it without so much engineering talent and time and all the things that go with it, you know what I mean? "Unfortunately, there is no such thing. If there is, and we somehow overlooked it in our research, please let me know. Many databases, from [ __ ] and H2 to Redis, support..." Redis? Boo. Can we boo? Boo, Redis, boo. So, for those that don't know, Redis is not Redis the original open source, as we learned like two days ago. Redis is actually a company that usurped Redis the open source, and, what appears to be, kind of pressured the guy to sell the IP. The guy didn't really want to be a maintainer, he wanted to go off and write hardware or something, so he was like, all right, whatever, and he left. And then Redis the company, which was called something else, like Redis Online or something like that, became Redis; they went from Redis Labs to Redis and then changed the license and all that. Yeah, Garantia, or however it was. Anyways, there's a good video on that. "...they are exclusively extensions that sit on top of existing DBs. PostGIS, built on top of PostgreSQL, is probably the most famous one. There are others, like GeoMesa, that offer great geospatial querying capabilities on top of other storage engines. Unfortunately, that's not what we need. Here's what our requirement profile looks like. Extremely high write performance: we want to be able to handle up to 30,000 location updates per second per node. They can be buffered before writing, leading to a much lower number of IOPS."

[07:27] So, the thing is, I don't know much about geospatial databases. I know they exist, and obviously there's some level of already-solved nature to this in the industry, and I don't want to just [ __ ] on something being recreated, because obviously TigerBeetle got created and it turned out to be incredible for TigerBeetle, right? You wouldn't want to use Cassandra, you'd want to use Scylla, right? Scylla's the way to go for any of those things; Scylla is just Cassandra but written in a fast language, not the JVM. But 30,000 updates per second doesn't sound wild, right? For larger companies, I mean. Obviously for smaller companies, for anyone with 20 or fewer engineers, this would be a much harder thing to solve, because now you have to actually have a dedicated staff, maybe two dedicated staff, trying to solve these things, which could be a real big hit to your bottom line. But it doesn't seem like this isn't already out there. Well, you got on-call, you got stuff, you got things you got to think about, right? If you try to go from a managed service to you managing a service, there's uptime requirements, setting up all the infrastructure, and all that. It's per node. Yeah, per node, okay, so per node, per database, hmm, I'm still not sure. I mean, hey, I could be wrong. Remember, TigerBeetle came from a necessity that I probably would not have understood, and I probably would have been like, "hey, that's kind of silly, to write your own database," and then TigerBeetle is absolutely amazing. "30,000 privacy violations a second." It doesn't look like privacy violations. I actually don't think this is privacy violations at all, because it's talking about tracking workers in sensitive areas and rental-car stuff. When you do rental cars, of course you're getting tracked on that. That makes perfect sense: the rental company wants to be able to ensure that you did what you agreed to, right? You made a contractual agreement that you would drive their car like this, and they would get this data out of it. "Keep saying per node." Per node, I know, I see per node. Okay, so per node, maybe that makes more sense: 30,000 write operations per second. "Unlimited parallelism: multiple nodes need to be able to write data simultaneously, with no upper limit. Small size on disk: given the volume of data, we need to make sure that it takes as little space on disk as possible."

[09:44] To be fair, when you have your own data format, and this is all you need to store, and you know that it isn't changing much, it technically is most efficient to write your own bespoke storage for this specific operation. But man, you'd have to really convince me that this is a good idea, 'cause, I mean, you really just hired five engineers to do this, right? Yeah, famous last words, I know. I'm just saying, this is crazy. This is very interesting. "Moderate performance for reads from disk: our server is built around an in-memory architecture. Queries and filters for real-time streams run against data in memory and, as a result, are very fast. Reads from disk only happen when a new server comes online, when a client uses the history API, or, soon, when a user rewinds time on our digital twin interface. These disk reads need to be fast enough for good user experiences, but they are comparatively infrequent and low in volume." Okay, so optimizing for writes. "Low consistency guarantees: we are okay with losing some data. We buffer about one second's worth of updates." Twin, our digital twin interface; okay, stop making fun of it, I said "twine," okay, whatever. "In the rare instance where a server goes down and another takes over, we are okay with losing one second of location updates in the current buffer."
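
(That buffering trade-off is simple to picture in code. A minimal Go sketch, with illustrative names: encoded updates pile up in memory, and a ticker flushes them as one batched write per second, so a crash loses at most the current buffer.)

```go
package storage

import (
	"os"
	"time"
)

// flushLoop batches encoded entries and issues one write per second --
// so a dying process loses, at worst, the second of updates still in buf.
func flushLoop(entries <-chan []byte, log *os.File) {
	var buf []byte
	tick := time.NewTicker(time.Second)
	defer tick.Stop()
	for {
		select {
		case e := <-entries:
			buf = append(buf, e...) // accumulate in memory, no I/O yet
		case <-tick.C:
			if len(buf) == 0 {
				continue
			}
			log.Write(buf) // one batched write per second per node
			log.Sync()     // durable up to the last tick
			buf = buf[:0]
		}
	}
}
```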

[10:57] Okay, they even have fault tolerance. I am curious why the other solutions didn't work out for this. "What sort of data do we need to store? The main type of entity that we need to persist is an object: basically any vehicle, person, sensor or machine. Objects have an ID, label, location and arbitrary key-value data, e.g. for fuel levels or current rider ID. Locations consist of longitude, latitude, accuracy, speed, heading, altitude and altitude accuracy, though each update can change only a subset of these fields. In addition, we also need to store areas, tasks (something an object has to do) and instructions: tiny bits of spatial logic the Hivekit server executes based on the incoming data." Altitude, yeah, like how high up they are.

[11:40] "What we built: we've created a purpose-built, in-process storage engine that's part of the same executable as our core server. It writes a minimal, delta-based binary format. A single entry looks like this." Okay, entry length, nice. Okay, so this is starting to look like a TCP protocol. It almost looks like something you could also just UDP over to another server. Kind of interesting, because given the fact that they don't need consistency, they could technically UDP it out to many servers to be stored, right? It's interesting. Flags, ID, type index, timestamp, okay, latitude, longitude, byte-length data, byte-length label, okay, interesting. Sounds like overkill? I mean, I'm sure we're missing many pieces of information here that make this make sense, but let's see. They might lose data; they said they're okay with losing data, which I think is probably fine, right? "UDP reliability doesn't work great in lossy fabrics." Yeah, that's fair. "Each block represents a byte. The two bytes labeled 'flags' are a list of yes/no switches that specify 'has altitude,' 'has longitude,' 'has data,' telling our parser what to look for in the remaining bytes of the entry. We store the full state of an object every 200 writes; between those, we only store deltas. That means that a single location update, complete with time and ID, latitude and longitude, takes only 34 bytes. This means we can cram about 30 million location updates into a gigabyte of space." Okay, very interesting.

[13:14] I wonder how they decide when to do, like, I-frames, right? If you don't know what an I-frame is: an I-frame is the frame of video coming down that has the full picture, and then you do P-frames, which are the differentials, and there are also B-frames, which are like backward-forward differentials, but we're not going to talk about that. So, how often do you do diffs versus how often do you do full data-point storage? Pretty interesting stuff. It's kind of like they've created their own video encoding on top of spatial data, right? Every 200... oh, every 200. Oh yeah, you know, when you say it out loud like that, it makes perfect sense. Pre-read, pre-read. "We also maintain a separate index file that translates the static string ID for each entry, and its type, to a unique 4-byte identifier. Since we know that this fixed-size identifier is always at byte index 6 through 9 of each entry, retrieving the history for a specific object is extremely fast." Interesting; it's a B-tree probably, somewhere.

[14:12] "The result: a 98% reduction in cloud cost, and faster everything. The storage engine is part of our server binary, so the cost of running it hasn't changed. What has changed, though, is that we replaced our $10,000-a-month Aurora instance with a $200-a-month Elastic Block Store volume. We are using provisioned IOPS SSD (io2) with 3,000 IOPS, and are batching updates to one write per second per node and realm." I'm sure a lot of people are thinking something very similar to me, which is: the engineering cost has to be significantly more than $10,000 a month for this to actually be written, tested, validated, all that kind of stuff.

[14:55] Improvements: you now have binary storage, which means you need versioning. One thing I can see right away that they goofed up: if you look at this, the header does not have a version field. One of the biggest mistakes people make when they do novice binary encoding is not considering a version field. This is probably the single most important thing to do, and the thing is, one byte is probably enough for your version field, but two bytes if you want to be super safe. Like, real talk, how are they gonna know, right? How are they gonna know when they need to change the format? So for me, the missing version is a dead giveaway that this is maybe a bit more novice of a binary encoding attempt. "Each block represents a byte. The two bytes labeled 'flags' are a list of yes/no switches." Okay, so there you go: has altitude, has longitude, has data. So again, this is why you would desperately want a version field right here, because imagine if you needed more than 16 flags: all of a sudden you might find yourself needing to change your header format. Very, very important to do.

play16:08

right as a result so cool they've saved

play16:10

some money I'm very skeptical I mean

play16:12

obviously the layman in me says you save

play16:15

$10,000 a month but you might have cost

play16:17

yourself $50,000 a month of engineering

play16:20

effort which theoretically would get

play16:22

paid back but I think the thing that

play16:24

they're that they alluded to earlier

play16:25

makes this make a lot more sense which

play16:27

was they uh on on premise part there you

play16:30

go on premise version so they have an on

play16:33

premise version so maybe my guess is

play16:35

that this is what it's attempting to

play16:37

solve also is they have the on-prem kind

play16:40

of like thing that's kind of coming down

play16:42

on them that's actually causing a lot of

play16:44

difficulty why we added a a version

play16:46

field to our DB I really hope they do

play16:48

add a version field because honestly it

play16:50

is dude the the boy just bet that he

play16:52

could get it right first try and trust

play16:54

me every single binary protocol I've

play16:56

ever written in fact if you go to uh Vim

play16:58

with me Vim with me uh and we go in here

play17:01

and we look at what what do we got do we

play17:03

got do I have any word anything that's

play17:05

called encoding TCP

play17:08

uh the first thing I did when building

play17:11

our own TCP packets was put a version in

play17:13

it version is the first part of our

play17:15

encoding right this is Step requirement

play17:20

Numero Uno right because you just have

play17:23

to whenever you build anything that

play17:25

involves just raw dog and TCP or any of

play17:28

these type of stuff oh man you got to be

play17:29

ready for you to screw up you screwed up

play17:31

that's all there is to it you didn't

play17:33

foresee something it has to be step one

play17:36

uh EBS has an automated backups and

play17:38

Recovery built in in high high uptime

play17:40

guarantees so we don't feel that we've

play17:41

missed out on any reliability guarantees

play17:43

that Aurora offered we currently produce

play17:45

about 100 gigabytes of data per month

play17:47

but since our customers rarely uh query

play17:49

uh entries older than 10 days we've

play17:51

started moving everything above 30

play17:52

gigabytes to uh a AWS Glacier uh by the

play17:56

way Glacier I thought was one of the

play17:58

coolest

play17:59

characters and Killer Instinct for those

play18:00

that are just wondering thus reducing

play18:02

our EBS costs but it's not just costs
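
(Shipping cold segments off EBS doesn't need much machinery. A hedged sketch with the AWS SDK for Go v2, bucket and key names made up, that uploads a closed log segment straight into the GLACIER storage class.)

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
	"github.com/aws/aws-sdk-go-v2/service/s3/types"
)

// archiveSegment ships one closed log segment to S3 with the GLACIER
// storage class; bucket and key naming here are illustrative.
func archiveSegment(ctx context.Context, path string) error {
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return err
	}
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	client := s3.NewFromConfig(cfg)
	_, err = client.PutObject(ctx, &s3.PutObjectInput{
		Bucket:       aws.String("example-location-archive"),
		Key:          aws.String("segments/" + path),
		Body:         f,
		StorageClass: types.StorageClassGlacier, // cold, cheap, rarely read
	})
	return err
}

func main() {
	if err := archiveSegment(context.Background(), "realm-2024-03.aen"); err != nil {
		log.Fatal(err)
	}
}
```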

[18:04] "But it's not just costs. Writing to a local EBS volume via the file system is a lot quicker and has lower overhead than writing to a managed database. Queries have gotten a lot faster too. It's hard to qualify or quantify, since the queries aren't exactly analogous, but, for instance, recreating a particular point in time in a realm's history went from around 2 seconds to 13 milliseconds." Super cool. Again, I feel like I've said this more than once: creating your own storage for your specific needs, if you're good at it, of course, assuming you have no skill issues, will always be the single best way to store your data, 'cause it's bespoke to you. But it is also absolutely the hardest, most challenging, probably-you-shouldn't-do-it move, just throwing that out there, okay? Just saying, you probably shouldn't do that. But this is very, very impressive, 'cause this is several orders of magnitude. It's a nightmare of skill issues. It is a nightmare to do, and you should effectively never do it, unless you're TigerBeetle or, apparently, these guys. "Of course, that's an unfair comparison. After all, Postgres is a general-purpose database with an expressive query language, and what we've built is just a cursor streaming a binary file feed with a very limited set of functionality. But then again, it's the exact functionality we need, and we didn't lose any features." "If they have started archiving everything after 30 GB, then I probably would have started by keeping everything in RAM and buffering across 3x machines." Yeah, that's kind of interesting, and then you could do that whole UDP talk cycle, right? 'Cause once you hit the backbone, you're probably going to get very low packet loss, and you could just have some sort of crazy round-robin so everything stays up, have a node that can just kind of handle it at all times, and then hit it with the cold storage, the Glacier, afterwards. "We do not have skill issues," says the writer. Very, very interesting. Yeah, that feels like a little overkill, but still, it's super cool.

[19:55] Let's see: "you can learn more about Hivekit's API and features." Oh, cool, okay, so did they open-source this as well? "Location infrastructure for the internet. Hivekit provides..." Okay, when you say it this way, it does make you feel like this is actually the NSA again. Okay, I know you tried to trick me, NSA, with your European units and writing with periods instead of commas, but you're not faking me out this time. I know what's happening here, you're trying to bamboozle me, absolutely. "Moral of the story: just use ETS, the constant in-memory store built into Erlang." Dude, shy never misses an opportunity. Every good story can be made better by mentioning Erlang, it's just a fact of life. That was really cool, though. I actually, genuinely liked this article. Again, please, if somehow this gets out to you, creators of said product... I didn't even realize that we were reading about an actual product, with pricing and all this, until now. Is this an ad? Did I just get an ad? Can you pay me money? You want to pay me money for this? Uh, no. Anyways, please, for the love of all things good and holy, put a version field in here. You will never be sad that you put it in, and you will always be sad that you didn't have it. Hashtag: did we just ad ourselves? Did I just give you guys some ads? I think I might have. NSA, if you could pay me... I don't know how much money I'd need from the NSA to sell out my soul for being able to track people, but I mean, I assume we're getting the bag, NSA. All right, hey, that was awesome. Now let's talk about being un... unplayed.

[21:41] The name is... the database... "aen" is the binary. Aen.


Related Tags
Cloud Cost · Database · Data Management · Custom Storage · Geospatial Data · Performance · Efficiency · Engineering · Binary Format · Innovation