Introduction to HashiCorp Vault with Armon Dadgar
Summary
TLDR: The transcript discusses Vault, a solution for secret management, addressing the problem of securely handling credentials like usernames, passwords, API tokens, and TLS certificates. It highlights the issue of secret sprawl, where secrets are scattered across infrastructure, and emphasizes the need for centralized, encrypted storage with fine-grained access control and audit trails. Vault offers dynamic secrets, reducing the risk of credential leaks, and provides encrypt-as-a-service for better key management, ensuring correct cryptography implementation and offloading key management tasks. The architecture of Vault is highly pluggable, supporting various authentication backends, audit backends, storage backends, and secret backends, allowing for flexibility and scalability.
Takeaways
- Vault is designed to solve the secret management problem by centralizing the storage and control of various credentials like usernames, passwords, API tokens, and TLS certificates.
- The concept of 'secret sprawl' refers to the uncontrolled distribution of secrets throughout an infrastructure, often in plaintext, leading to security vulnerabilities.
- Vault encrypts all secrets both at rest and in transit, so that someone who can see where a secret is stored does not implicitly get to read it.
- Fine-grained access control allows Vault to limit which credentials are accessible to specific clients or users, preventing broad and potentially dangerous access to sensitive information.
- Audit trails in Vault provide visibility into the usage and handling of credentials, allowing for accountability and the ability to trace any misuse or leaks back to their source.
- Vault supports dynamic secrets, issuing short-lived, ephemeral credentials to applications, reducing the risk of long-term exposure if credentials are compromised.
- Each dynamic secret is unique to the client it's issued to, allowing for precise identification and isolation in the event of a security breach.
- Vault's encrypt-as-a-service feature offloads the responsibility of cryptography from applications, ensuring that encryption and decryption are handled correctly and securely.
- Vault's architecture is highly pluggable, with authentication backends, auditing backends, and storage backends that can be customized to fit various environments and use cases.
- For high availability, Vault instances can be run in a cluster with a shared backend, using leader election to ensure that requests are always processed by an active node.
Q & A
What is the primary problem that Vault aims to solve?
-Vault primarily aims to solve the secret management problem, which involves securely managing and controlling access to various credentials such as usernames, passwords, database credentials, API tokens, and TLS certificates.
What are some challenges associated with secret sprawl?
-Challenges with secret sprawl include the difficulty of knowing who has access to the secrets, the lack of an audit trail to track usage, and the complexity of rotating secrets when they are hardcoded in source code or scattered across multiple systems.
How does Vault address the issue of fine-grained access control?
-Vault addresses fine-grained access control by centralizing secrets and overlaying access control policies, allowing for precise control over who can access which credentials and providing a clear audit trail of actions taken.
What is the concept of dynamic secrets in Vault?
-Dynamic secrets in Vault refer to the practice of providing short-lived, ephemeral credentials to applications instead of long-lived credentials. This limits the potential damage if a secret is leaked, as the leaked credential is only valid for a limited time and can be easily revoked and replaced.
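The properties described in this answer (a bounded lifetime, one credential per client, and targeted revocation) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Vault's implementation or API: the class names `ToyDynamicSecrets` and `Lease`, the 60-second TTL, and the client names are all hypothetical.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    client: str
    username: str
    password: str
    expires_at: float
    revoked: bool = False


class ToyDynamicSecrets:
    """Hypothetical model: issues a unique, expiring credential per client."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self.leases = {}  # username -> Lease

    def issue(self, client: str, now: Optional[float] = None) -> Lease:
        now = time.time() if now is None else now
        lease = Lease(
            client=client,
            username=f"{client}-{secrets.token_hex(4)}",
            password=secrets.token_urlsafe(16),
            expires_at=now + self.ttl_seconds,
        )
        self.leases[lease.username] = lease
        return lease

    def is_valid(self, username: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        lease = self.leases.get(username)
        return lease is not None and not lease.revoked and now < lease.expires_at

    def owner_of(self, username: str) -> Optional[str]:
        """A leaked username maps back to exactly one client."""
        lease = self.leases.get(username)
        return lease.client if lease else None

    def revoke_client(self, client: str) -> int:
        """Revoke only the named client's credentials; return how many."""
        count = 0
        for lease in self.leases.values():
            if lease.client == client and not lease.revoked:
                lease.revoked = True
                count += 1
        return count


issuer = ToyDynamicSecrets(ttl_seconds=60)
lease_41 = issuer.issue("web-41", now=0)
lease_42 = issuer.issue("web-42", now=0)
leak_source = issuer.owner_of(lease_42.username)   # pinpoints the leaking client
revoked = issuer.revoke_client("web-42")            # web-41 keeps working
```

Because every client holds its own credential, revoking the one for `web-42` leaves `web-41` untouched, which is the smaller blast radius the answer refers to.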
How does Vault help with the management of encryption keys?
-Vault offers an 'encrypt as a service' capability, which allows it to manage encryption keys and perform cryptographic operations on behalf of applications. This ensures that cryptography is correctly implemented and that key management is offloaded from developers to Vault, simplifying the process and reducing the risk of errors.
What are the three major challenges that Vault is designed to help developers with?
-The three major challenges are: 1) Moving credentials out of plaintext and into a centrally managed system with tight access control and clear visibility; 2) Protecting against applications that aren't trusted to keep secrets by using dynamic secrets; 3) Helping applications protect their own data at rest through key management and high-level cryptographic offload.
How does Vault's architecture contribute to its flexibility?
-Vault's architecture is highly pluggable, with different extension points such as authentication backends, auditing backends, and secret backends. This allows Vault to integrate with a variety of identity providers, audit log systems, and storage systems, and to manage a wide range of secrets through the addition of new secret backends.
What are some examples of secret backends in Vault?
-Examples of secret backends in Vault include key-value stores for static credentials, database plugins for dynamic management of database credentials, RabbitMQ for message queue credentialing, AWS for managing cloud resource access, PKI for certificate management, and SSH for brokering access to SSH servers.
How does Vault ensure high availability in a deployment?
-Vault ensures high availability by running multiple instances of the service, using a shared backend storage system, and performing leader election to designate an active leader that processes client requests. If the current leader fails, a new leader is automatically promoted to take over operations.
What type of API does Vault typically expose for client interactions?
-Vault typically exposes a RESTful JSON API over HTTP, making it easy to integrate with applications and allowing clients to interact with it using standard HTTP methods.
Why is it important for Vault to encrypt secrets both at rest and in transit?
-Encrypting secrets both at rest and in transit ensures that even if someone gains access to the storage location or intercepts the communication, the secrets remain secure and unreadable without the decryption keys. This is a fundamental aspect of Vault's security model.
Outlines
Introduction to Secret Management and Vault
The paragraph introduces the concept of secret management, emphasizing the importance of managing credentials that grant access to systems. It explains that secrets can include usernames, passwords, database credentials, API tokens, and TLS certificates. The issue of secret sprawl is highlighted, where secrets are scattered across infrastructure, often in plaintext, leading to security risks. The introduction of Vault is presented as a solution to centralize and encrypt secrets, providing fine-grained access control and audit trails, thus improving the management and security of sensitive information.
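As a rough illustration of how centralizing secrets makes per-client access control and auditing possible, the sketch below models a store that checks a policy before every read and records every attempt. It is a toy under stated assumptions: `ToySecretStore`, the path names, and the glob-style policy patterns are hypothetical and are not Vault's policy language, and encryption at rest is left out entirely.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class ToySecretStore:
    """Hypothetical in-memory stand-in for a central secret store."""
    secrets: dict = field(default_factory=dict)    # path -> secret value
    policies: dict = field(default_factory=dict)   # client -> allowed path patterns
    audit_log: list = field(default_factory=list)  # (client, path, allowed)

    def read(self, client: str, path: str) -> str:
        patterns = self.policies.get(client, [])
        allowed = any(fnmatchcase(path, pattern) for pattern in patterns)
        # Every attempt is recorded, whether or not it is allowed.
        self.audit_log.append((client, path, allowed))
        if not allowed:
            raise PermissionError(f"{client} may not read {path}")
        return self.secrets[path]


store = ToySecretStore(
    secrets={"database/password": "s3cr3t", "api/token": "tok-123"},
    policies={"web-server": ["database/*"], "api-server": ["api/*"]},
)

db_password = store.read("web-server", "database/password")
try:
    store.read("web-server", "api/token")
    denied = False
except PermissionError:
    denied = True
```

The audit log answers the question the transcript raises about plaintext sprawl: not only who could have seen a secret, but who actually asked for it.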
Challenges with Secret Management and Dynamic Secrets
This paragraph discusses the challenges of managing secrets, particularly the tendency of applications to leak credentials. It introduces Vault's second-level capability of dynamic secrets, which are short-lived and ephemeral, reducing the impact of a leak by limiting the lifespan of credentials. The benefits of dynamic secrets include creating a moving target for attackers, unique credentials per client, and improved revocation capabilities. Additionally, the paragraph addresses the challenge of protecting application data at rest and introduces Vault's encrypt-as-a-service feature, which offloads cryptography implementation and key management from developers to Vault, ensuring secure and correct implementation.
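The encrypt-as-a-service idea (callers refer to a key by name, call a high-level operation, and never see the key material) can be sketched with the standard library's HMAC support. Everything here is an assumption made for illustration: `ToyKeyService`, its method names, and the `v<N>:` tag prefix are hypothetical and are not Vault's API.

```python
import hashlib
import hmac
import secrets


class ToyKeyService:
    """Hypothetical model: callers name a key but never see its bytes."""

    def __init__(self):
        self._keys = {}  # key name -> list of key versions (bytes)

    def create_key(self, name: str) -> None:
        self._keys[name] = [secrets.token_bytes(32)]

    def rotate(self, name: str) -> int:
        """Add a new key version; older versions remain usable for verify."""
        self._keys[name].append(secrets.token_bytes(32))
        return len(self._keys[name])

    def hmac_sign(self, name: str, data: bytes) -> str:
        version = len(self._keys[name])  # always sign with the newest version
        key = self._keys[name][version - 1]
        digest = hmac.new(key, data, hashlib.sha256).hexdigest()
        return f"v{version}:{digest}"

    def hmac_verify(self, name: str, data: bytes, tag: str) -> bool:
        prefix, _, digest = tag.partition(":")
        if not (prefix.startswith("v") and prefix[1:].isdecimal()):
            return False
        version = int(prefix[1:])
        if not 1 <= version <= len(self._keys[name]):
            return False
        key = self._keys[name][version - 1]
        expected = hmac.new(key, data, hashlib.sha256).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(expected.encode(), digest.encode())


service = ToyKeyService()
service.create_key("credit-card")
tag_v1 = service.hmac_sign("credit-card", b"4111-1111")
service.rotate("credit-card")
tag_v2 = service.hmac_sign("credit-card", b"4111-1111")
```

Recording the key version in the tag is one simple way to let rotation happen on the service side without the application tracking which key produced which output.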
Vault's High-Level Architecture and Pluggability
The paragraph delves into Vault's architecture, highlighting its pluggability and the ability to fit into various environments. It outlines the core components of Vault, including authentication backends for client identification, auditing backends for tracking activities, and storage backends for durable and available data storage. The paragraph also explains the secret backends that allow for the management of different types of secrets, either static or dynamic, and how Vault's architecture enables the secure and efficient handling of credentials and encryption keys through its RESTful JSON API.
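To make "JSON over HTTP" concrete, the sketch below builds (but does not send) a read request and parses a sample response using only the standard library. The details are assumptions not stated in the transcript: the address `http://127.0.0.1:8200`, the `/v1/` path prefix, the `X-Vault-Token` header, a key-value backend mounted at `secret/`, and the nested `data.data` response shape. The token and the sample body are placeholders.

```python
import json
import urllib.request


def build_read_request(addr: str, token: str, path: str) -> urllib.request.Request:
    """Build a GET request for a secret path; nothing is sent over the network."""
    url = f"{addr.rstrip('/')}/v1/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"X-Vault-Token": token},  # assumed token header
        method="GET",
    )


def extract_kv_fields(raw_body: bytes) -> dict:
    """Parse a JSON response body, assuming the fields sit under data.data."""
    return json.loads(raw_body)["data"]["data"]


request = build_read_request(
    "http://127.0.0.1:8200", "example-token", "secret/data/web/db"
)

# Placeholder response body, hand-written for this example.
sample_body = b'{"data": {"data": {"username": "app", "password": "example"}}}'
fields = extract_kv_fields(sample_body)
```

Sending the request would be a call to `urllib.request.urlopen(request)` against a running server; it is omitted here so the example runs without one.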
Understanding Vault's Deployment and Operation
The final paragraph provides an overview of how Vault operates in a deployment context, emphasizing its high availability through multiple instances and leader election. It explains how Vault functions as a shared network service accessible via a RESTful JSON API, allowing clients to interact with it seamlessly. The paragraph concludes by encouraging further exploration of Vault's resources for a deeper understanding of its capabilities and applications.
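The failover behaviour described here (one active leader, standbys that forward requests, automatic promotion when the leader dies) can be modelled in a few lines. This is a deliberately simplified sketch: `ToyCluster` and the node names are hypothetical, and picking the first healthy node stands in for the real coordination through the shared backend.

```python
class ToyCluster:
    """Hypothetical model of leader election with request forwarding."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self.leader = None
        self.elect()

    def elect(self):
        """Promote the first healthy node (stand-in for a real election)."""
        self.leader = next((n for n in self.nodes if n in self.healthy), None)
        return self.leader

    def fail(self, node):
        """Mark a node as dead; re-elect if it was the leader."""
        self.healthy.discard(node)
        if node == self.leader:
            self.elect()

    def handle(self, contacted, request):
        if contacted not in self.healthy:
            raise ConnectionError(f"{contacted} is down")
        if self.leader is None:
            raise RuntimeError("no healthy node available")
        # A standby does not answer itself; it forwards to the active leader.
        return f"{self.leader} handled {request}"


cluster = ToyCluster(["vault-1", "vault-2", "vault-3"])
before = cluster.handle("vault-2", "read")   # forwarded to vault-1
cluster.fail("vault-1")
after = cluster.handle("vault-3", "read")    # forwarded to the new leader
```

From the client's point of view nothing changes across the failover: it can keep talking to any healthy node and the request still reaches whichever node is currently active.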
Keywords
Secret Management
Vault
Secret Sprawl
Encryption
Access Control
Dynamic Secrets
Audit Trail
Pluggable Architecture
Authentication Backends
Storage Backends
Secret Backends
Highlights
The introduction of Vault as a solution to the secret management problem.
Definition of a secret as any credentials that grant authentication or authorization to a system.
The issue of secret sprawl, where secrets are scattered across infrastructure and source code.
The importance of understanding who has access to secrets and the lack of audit trails in many systems.
The centralization of secrets as a solution to secret sprawl, with Vault promising encryption both at rest and in transit.
Vault's ability to overlay fine-grained access control, enhancing security and visibility.
The concept of dynamic secrets, providing short-lived and unique credentials to applications to reduce the risk of credential leakage.
The advantage of dynamic secrets in creating a moving target for attackers and facilitating pinpointing the source of a compromise.
Vault's role in better revocation management, allowing for the isolation of compromised credentials.
The challenge of applications storing data and the need for effective encryption key management.
Vault's encrypt-as-a-service feature, offloading the responsibility of cryptography from developers to the platform.
The high-level API provided by Vault for cryptographic operations, ensuring correct implementation and key management.
Vault's high pluggability, with multiple extension points for authentication, auditing, and storage backends.
The use of authentication backends to allow clients to authenticate from various systems.
Vault's architecture supporting multiple audit logs for comprehensive tracking of activities.
Storage backends responsible for durable and highly available storage of secrets.
Secret backends enabling dynamic secret capabilities and extending Vault's functionality for various use cases.
Vault's operation as a shared network service with a RESTful JSON API for easy integration with applications.
Transcripts
i'm armon and today i want to spend
some time talking about vault so when we
talk about vault the problem we're
really talking about solving is the
secret management problem and so when we
start talking about secret management
the first question that naturally comes
up is what is a secret so when we talk
about secret management what we're
really talking about is managing a set
of different credentials right and so
what we mean when we talk about these
credentials is anything that might grant
you authentication to a system or
authorization to a system right so some
examples of this might be usernames and
passwords it might be things like
database credentials it might be things
like API tokens or it might be things
like TLS certificates the point is any
of these things we can use these to
either login to a system and
authenticate such as a username and
password or we're using it to prove our
identity such as a TLS certificate and
so we're using it to authorize access
potentially so all of these things fall
in the realm of secrets and these are
things we want to carefully manage we
want to understand who has access to
them we want to understand you know
who's been using these things and in the
case of most of these we want some story
around how we can periodically rotate
these and so when we look at the kind of
state of the world of how these things
get managed in practice what we see is
secret sprawl right and what we mean by
secret sprawl is that these end up
everywhere they're in plain text inside
of our source code so maybe it's
hard-coded in a header what the user
name and password is it ends up
inside of things like configuration
management so again this is living in
plain text in chef or puppet or ansible
and so anyone can log in and see what
these credentials are and ultimately all
of this typically ends up living in a
version control system like github or
gitlab or bitbucket and so these things
end up sort of strewn about or sprawled
all over our infrastructure and so what
are the challenges with this world well
we don't really know who has access to
all of these things so we don't know
does anyone in our organization with
access to github can they log in and see
the source code and thus see what our
database credentials are right and even
if they could do it we don't know if
they have done it we have no audit trail
that says just because I Armon could
have seen that secret did he go on and
so we really have no fine-grained
ability to manage who has access or to
even audit who's done what with it worse
still how do we actually rotate any of
these things so if we realize we do need
to change our database credential
there's been a compromise or we're doing
a periodic rotation it's very very
difficult if we're in a place where it's
hard-coded in our source code or it's
strewn about in so many different systems
that it's really difficult to really
know how to effectively do this rotation
and so this state of the world is what
we refer to as secret sprawl and so one
of our first goals when we started
working on vault was to really look at
this problem and say how can we improve
it and so this is really where vault
came from so vault really starts by
looking at the secret sprawl problem and
saying we can only solve it by
centralizing right so instead of having
things live everywhere we move all these
secrets to a central location and vault
promises that we're going to encrypt
everything both at rest inside a vault
as well as in transit between vault and
any of the clients that want to access
it right and so this gives us a few
properties one unlike these systems
where we're storing this stuff in
plaintext at least now if you could see
where the secret is stored at rest it's
encrypted
so you don't get implicit access to just
be able to see this secret the next
thing is vault lets us overlay
fine-grained access control on top of
all this so instead of it being anyone
in our organization who has access to
github and can see the source code now
we can go much more fine-grained and say
you know the web server needs access to
the database credential the API server
needs the API tokens but everyone
shouldn't have access to everything and
then on top of this we have an audit
trail so now we can actually see what
credentials did a web server access what
credentials did Armon access from the
system and so we have much more
visibility and control over how these
things are all being managed this is
sort of the level one challenge with
vault is at least moving from a world of
sprawl where things are everywhere to
world of centrality where we have sort
of strong guarantees that it's encrypted
strong guarantees around who has access
and strong visibility into this so this
becomes our first level thing the next
level challenge becomes realizing who
we're giving these credentials to right
so great we've stored all these
credentials safely
in vault and now we're gonna thread these
out and provide it to an application the
challenge is applications do a terrible
job keeping secrets inevitably the
application will log its credentials out
to a logging system so it might write it
out to standard out this gets shipped
off to Splunk and is now in a central
log that anyone can see it shows up in
any sort of a diagnostic output so maybe
our application has an exception and it
shows the username and password in the
traceback or inside of an error report
it might be shipping it out to external
monitoring systems when there is an
error and so in general what we find is
applications do a poor job keeping
things secret so even if we do a great
job centralizing it and strongly
controlling it and encrypting it on the
way to the application the app isn't
trusted so one of the second-level
capabilities that vault introduces is what
we call dynamic secrets and the idea
behind a dynamic secret is instead of
providing a long-lived credential to the
application which it inevitably leaks we
provide short-lived ephemeral
credentials so these things are
dynamically created but they're
ephemeral so we might only give a
credential to an application that's
valid for say you know 30 days and the
value of this is a few fold now even if
the application leaks this credential
out it's only valid for a bounded period
of time so it might write it to a
logging system and that becomes visible
but we create a moving target for an
attacker by constantly revoking and
issuing new certificates the other thing
that's valuable is now each credential
is unique to each client so previously
if I had 50 web servers all of them
would come in and read a static database
credential and so this means if there's
a compromise and that database
credential gets out it's very hard to
pinpoint where the point of compromise
was there's 50 servers they're all
sharing the exact same credential versus
in a dynamic secret world each of those
50 web servers had a unique credential
so you know very specifically web
machine 42 was the point of compromise
right the last thing that this lets us
do is have a much better revocation
story so now if we know web machine 42
was our point of compromise we can
revoke the password username and
password for just web machine 42 and
isolate that leak but if all 50 machines
were sharing the same username password
the moment we try and revoke it would
cause the entire service to have an outage
right so the blast radius of a
revocation is much larger when you have
a shared secret versus the dynamic
secret the third challenge we found was
that applications are often storing data
ultimately and so the challenge becomes
right how do the applications protect
their own data at rest because we're not
going to be able to store all sort of you
know all information within vault vault
is meant just to manage secrets not
anything that might be confidential so
what we often see is that when vault
is being used as a centralized sort of
secret management store people are
storing encryption keys so we might put
an encryption key inside a vault and
then distribute that key back out to the
application the application is doing
cryptography to protect data at rest
what we find though is applications
generally don't implement cryptography
correctly there's lots of subtle nuances
and it's easy to get wrong and with
these kind of you know mistakes often
times it compromises the whole
cryptography when those mistakes are
made and so one of the challenges we
often look at is how do we get away from
vault just storing an encryption key and
handing it to the application and
assuming the app will do cryptography
right so this has evolved into a
capability that vault calls encrypt as a
service and the idea here is instead of
expecting that we're just going to
deliver a key to a developer and the
developer implements cryptography
correctly vault will do a few things one
is it will let you create a set of named
keys so I might create a key that I call
you know credit card information and a
separate one I call a social security
number and one for PII and these are
just names I'm gonna just name this key
and I'm not going to actually give this
value out but then what we expose is a
set of high-level APIs to do
cryptography so these APIs would be kind of
the classic operations you expect right
things like encrypt or decrypt or sign
or verify so now as a developer what I'm
really doing is calling vault with an API
and saying you know I want to do an
HMAC using my credit card key and some
piece of data right and what vault is
shielding is the implementation is being
provided by vault so we don't have to
trust that the developers implemented
these high-level
operations correctly and the key
management is also being provided by
vault the developer never actually sees
the underlying key this lets us do a few
things one it ensures that the
cryptography is correctly implemented
because we're using a vetted
implementation by vault this
implementation is vetted both by us by
the open source community and by
external auditors that we use it also
lets us offload key management so if we
think cryptography is hard key
management's even harder and so in
practice when you ask how many
applications properly implement key
versioning key rotation key
decommissioning and the full lifecycle
of key management the answer is very few
because it's challenging but by
offloading this to vault we can actually
use high-level APIs to do all of this
so we get the full key lifecycle as well
provided by vault and so in practice
these end up being the three major
challenges that we're trying to help
developers with right how do we move
these credentials out of plain text and
sprawled across many different systems
into a scenario where they're centrally
managed with tight access control and
clear visibility then how do we go
even further and protect against
applications that aren't necessarily to
be trusted in keeping secrets and we do
this by being ephemeral so we create
this moving target where what we're
really managing is that the web server
should have access to the database and
that credential that enables it is
dynamic is a dynamic one instead of
static and then lastly how do we go
further in helping the application
protect its own data at rest and that's
done through a series of key management
and high level cryptographic offload so
these three are kind of the core
principles of vault so now maybe we'll
zoom in quickly and talk a bit about
high level architecture of how does this
actually get implemented so when we talk
about vault architecture there's a few
important things to realize one is that
vault is highly pluggable it has many
different plug-in mechanisms so when we
talk about vault it has the central core
which has many responsibilities
including sort of the lifecycle manager
and ensuring requests are processed
correctly and then there's many
different extension points that allow us
to fit it into our environment so the
first one that's extremely important is
the authentication backends these are
what allow vault to allow clients to
authenticate from different systems so
for example if we're booting an ec2 VM
this ec2 VM might authenticate
using our AWS authentication plugin this
plugin allows us to tie back into
Amazon's notion of identity to prove
that the caller is for example a web
server but if we have a human user
they might be coming in and using
something like LDAP or Active Directory
to prove their identity if we're using a
high level platform maybe something like
kubernetes we might be using our
kubernetes authentication provider and
the goal of these authentication
providers is to take some system we
trust whether it's kubernetes LDAP or
AWS and use this to provide application
or human identity at the end of the day
that's what we're getting out of this is
a notion of the identity of the caller
this is great and then we use that to
connect to an auditing back-end which
allows us to connect and stream out
request response auditing to an external
system that gives us a trail of who's
done what so this might be you know
Splunk as an example where we're going
to send all of our different audit logs
vault will allow us to have multiple
different audit logs so we can also send
to Splunk as well as a system like syslog
as an example the next level challenge
is where does vault actually store its
own data at rest right so if we're gonna
read and write secrets to vault it needs
to be able to store these things
somewhere and so these are what we call
storage backends so storage back ends
are responsible for storing data at rest
so this can be really a couple of
different things it could be a standard
RDBMS so you know MySQL Postgres
it could be a system like consul it
could be a cloud managed database like
google spanner but the goal of these
back-end systems is to provide durable
storage in a way that's highly available
so we can tolerate the loss of one of
these back-end systems the last bit is
how does consul actually I'm sorry
vault provide access to different
secrets these are the secret backends
themselves and so these come in a few
different forms so the biggest use of
these is to enable the dynamic secret
capability we talked about before so one
form of secret back-end is a simple one
it's just key value so I might just
store a static username and password in
there and I'm giving it a username and a
password and these things are static and
this is just a key value store that's
encrypted at rest
however as we get more sophisticated we
might want to use the dynamic secret
capability we talked about and so that
is where these different plugins are
coming in so we have different database
plugins so a database plugin will
allow us to dynamically manage MySQL
and Postgres and Oracle and etc
credentials we have things like RabbitMQ
so maybe we're doing dynamic
credentialing for our message queues but
this kind of goes on you can even apply
the same principle to something like AWS
we might have applications that need to
read and write from s3 but we don't want
to give them long-lived access to IAM so
instead we define a role in our AWS backend
and we'll go and dynamically generate
short-lived credentials as needed so
this extends that sort of dynamic secret
paradigm so this is an extension point
that allows vault to apply this same
principle to many different things one
common use of this is PKI so in practice
certificate management tends to be a
nightmare and what we often see is very
long-lived certificates maybe five to ten
year lived certificates because we don't
want to go through the process of
generating them versus with vault we
can define them and programmatically
generate it so in practice people use
very short-lived certificates maybe as
short as 72 or 24 hours and this way you're
constantly moving and creating a moving
target this list sort of goes on and
includes things like SSH as an example
so we can broker access to SSH as well
so you don't have a single PEM to rule
them all across a large estate of
machines so at its core this is what
makes vault so flexible right it allows
vault to manage clients that are
authenticating against a different set
of identity providers we can audit
against a variety of different trusted
sources of log management we can store
data in almost any durable system and
then we can extend the surface area of
what types of Secrets can be either
statically or dynamically managed by
adding new secret backends so this
becomes a vault in a single instance
nutshell so as we talk about running a
vault instance each instance of it is
one of these and then in a broader
deployment what this will look like is
we run multiple instances to provide
high availability so at the highest
level we'd have a shared back-end for
example this might be
consul which internally is you know
three different servers as an example
providing us HA and then we will run
multiple vaults in front and what vault
does is it'll coordinate with the shared
back-end to perform leader election so
one of these might be elected our
current leader and so as a client when
we're making requests we're talking to
the leader and even if we talk to sort
of a non-leader we'll be transparently
forwarded to the active leader and so in
this way if any particular node dies
power loss process crashes you know
maybe network connectivity is an issue
we will detect this and promote a
new one to leader automatically and this
instance takes over active operation and
our other secondaries will begin to
promote and so this is what vault looks
like at a high level it operates as sort
of a shared network service and we're
talking to it just as an API client over
the network
so what vault typically exposes is a
restful JSON API so it's JSON over HTTP
making it relatively easy to actually
integrate with our applications I hope
this was useful as a high-level
introduction to vault and please check
out our other resources to learn more
thank you