Hitchhiker’s Guide to HashiCorp Vault in Kubernetes Part 1: System integration
Nowadays, we get a lot of benefit from the dynamic and scalable nature of cloud environments. That same dynamism makes keeping confidential information safe as it moves between systems an even more important part of our ever-changing platforms. Vault is a new approach to an old problem, an attempt to do it right. Yet there are still some barriers to adoption. This post is a practical guide to the concepts, implementation details and best practices to get it up and running in Kubernetes.
What is a secret?
A secret is a set of credentials that identify you; proving who you are with them is called authentication. From there, the system is able to determine what you can and cannot do. This is called authorization.
These sets of credentials can be pairs of username and password. With these, you can create or drop a table in a database, or query for data in third-party services. They can also be API tokens, which are a good replacement for a username and password pair, with the added bonus that they can be revoked and rotated independently from the credentials of the main account. Another, more discreet type is the TLS certificate, which allows services to communicate with one another and guarantees that requests are legitimate.
Secrets are everywhere, and it’s easy to misuse them while storing and distributing them.
Problems with storing and distributing secrets
It’s not uncommon to find secrets in plain text on physical or digital sticky notes. Even more often, secrets can be found embedded within code: it’s convenient to have your logic and data in the same place, and it makes configuring different environments much easier. One of the most common places to find secrets is in environment variables. As useful as that is for debugging, those values often end up in between the log lines. If there’s only one thing you should remember from this post, it is this: don’t put secrets in environment variables.
Because it’s hard to create, rotate and manage secrets for every user, everyone ends up sharing the same set of credentials. This makes it difficult to know who has done what, and when.
Fortunately, Vault is here to help.
What is Vault?
Vault is a central secret manager: your data is encrypted at rest and in transit, and it is distributed with fine-grained permissions per user. Not only can Vault protect its own data, it can also protect other services’ sensitive data. In other words, it provides encryption as a service.
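To make “encryption as a service” concrete, here is a minimal sketch using Vault’s transit secrets engine; the key name my-app-key is just an illustrative placeholder.

# Enable the transit secrets engine and create a named encryption key
# ("my-app-key" is a hypothetical name used for illustration).
vault secrets enable transit
vault write -f transit/keys/my-app-key

# Vault returns a ciphertext for the (base64-encoded) plaintext; the
# calling service never handles the encryption key itself.
vault write transit/encrypt/my-app-key plaintext="$(base64 <<< "my sensitive data")"

# Any client whose policy allows it can decrypt the ciphertext later.
vault write transit/decrypt/my-app-key ciphertext="vault:v1:<ciphertext>"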
Vault’s Components
Vault is divided into multiple components, each communicating with the others over encrypted RPC. This makes for a strong core, with audited implementations of encryption algorithms, while keeping the external parts flexible and swappable between providers and data sources. Understanding these parts is the key to understanding how to set up and use Vault to fit your system’s needs.
Auth Methods
The flow starts with the authentication component, which lets you say who you are in various ways: it can be a plain token, or a username and password combination, the most basic form of authentication. Vault can also delegate authentication to a trusted third-party source, for example GitHub, or LDAP and OIDC for enterprise users.
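As a rough sketch of what this looks like in practice, the commands below enable the userpass and GitHub auth methods; the user alice and the organization my-org are hypothetical.

# The most basic form: a username and password stored in Vault.
vault auth enable userpass
vault write auth/userpass/users/alice password="s3cr3t" policies="default"
vault login -method=userpass username=alice

# Delegating authentication to a trusted third party, e.g. GitHub:
vault auth enable github
vault write auth/github/config organization=my-org
vault login -method=github token="<github-personal-access-token>"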
Core
The core is where the encryption algorithms are implemented, and where permission and routing decisions take place. From there, the core is able to check the permissions attached to your authentication information: this determines what you can do.
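Those permissions are expressed as policies, which the core evaluates on every request. A hypothetical read-only policy for a single path might look like this (the name app-policy is illustrative):

# Grant read-only access to one secret path; the core checks every
# request against policies like this one.
vault policy write app-policy - <<EOF
path "secret/foo" {
  capabilities = ["read"]
}
EOF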
Secrets
At this point, you are able to reach your destination, what you came for: the secret. It can be as simple as key-value pairs, or it can be created dynamically by a database engine or a third-party service. This is also a good place to generate and store TLS certificates.
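A brief sketch of both flavours, assuming a KV version 1 engine mounted at secret/ (matching the paths used later in this post) and a database secrets engine; role and field names are illustrative.

# Static key/value secrets:
vault secrets enable -path=secret kv
vault write secret/foo username="db-user" password="db-pass"
vault read secret/foo

# Dynamic secrets: after configuring a connection and role (omitted
# here), the database engine generates credentials on demand.
vault secrets enable database
vault read database/creds/readonly-role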
Storage
To store tokens, policies and configurations, we need the storage layer. Thanks to the flexible plugin interface, we have multiple options here: Consul, DynamoDB, MongoDB… Choose the one that fits your system.
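As an illustration, a minimal server configuration with Consul as the storage backend might look like the sketch below; the addresses are placeholders, and TLS is disabled only to keep the example short.

# Illustrative server configuration using Consul as the storage backend.
cat > /etc/vault/config.hcl <<EOF
storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1   # for local experiments only; use TLS in production
}
EOF

vault server -config=/etc/vault/config.hcl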
Audit
Last but not least comes the audit log. It records every request and response going in and out of the system, creating a trail of what happens within. This is so critical to a highly secured system that Vault is designed to refuse any incoming request if the audit log fails.
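Enabling an audit device is a one-liner; the file path below is just an example.

# Every request and response is appended to this log; if Vault cannot
# write an audit entry, it refuses the request.
vault audit enable file file_path=/var/log/vault_audit.log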
We can think of this as a bank. In this analogy, the secret is money. Customers come in through the authentication gate. The receptionist checks that the right customer deposits or withdraws the right amount of money. At the gate, the security guard writes down who comes in and out. If the guard is missing, the bank closes and everyone goes home.
Root token and Shamir’s Secret Sharing
A common problem with encryption is that you have to secure the master key that locks the whole safe, raising the question, “who will keep the final key?”
Vault uses an algorithm called Shamir’s Secret Sharing to split the master key into multiple smaller pieces called key shards. The master key can later be reconstructed once a threshold of key shards is brought together, which offers flexibility when not all key shard holders are available.
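In practice, this is what initializing and unsealing Vault looks like; five shards with a threshold of three is a common illustrative choice.

# Split the master key into 5 shards, any 3 of which can reconstruct it.
vault operator init -key-shares=5 -key-threshold=3

# Unsealing requires the threshold to be met; each shard holder runs:
vault operator unseal <key-shard>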
If you’re interested in learning more about the algorithm, the Art of the Problem channel offers a great explanation.
Secret-retrieving mechanisms
Retrieving a secret is simple: the client exchanges a Vault token for a secret:
The client sends the path of the secret it wants, along with a Vault token, to Vault. Vault checks the permissions of that token and returns the requested secret when allowed. That’s it.
This can be done with this command:
curl \
  -H "X-Vault-Token: s.SAuQmfl3bPIDUBQhvUOx1Anx" \
  -X GET \
  http://127.0.0.1:8200/v1/secret/foo
So the only question left is: how do we get the token in the first place? We could create tokens for each user and application, but that would only push the security issue from one place to another. Even worse, if such a token leaked, all secrets bound to it would be exposed as well. A better solution is to generate short-lived tokens and automate the process. Kubernetes integration makes this part much smoother.
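For comparison, creating a token by hand would look like the sketch below; app-policy is the hypothetical policy from earlier, and the short TTL keeps the blast radius of a leak small.

# A short-lived token limited to a single policy.
vault token create -policy=app-policy -ttl=15m

# Tokens can also be revoked explicitly when no longer needed.
vault token revoke <token>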
Integration with systems
Every application that runs inside a Kubernetes cluster has a Service Account attached to it. If not specified, it’s the default account with no permissions. First, we create a Service Account for each application, so that we can separate applications with different sets of permissions.
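A sketch of that setup, assuming Vault’s Kubernetes auth method; the Service Account my-app, the role app-role and the policy app-policy are illustrative names (app-role matches the role used in the init container script further down).

# One Service Account per application.
kubectl create serviceaccount my-app

# Enable the Kubernetes auth method and bind a role to that Service
# Account, so its token can be exchanged for a short-lived Vault token.
vault auth enable kubernetes
vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc"
vault write auth/kubernetes/role/app-role \
  bound_service_account_names=my-app \
  bound_service_account_namespaces=default \
  policies=app-policy \
  ttl=15m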
That’s the theory, but how can we apply this to every application? At Picnic, we have applications written in Java, Python, Golang, or even just binaries. We developed a solution that works independently of languages and frameworks. It uses a Kubernetes feature called an init container: a container that runs before the main application starts.
This is our preferred way of retrieving and using secrets, since they stay in the memory of the application and are never exposed in a file on disk or in an environment variable. It answers the question: when should a secret not be secret? When it’s actually used. If the application is the only one to use it, then let the application be the only one to know its contents.
# Inside the init container, the Service Account token is located
# at /var/run/secrets/kubernetes.io/serviceaccount/token
KUBE_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)

# We exchange the Kubernetes token for a Vault token bound
# to the application role.
curl -X POST \
  --data '{"jwt": "'"$KUBE_TOKEN"'", "role": "app-role"}' \
  http://vault:8200/v1/auth/kubernetes/login \
  | jq -j '.auth.client_token' > /etc/vault/token

# With the Vault token we can retrieve secrets in the main app.
# In this demo, the main application is in Bash as well.
VAULT_TOKEN=$(cat /etc/vault/token)
curl -H "X-Vault-Token: $VAULT_TOKEN" \
  http://vault:8200/v1/secret/foo | jq '.'
But what about applications where we don’t have control over the source code, such as legacy or system applications where we usually only receive a binary? For these, we add another init container that uses the Kubernetes API to create Secret objects, as sketched below. These objects are later mounted as a file or (if we can’t avoid it) exposed as environment variables.
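A minimal sketch of that extra init container, assuming the Vault token has already been written to /etc/vault/token by the previous step; the secret field and object names are hypothetical.

# Fetch the secret from Vault as before...
VAULT_TOKEN=$(cat /etc/vault/token)
DB_PASSWORD=$(curl -s -H "X-Vault-Token: $VAULT_TOKEN" \
  http://vault:8200/v1/secret/foo | jq -r '.data.password')

# ...and store it as a Kubernetes Secret object, which the legacy
# application's container can mount as a file or read as an env var.
kubectl create secret generic app-secrets \
  --from-literal=db-password="$DB_PASSWORD"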
The integration of applications running in Kubernetes proved that using Vault is viable. But to make it a company standard, many interactions between multiple teams and stakeholders will be needed to add and update secrets. In a security chain, the weakest links are usually the parts that require human interaction. In the next part, we will see how Infrastructure-as-Code and templating tools can help us overcome this challenge.