What is Secret Sprawl and How to Solve It?

If you work in security or DevOps, or you're building any complex infrastructure, chances are you've encountered an industry-wide problem called secret sprawl.

In this article, we explore what secret sprawl is and how you can solve it.

What is a secret?

To begin, we need to define the word "secret." A secret, in this context, is any sensitive data like API keys, database credentials, TLS certificates, encryption keys, and environment variables that need to be stored and handled securely for your application/infrastructure.

In most cases, secrets allow applications to access systems. For example, credentials may allow an application to read data from and write data to a database, an S3 bucket, and so on. Given that secrets are frequently used for authentication and/or authorization, they should be secured properly, away from bad actors who may otherwise use the secrets to gain access to critical systems and data.

With this in mind, let's discuss "secret sprawl".

What is secret sprawl?

The term secret sprawl refers to a team's inability to effectively oversee secrets sprawled across their infrastructure. As teams deploy increasingly complex services across multiple environments (e.g. development, staging, production), with each service depending on one or more external services, they accumulate hundreds, sometimes thousands, of secrets, such as infrastructure secrets and Kubernetes secrets.

Having to manage many secrets creates significant room for security risk, misconfiguration, and ultimately error. Plenty of issues arise as a result of secret sprawl: difficulty guaranteeing that secrets are properly handled and encrypted, trouble pinpointing where a specific secret is and who has access to it, and forgetting to create, update, or delete secrets across different stages of the software development lifecycle.

The following associated scenarios may sound familiar:

You introduce a new environment variable in development and forget to inform other developers on the team about the update. They start developing only to find out that the application crashes due to the missing variable and subsequently lose precious time debugging the missing value.
You introduce a new environment variable in development and staging but forget to add it to production. After re-deploying the production service, you find that the service breaks for everyone.
A secret accidentally gets leaked somewhere in your development cycle or even across multiple locations, and you realize that you don't have the capabilities to trace back the story of what transpired to act accordingly. This includes where, when, and how the secret got leaked as well as who leaked it.
A secret accidentally gets leaked and you scramble to revoke it in time because it's not immediately clear where it lies within your complex infrastructure, hence the sprawl. Beyond this, you find yourself losing precious time rotating the secret.
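
The first two scenarios come down to environments drifting apart in which secrets they define. As a minimal sketch of how that drift could be detected, the following Python compares the secret names defined in each environment. The function name, environment names, and secret names are hypothetical and chosen only for illustration; only names are compared, never values.

```python
from typing import Dict, Set


def find_missing_keys(environments: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per environment, the secret names defined elsewhere but missing there."""
    # The union of all names is what every environment is expected to define.
    all_keys: Set[str] = set().union(*environments.values())
    return {
        env: all_keys - keys
        for env, keys in environments.items()
        if all_keys - keys  # only report environments that are missing something
    }


# Hypothetical secret names per environment (names only, never values).
environments = {
    "development": {"DATABASE_URL", "STRIPE_API_KEY", "NEW_FEATURE_TOKEN"},
    "staging": {"DATABASE_URL", "STRIPE_API_KEY", "NEW_FEATURE_TOKEN"},
    "production": {"DATABASE_URL", "STRIPE_API_KEY"},
}

print(find_missing_keys(environments))  # {'production': {'NEW_FEATURE_TOKEN'}}
```

A check like this could run before a deployment so that a missing production variable is caught before the service breaks.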

Overall, secret sprawl makes it difficult to manage secrets effectively, which leads to unnecessary risks and inconveniences.

The solution to secret sprawl

To solve secret sprawl, you want to centralize your secrets management, that is, to have one unified view and management portal for secrets across your infrastructure. More specifically, you want to use a dedicated tool called a secret manager to store secrets and streamline their delivery to applications and infrastructure. With it, you can store sensitive data and query it from your applications.

You can think of a secret manager as a special type of secure database where data is tightly controlled, encrypted, and audited. While the core functionality is the storage and distribution of secrets, secret managers also have secret-specific features such as lifecycle management, including versioning, rotation, revocation, and even dynamic secrets. On a side note, some secret managers allow you to attach different "storage backends." For instance, you may configure them to hook up to PostgreSQL, MySQL, S3, and more.
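
To make the "special database" idea more concrete, here is a toy, in-memory sketch of versioning and auditing in Python. It is not how any particular secret manager is implemented: the class and method names are hypothetical, and it deliberately omits encryption, access control, and persistence, which a real secret manager provides.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Tuple


@dataclass
class ToySecretStore:
    """Illustration only: no encryption, no access control, no persistence."""

    _versions: Dict[str, List[str]] = field(default_factory=dict)
    # Each audit entry is (timestamp, actor, action, secret name).
    audit_log: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def _record(self, actor: str, action: str, name: str) -> None:
        timestamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((timestamp, actor, action, name))

    def put(self, actor: str, name: str, value: str) -> int:
        """Store a new version; rotating a secret is just another put."""
        self._versions.setdefault(name, []).append(value)
        self._record(actor, "put", name)
        return len(self._versions[name])  # 1-based version number

    def get(self, actor: str, name: str, version: int = 0) -> str:
        """Return a specific version, or the latest when version is 0."""
        values = self._versions[name]  # raises KeyError for unknown secrets
        if version < 0 or version > len(values):
            raise ValueError(f"no version {version} of {name!r}")
        self._record(actor, "get", name)
        return values[version - 1] if version else values[-1]


store = ToySecretStore()
store.put("alice", "DB_PASSWORD", "old-password")
store.put("alice", "DB_PASSWORD", "new-password")  # rotation creates version 2
print(store.get("api-service", "DB_PASSWORD"))     # new-password
print(store.get("api-service", "DB_PASSWORD", 1))  # old-password
print([(actor, action) for _, actor, action, _ in store.audit_log])
```

Because every read and write passes through one place, the audit log can answer who touched which secret and when, which is hard to reconstruct when secrets are sprawled across many locations.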

Using a secret manager helps resolve the issues described in the previous section and is an important building block of any security program. Most obviously, it ensures that services across your entire development cycle receive the right secrets at all times. This includes local development, where developers frequently use .env files as an alternative. It also provides a detailed record of all secret management activities, so you know the full story of any incident immediately, and gives you a simple mechanism to rotate a secret instantly in the event that it is required.

How can I fit/implement it into my stack?

As a tool designed to serve the entire development cycle, a secret manager should sit adjacent to your stack and deliver secrets across your infrastructure from local development to CI/CD pipelines to production. The general process to start using a secret manager typically involves:

Setting up the secret manager: If using a cloud service like AWS Secrets Manager, this may involve choosing a region; otherwise, it may mean following steps to self-host a secret management solution on-prem. In this step, you typically configure access controls that determine which entities can access the secret manager, and you add secrets to it.
Fetching secrets back to your application/infrastructure: Having set up your secret manager, you can now configure your applications (often referred to as clients) to query for secrets from the secret manager. Each secret manager has its own preferred methods for fetching secrets such as via API call, SDK, CLI, or even K8s operator with tight API access.
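
The fetching step can be sketched from the client's point of view as follows. This is a minimal illustration under stated assumptions, not the API of any specific secret manager: `SecretClient` and `fetch_from_environment` are hypothetical names, and the backend here simply reads process environment variables as a stand-in for an API call, SDK, or CLI.

```python
import os
import time
from typing import Callable, Dict, Tuple


class SecretClient:
    """Fetches secrets through a pluggable backend and caches them briefly."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300.0) -> None:
        self._fetch = fetch          # hypothetical hook: wraps your secret manager
        self._ttl = ttl_seconds      # how long a cached value is considered fresh
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, name: str) -> str:
        now = time.monotonic()
        cached = self._cache.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # still fresh, skip the backend call
        value = self._fetch(name)
        self._cache[name] = (now, value)
        return value


def fetch_from_environment(name: str) -> str:
    """Stand-in backend: reads from process environment variables."""
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f"secret {name!r} is not available")
    return value


# Placeholder value so the demo runs on its own; never hard-code real secrets.
os.environ["DATABASE_URL"] = "postgres://example.invalid/app"
client = SecretClient(fetch_from_environment)
print(client.get("DATABASE_URL"))
```

Keeping the backend behind a single callable means application code asks for secrets by name only. A short cache lifetime is one possible trade-off between reducing calls to the secret manager and picking up rotated values quickly.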

What secret manager should I use?

The decision behind what secrets management solution to use depends largely on your team's or organization's requirements. For instance, you may consider whether or not you want to use cloud services like Azure Key Vault or go for an on-prem open source solution like HashiCorp Vault. Going deeper, when considering a cloud solution, you may think about whether or not encryption-at-rest is enough or if you require a solution that offers end-to-end encryption.

Beyond this, you may consider the feature set of each secret manager and whether or not it is sufficient for your organization. For example, some solutions offer path-based secret storage, a feature useful for orchestrating the most demanding architectures; others offer a bird's-eye view across multiple environments so you can conveniently compare them and pinpoint any discrepancies at a glance; others may also include features like secret scanning and leak prevention in the bundle. Finally, you may consider the cost and difficulty (i.e. learning curve) of setting up a secret manager.

Having mentioned some basic considerations, you should consider Infisical, the open source secret management platform with great developer experience. It offers a managed service called Infisical Cloud but also has a self-hosted version that you can stand up on your own infrastructure with support for end-to-end encryption.

Conclusion

Secret sprawl is an industry-wide issue involving challenges associated with the inability to effectively oversee secrets sprawled across applications and infrastructure. Such challenges range from difficulty knowing where secrets are located, to being unable to revoke and rotate them in time, to not knowing the full story of every secret. The result is poor secrets hygiene.

That said, by adopting a dedicated tool called a secret manager like Infisical, you can manage your secrets from one unified place and ultimately prevent secret sprawl.