Infisical

How to manage secrets in Databricks?

The MLOps market is projected to reach $39 billion by 2034, yet every data stack carries a constant risk: secrets. Organizations pour millions into data infrastructure, but a single compromised credential can cascade into a catastrophic breach and undo that work.

More specifically, this is a problem of scale, especially for data workflows, which often require credentials for 10 to 50 or more external services. With data protection spending expected to grow significantly and compliance standards like SOC 2, GDPR, and HIPAA carrying severe penalties, organizations need robust secrets management strategies that go beyond basic credential storage.

Databricks offers its own native solution, which we will explore in this piece.

The Fundamentals

What Are Secrets?

Secrets are sensitive authentication credentials that applications need to access protected resources. In data environments, these include:

  • Database credentials: Connection strings for PostgreSQL, MySQL, Snowflake, and enterprise data warehouses
  • Cloud storage access: AWS S3 keys, Azure storage accounts, GCP service accounts
  • API authentication: Keys for Salesforce, external data providers, and REST endpoints
  • Infrastructure credentials: Service principals, SSH keys, TLS certificates, container registry tokens

What Is Databricks?

Databricks is a unified analytics platform that serves as the central processing engine for modern data stacks. Built on Apache Spark with Unity Catalog providing governance, Databricks integrates with virtually every component of the modern data ecosystem, from ingestion tools like Fivetran to visualization platforms like Tableau.

This central role creates extensive credential requirements. Databricks must authenticate with data sources, cloud storage, APIs, and downstream analytics tools, while supporting collaborative workflows where multiple teams share workspaces and resources.

Databricks Native Secrets Management

Databricks provides built-in secrets management through "secret scopes", which are organizational containers for related credentials. Secrets in a Databricks-backed scope are stored in an encrypted database managed by Databricks.

How It Works

Secret scopes organize credentials by environment, team, or application. Access control operates through three permission levels:

  • MANAGE (full control)
  • WRITE (read/write access)
  • READ (read-only access)
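
As a rough mental model, the three levels can be treated as an ordered hierarchy in which each level includes the ones below it. The sketch below is illustrative only: the `Permission` enum and the `allows` helper are hypothetical names, not part of any Databricks SDK.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Secret scope permission levels, ordered from least to most powerful."""
    READ = 1    # read-only access
    WRITE = 2   # read/write access
    MANAGE = 3  # full control


def allows(granted: Permission, required: Permission) -> bool:
    """Return True if the granted level covers the required level."""
    return granted >= required


# A principal with WRITE can read, but cannot perform MANAGE-level actions.
print(allows(Permission.WRITE, Permission.READ))    # True
print(allows(Permission.WRITE, Permission.MANAGE))  # False
```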

The platform automatically redacts secret values in notebook outputs, replacing them with [REDACTED] placeholders to prevent accidental exposure.

Developers create scopes through the Databricks CLI:

databricks secrets create-scope <scope-name>

Afterwards, developers can use the Databricks CLI or the Databricks SDKs to insert secrets:

databricks secrets put-secret --json '{
  "scope": "<scope-name>",
  "key": "<key-name>",
  "string_value": "<secret>"
}'
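
Hand-writing this JSON gets fragile once a secret contains quotes or backslashes. As a minimal sketch, the payload can be generated with Python's `json` module instead. `build_put_secret_payload` is a hypothetical helper, the scope and key names are made up, and the field names are the ones used in the command above.

```python
import json


def build_put_secret_payload(scope: str, key: str, value: str) -> str:
    """Return a JSON string in the shape passed to `put-secret --json`."""
    if not scope or not key:
        raise ValueError("scope and key must be non-empty")
    # json.dumps escapes quotes, backslashes, and newlines in the secret value.
    return json.dumps({"scope": scope, "key": key, "string_value": value})


# Illustrative values only; never hardcode a real secret like this.
payload = build_put_secret_payload("prod-etl", "warehouse-password", 'p@ss"word\\1')

# The payload could then be handed to the CLI as a single argument, e.g.:
# subprocess.run(["databricks", "secrets", "put-secret", "--json", payload], check=True)
```

Passing the payload as one element of an argument list sidesteps shell quoting entirely.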

These secrets can later be read with the get-secret command of the Databricks CLI, for example from a bash script. The value comes back base64-encoded, so it has to be decoded:

databricks secrets get-secret <scope-name> <key-name> | jq -r .value | base64 --decode
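
The same decoding step can be done without `jq` and `base64`. The sketch below assumes the CLI prints a JSON object with a base64-encoded `value` field, which is what the pipeline above relies on. `decode_secret_response` is a hypothetical helper name, and the sample response is made up.

```python
import base64
import json


def decode_secret_response(response_text: str) -> str:
    """Extract and decode the secret from `get-secret` JSON output."""
    response = json.loads(response_text)
    # The value is base64-encoded bytes; decode to text assuming UTF-8.
    return base64.b64decode(response["value"]).decode("utf-8")


# Sample response with a made-up secret, for illustration only.
sample = json.dumps({
    "key": "warehouse-password",
    "value": base64.b64encode(b"s3cr3t").decode("ascii"),
})
print(decode_secret_response(sample))  # prints: s3cr3t
```

Inside a notebook, `dbutils.secrets.get(scope=..., key=...)` returns the value directly and is usually the simpler route.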

The Critical Limitations

While functional for basic use cases, Databricks' native secrets management has significant limitations that can create security gaps:

  • Workspace Isolation: Secrets cannot be shared across different Databricks workspaces, forcing credential duplication and increasing management overhead for multi-workspace deployments.
  • No Automatic Rotation: The platform lacks native secret rotation capabilities, requiring manual processes that are often delayed or forgotten, posing a major security risk.
  • Storage Limitations: Organizations are limited to 100 secret scopes per workspace and 1,000 secrets per scope, with individual secrets capped at 128 KB.
  • Limited Audit Capabilities: Audit logging is basic compared with enterprise-grade secrets management solutions, which makes compliance and security monitoring challenging.
  • Cloud Vendor Lock-in: The alternative to Databricks-backed scopes, Azure Key Vault-backed scopes, creates a dependency on Microsoft's ecosystem, which is problematic for multi-cloud strategies.
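
The storage limits listed above are easy to check before a migration or bulk import. The following is a minimal sketch, assuming the figures quoted in this article (they should be confirmed against current Databricks documentation). `check_limits` and the `{scope: {key: value}}` layout are hypothetical.

```python
MAX_SCOPES_PER_WORKSPACE = 100  # figure quoted in this article
MAX_SECRETS_PER_SCOPE = 1_000   # figure quoted in this article
MAX_SECRET_BYTES = 128 * 1024   # 128 KB, figure quoted in this article


def check_limits(scopes: dict) -> list:
    """Return human-readable violations for a planned {scope: {key: value}} layout."""
    problems = []
    if len(scopes) > MAX_SCOPES_PER_WORKSPACE:
        problems.append(f"{len(scopes)} scopes exceeds {MAX_SCOPES_PER_WORKSPACE}")
    for scope, secrets in scopes.items():
        if len(secrets) > MAX_SECRETS_PER_SCOPE:
            problems.append(
                f"{scope}: {len(secrets)} secrets exceeds {MAX_SECRETS_PER_SCOPE}"
            )
        for key, value in secrets.items():
            # Size is measured in bytes after UTF-8 encoding, not in characters.
            size = len(value.encode("utf-8"))
            if size > MAX_SECRET_BYTES:
                problems.append(f"{scope}/{key}: {size} bytes exceeds {MAX_SECRET_BYTES}")
    return problems


# An empty list means the planned layout fits within the quoted limits.
print(check_limits({"prod-etl": {"warehouse-password": "example"}}))  # prints: []
```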

Infisical: The Developer-First Solution

For organizations serious about data security, several alternatives address these limitations while providing enterprise-grade capabilities. Among them, Infisical is a modern option that prioritizes developer experience:

  • Cloud-Agnostic Flexibility: Supports self-hosted and cloud deployments across any infrastructure, eliminating vendor lock-in concerns that plague cloud-native solutions.
  • Developer-Centric Design: Built with developer workflows in mind, Infisical provides intuitive interfaces and seamless integration without requiring extensive security expertise, addressing one of HashiCorp Vault's biggest barriers to adoption.
  • Open Source Foundation: MIT licensing ensures transparency and eliminates vendor lock-in, while offering both self-hosted and managed service options to fit different organizational needs.
  • Enterprise Security: Comprehensive audit logging, automatic rotation, fine-grained access controls, and compliance-ready features that surpass Databricks' native capabilities.

To set up Infisical for Databricks, developers just need to use Infisical’s dedicated Databricks integration, which can be accessed from the product’s Integrations page.

Afterwards, developers need to configure the environment, path, and Databricks scope that secrets should sync with. Infisical will handle the rest (including secrets rotation) and start syncing secrets to the correct scopes in Databricks.

Implementation Best Practices

Immediate Security Improvements

  • Eliminate Hardcoded Secrets: Implement pre-commit hooks to detect and prevent credential commits. Use automated scanning tools to identify existing hardcoded credentials in repositories.
  • Implement Least Privilege Access: Grant only necessary permissions and use groups rather than individual assignments for scalability. Regular access reviews ensure permissions remain appropriate.
  • Establish Rotation Schedules: Implement 30-90 day rotation cycles for high-value credentials, with automated processes where possible to reduce manual overhead and human error.
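
A rotation schedule is only useful if overdue credentials are actually flagged. The sketch below is one minimal way to do that, assuming the team already tracks a last-updated timestamp per secret. `stale_secrets`, the 90-day default, and the sample inventory are illustrative choices, not a Databricks or Infisical feature.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_secrets(last_updated: dict, max_age_days: int = 90,
                  now: Optional[datetime] = None) -> list:
    """Return keys last updated more than max_age_days ago, sorted by name."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return sorted(key for key, updated in last_updated.items() if updated < cutoff)


# Made-up inventory; a fixed "now" keeps the example deterministic.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
inventory = {
    "warehouse-password": datetime(2025, 1, 15, tzinfo=timezone.utc),
    "s3-access-key": datetime(2025, 5, 20, tzinfo=timezone.utc),
}
print(stale_secrets(inventory, max_age_days=90, now=now))  # ['warehouse-password']
```

A check like this could run on a schedule and feed an alert, leaving the rotation itself to an automated tool where one is available.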

Organization Strategies

  • Scope Organization: Create separate secret scopes for environments (dev/staging/prod), teams, and applications. Use consistent naming conventions that clearly indicate purpose and access level.
  • Environment Separation: Maintain strict separation between development, staging, and production credentials to prevent accidental exposure of production systems.
  • Monitoring and Alerting: Implement comprehensive logging for all secret operations and establish alerting for unusual access patterns.
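
A naming convention holds up better when names are generated and validated by code rather than typed by hand. As a sketch, assuming a hypothetical `<environment>-<team>-<application>` convention (one possible scheme, not a Databricks requirement):

```python
import re

# Hypothetical convention: <environment>-<team>-<application>, lowercase.
ENVIRONMENTS = ("dev", "staging", "prod")
_PART = re.compile(r"^[a-z0-9]+$")


def scope_name(environment: str, team: str, application: str) -> str:
    """Build a scope name, rejecting parts that break the convention."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    for part in (team, application):
        if not _PART.match(part):
            raise ValueError(f"invalid name part: {part!r}")
    return f"{environment}-{team}-{application}"


def environment_of(scope: str) -> str:
    """Recover the environment prefix from a scope name built by scope_name."""
    return scope.split("-", 1)[0]


print(scope_name("prod", "data", "etl"))  # prod-data-etl
print(environment_of("prod-data-etl"))    # prod
```

Keeping the environment as a fixed prefix makes it straightforward to audit that production scopes are never referenced from development jobs.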

Compliance Considerations

Compliance frameworks and regulations like SOC 2, GDPR, and HIPAA call for specific controls, including encryption at rest and in transit, comprehensive audit trails, and documented access controls. Choose solutions that provide compliance-ready features rather than building custom compliance capabilities.

Making the Right Choice for Your Organization

Strategic Recommendations

As data and AI engineering mature in 2025, organizations should:

  1. Start with Security Assessment: Inventory existing credentials and identify security gaps in current processes
  2. Implement Quick Wins: Eliminate hardcoded secrets and establish basic access controls immediately
  3. Plan for Scale: Choose solutions that grow with your organization rather than requiring migration as needs evolve
  4. Prioritize Developer Experience: Select tools that enhance rather than hinder development workflows

With cybersecurity spending increasing 12.2% in 2025 and the MLOps market growing rapidly, investing in robust secrets management isn't optional; it's essential for protecting your data infrastructure and maintaining a competitive advantage.

Taking Action

The key to successful secrets management in Databricks lies in recognizing that native capabilities, while functional, often fall short of enterprise security requirements. Infisical, with its developer-first approach and cloud-agnostic flexibility, enables organizations to move beyond basic credential storage and implement true end-to-end secrets lifecycle management.

Start with an assessment of your current secrets management practices, eliminate immediate security risks, and adopt Infisical to strike the right balance between security requirements and operational efficiency. The cost of proactive secrets management pales in comparison to the potential impact of a credential-based breach in your data infrastructure.

As Databricks workloads grow in complexity, organizations that integrate Infisical today will be positioned for secure, scalable growth in the years ahead.

Mathew Pregasen

Technical Writer
