The Databricks CLI for Platform Teams: Automating Workspace Governance

Every Databricks workspace starts with the same problem: no guardrails. Developers can create 20-node clusters, leave them running indefinitely, install whatever libraries they want, and access any data they have network access to. The first cloud bill after a team gets comfortable is usually the motivation for adding structure. The CLI is how you add that structure programmatically rather than clicking through the UI every time something needs to change.

What the CLI Is Good For in a Platform Context

If you're a data engineer or platform team member responsible for a shared Databricks workspace, the CLI is the tool for:

  • Deploying cluster policies across environments (dev, staging, prod)
  • Creating and maintaining secret scopes as part of environment setup
  • Scripting notebook deployments for CI/CD pipelines
  • Automating user provisioning and group management
  • Auditing what's running in the workspace without going through the UI

Deploying Cluster Policies Across Environments

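# Create a cluster policy from a JSON file
# (--json @<path> is the current CLI syntax; the legacy CLI used --json-file <path>)
databricks cluster-policies create --json @./policies/data-engineering-policy.json

# List existing policies
databricks cluster-policies list

# Update an existing policy (the request JSON must carry the policy_id of the policy being changed)
databricks cluster-policies edit --json @./policies/data-engineering-policy.json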

The pattern: keep your cluster policies in version control as JSON files. Your CI pipeline deploys them on merge. Dev, staging, and prod environments get the same policies, with environment-specific overrides for things like max cluster size.
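
One way to get "same policies, environment-specific overrides" is to keep a single base definition and merge a small override map per environment before deploying. The sketch below is illustrative only: the policy name, the override values, and the `build_policy` helper are assumptions, not part of any Databricks tooling. It relies on the policy `definition` being sent as a JSON-encoded string in the request body.

```python
import json

# Hypothetical base policy definition shared by every environment.
BASE_DEFINITION = {
    "autotermination_minutes": {"type": "fixed", "value": 30},
    "autoscale.max_workers": {"type": "range", "maxValue": 4},
}

# Hypothetical per-environment overrides; only the keys listed here change.
OVERRIDES = {
    "dev": {},
    "staging": {"autoscale.max_workers": {"type": "range", "maxValue": 8}},
    "prod": {"autoscale.max_workers": {"type": "range", "maxValue": 20}},
}


def build_policy(name: str, environment: str) -> dict:
    """Return a create-policy payload for one environment."""
    if environment not in OVERRIDES:
        raise ValueError(f"unknown environment: {environment}")
    # Later keys win, so an override replaces the base rule for that attribute.
    definition = {**BASE_DEFINITION, **OVERRIDES[environment]}
    # The definition travels as a JSON-encoded string inside the payload.
    return {
        "name": f"{name}-{environment}",
        "definition": json.dumps(definition, sort_keys=True),
    }


if __name__ == "__main__":
    for env in OVERRIDES:
        print(json.dumps(build_policy("data-engineering", env), indent=2))
```

A CI job could write each payload to a file and hand it to the CLI, which keeps the diff between environments down to the few lines in the override map.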

Workspace Audit: What's Running and What Isn't

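# List all clusters and their state
# (the default output is a table, so request JSON explicitly)
databricks clusters list --output json | python3 -c "
import json, sys
data = json.load(sys.stdin)
# Newer CLI versions print a bare JSON array; the legacy CLI wraps it in {'clusters': [...]}
clusters = data.get('clusters', []) if isinstance(data, dict) else data
running = [c for c in clusters if c.get('state') == 'RUNNING']
print(f'Running clusters: {len(running)}')
for c in running:
    node_type = c.get('node_type_id', 'unknown')
    # Autoscaling clusters report an autoscale range instead of num_workers
    num_workers = c.get('num_workers', 0)
    print(f'  {c[\"cluster_name\"]}: {num_workers} {node_type} workers')
"

# Find clusters without auto-termination set
databricks clusters list --output json | python3 -c "
import json, sys
data = json.load(sys.stdin)
clusters = data.get('clusters', []) if isinstance(data, dict) else data
# A missing or zero autotermination_minutes means auto-termination is disabled
no_autoterminate = [c for c in clusters
                    if not c.get('autotermination_minutes')]
print(f'Clusters without auto-termination: {len(no_autoterminate)}')
for c in no_autoterminate:
    print(f'  {c[\"cluster_name\"]} ({c[\"state\"]})')
"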

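If the audit is going to run on a schedule, it may be easier to maintain as a small script than as an inline one-liner. The following is a minimal sketch, not a definitive implementation: the function names and sample records are made up, and it assumes each cluster record carries the `creator_user_name`, `state`, and `autotermination_minutes` fields. It groups running clusters that have no auto-termination by the user who created them.

```python
import json
from collections import defaultdict


def load_clusters(raw: str) -> list:
    """Accept either a bare JSON array or an object with a 'clusters' key."""
    data = json.loads(raw)
    if isinstance(data, dict):
        return data.get("clusters", [])
    return data


def audit(clusters: list) -> dict:
    """Group running clusters with no auto-termination by creator."""
    offenders = defaultdict(list)
    for c in clusters:
        if c.get("state") != "RUNNING":
            continue
        if c.get("autotermination_minutes"):
            continue  # a non-zero value means auto-termination is enabled
        owner = c.get("creator_user_name", "unknown")
        offenders[owner].append(c.get("cluster_name", "unnamed"))
    return dict(offenders)


if __name__ == "__main__":
    # Made-up sample standing in for the CLI's JSON output.
    sample = json.dumps([
        {"cluster_name": "etl-nightly", "state": "RUNNING",
         "autotermination_minutes": 0, "creator_user_name": "a@example.com"},
        {"cluster_name": "adhoc", "state": "RUNNING",
         "autotermination_minutes": 60, "creator_user_name": "b@example.com"},
    ])
    for owner, names in sorted(audit(load_clusters(sample)).items()):
        print(f"{owner}: {', '.join(names)}")
```

In practice the sample would be replaced by the CLI's JSON output read from standard input, and the script could exit non-zero when the report is non-empty so that a scheduled job fails visibly.
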
Bulk Secret Scope Setup

When setting up a new environment, automate the secret scope creation rather than clicking through the UI:

#!/bin/bash
# setup-environment.sh
# Usage: ./setup-environment.sh <dev|staging|prod>
set -euo pipefail

# Fail early with a usage message if no environment is given
ENVIRONMENT="${1:?usage: $0 <dev|staging|prod>}"
SCOPE="myproject-${ENVIRONMENT}"

# Create secret scope for this environment
# (positional arguments are the current CLI syntax; the legacy CLI used
#  --scope, --principal, and --permission flags instead)
databricks secrets create-scope "${SCOPE}"

# Grant read access to the appropriate group
databricks secrets put-acl "${SCOPE}" "data-engineering-${ENVIRONMENT}" READ

echo "Secret scope ${SCOPE} created"
echo "Add secrets with: databricks secrets put-secret ${SCOPE} <key-name>"

The manual step (adding actual secret values) stays manual: you shouldn't store secret values in scripts. But the structural setup (scope creation, ACLs, policies) is scriptable and should be in your infrastructure-as-code repository.
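
To keep that structural setup reviewable, the commands for every environment can be generated and printed as a dry run before anything is applied. This is a sketch under stated assumptions: `scope_commands` is a hypothetical helper, the scope and group names follow the naming used in the script above, and the commands use the positional argument form of current CLI releases (the legacy CLI takes `--scope`, `--principal`, and `--permission` flags instead).

```python
import shlex

ENVIRONMENTS = ["dev", "staging", "prod"]


def scope_commands(project: str, environment: str) -> list:
    """Build the CLI invocations for one environment without running them."""
    scope = f"{project}-{environment}"
    group = f"data-engineering-{environment}"
    return [
        ["databricks", "secrets", "create-scope", scope],
        ["databricks", "secrets", "put-acl", scope, group, "READ"],
    ]


if __name__ == "__main__":
    # Dry run: print each command so it can be reviewed in a pull request.
    for env in ENVIRONMENTS:
        for argv in scope_commands("myproject", env):
            print(shlex.join(argv))
```

To apply the plan rather than print it, each argument list could be passed to `subprocess.run(argv, check=True)`. No secret values appear anywhere in the plan, which matches the split between scripted structure and manual secrets.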
