Notebook Widgets: Making Databricks Notebooks Feel More Like Parameterized Queries

SQL Server DBAs know parameterized stored procedures. You pass in a date range, an environment name, a customer ID, and the procedure does the right thing based on those inputs. Logic lives in code, inputs come from outside. It's a clean pattern.

Notebooks don't work that way by default. A notebook is a script, and scripts hardcode things. The Databricks answer to parameterization is widgets — and they're worth understanding before you find yourself maintaining ten copies of the same notebook with different dates pasted into cell 3.

Widget Types

# Free-text input with a default value
dbutils.widgets.text("start_date", "2019-01-01", "Start Date")

# Dropdown with a fixed list of options
dbutils.widgets.dropdown("environment", "dev", ["dev", "staging", "prod"], "Environment")

# Combobox: dropdown where you can also type a custom value
dbutils.widgets.combobox("table_name", "orders", ["orders", "customers", "products"])

# Multi-select dropdown
dbutils.widgets.multiselect("regions", "West", ["West", "East", "Central", "South"])

In a notebook, widgets render as an input bar at the top of the page. Changing a widget value in the UI doesn't automatically re-run cells — you have to run the cell that reads it.

Reading Widget Values

start_date = dbutils.widgets.get("start_date")
env = dbutils.widgets.get("environment")

print(f"Running for {env} from {start_date}")

Widget values are always strings. If you need a date object or an integer, convert explicitly:

from datetime import date

start_dt = date.fromisoformat(dbutils.widgets.get("start_date"))
lookback_days = int(dbutils.widgets.get("lookback_days"))

Widgets and Databricks Jobs

This is where widgets become genuinely useful. When you run a notebook as a Databricks job, job parameters override widget defaults:

# In the job configuration JSON
{
  "notebook_task": {
    "notebook_path": "/pipelines/process_orders",
    "base_parameters": {
      "start_date": "2019-02-01",
      "environment": "prod"
    }
  }
}

The notebook's widget defaults are what it uses in interactive mode. The job's base_parameters are what it uses when running as a job. Same notebook, different behavior — no hardcoded values, no editing the notebook per environment.

Resetting Widget State

Widget state persists for the duration of a notebook session. If you re-run a notebook interactively with different parameters, the previous widget values are still live. Add this at the top of notebooks that get re-run frequently:

# Remove all widgets to ensure clean state
dbutils.widgets.removeAll()

# Recreate with current defaults
dbutils.widgets.text("start_date", "2019-01-01")

The Limitation: Strings Only

Widgets only pass strings. You can't pass a list, a dict, or a complex object. The workaround for lists is JSON:

import json

# Caller passes: '["orders", "customers", "products"]'
raw = dbutils.widgets.get("tables_to_process")
tables = json.loads(raw)

for table in tables:
    process_table(spark, table)

It's a little clunky, but it works and it's explicit. The alternative — maintaining separate notebooks per use case — is worse. One notebook, parameterized correctly, beats five notebooks with hardcoded values every time. As always, I'm here to help.

Read more