Notebook Widgets: Making Databricks Notebooks Feel More Like Parameterized Queries
SQL Server DBAs know parameterized stored procedures. You pass in a date range, an environment name, a customer ID, and the procedure does the right thing based on those inputs. Logic lives in code, inputs come from outside. It's a clean pattern.
Notebooks don't work that way by default. A notebook is a script, and scripts hardcode things. The Databricks answer to parameterization is widgets — and they're worth understanding before you find yourself maintaining ten copies of the same notebook with different dates pasted into cell 3.
Widget Types
# Free-text input with a default value
dbutils.widgets.text("start_date", "2019-01-01", "Start Date")
# Dropdown with a fixed list of options
dbutils.widgets.dropdown("environment", "dev", ["dev", "staging", "prod"], "Environment")
# Combobox: dropdown where you can also type a custom value
dbutils.widgets.combobox("table_name", "orders", ["orders", "customers", "products"])
# Multi-select dropdown
dbutils.widgets.multiselect("regions", "West", ["West", "East", "Central", "South"])In a notebook, widgets render as an input bar at the top of the page. Changing a widget value in the UI doesn't automatically re-run cells — you have to run the cell that reads it.
Reading Widget Values
start_date = dbutils.widgets.get("start_date")
env = dbutils.widgets.get("environment")
print(f"Running for {env} from {start_date}")Widget values are always strings. If you need a date object or an integer, convert explicitly:
from datetime import date
start_dt = date.fromisoformat(dbutils.widgets.get("start_date"))
lookback_days = int(dbutils.widgets.get("lookback_days"))Widgets and Databricks Jobs
This is where widgets become genuinely useful. When you run a notebook as a Databricks job, job parameters override widget defaults:
# In the job configuration JSON
{
"notebook_task": {
"notebook_path": "/pipelines/process_orders",
"base_parameters": {
"start_date": "2019-02-01",
"environment": "prod"
}
}
}The notebook's widget defaults are what it uses in interactive mode. The job's base_parameters are what it uses when running as a job. Same notebook, different behavior — no hardcoded values, no editing the notebook per environment.
Resetting Widget State
Widget state persists for the duration of a notebook session. If you re-run a notebook interactively with different parameters, the previous widget values are still live. Add this at the top of notebooks that get re-run frequently:
# Remove all widgets to ensure clean state
dbutils.widgets.removeAll()
# Recreate with current defaults
dbutils.widgets.text("start_date", "2019-01-01")The Limitation: Strings Only
Widgets only pass strings. You can't pass a list, a dict, or a complex object. The workaround for lists is JSON:
import json
# Caller passes: '["orders", "customers", "products"]'
raw = dbutils.widgets.get("tables_to_process")
tables = json.loads(raw)
for table in tables:
process_table(spark, table)It's a little clunky, but it works and it's explicit. The alternative — maintaining separate notebooks per use case — is worse. One notebook, parameterized correctly, beats five notebooks with hardcoded values every time. As always, I'm here to help.