Delta Live Tables in Production: What Changes When You Let Databricks Own the Graph

Shannon Lowder

06 May 2022 — 2 min read

Delta Live Tables went GA in February. I've been running it in production for a couple of months now, and the experience is different from the preview. The framework has stabilized, the documentation has caught up to reality, and I have enough production hours on it to tell you where it actually shines and where you'll hit friction.

What GA Brought

The major additions since preview: enhanced autoscaling (DLT can now scale clusters up and down based on pipeline load, not just the fixed cluster size you configure), expectations tracking in the pipeline UI with historical pass/fail rates per rule, and better error messages when a pipeline fails mid-run. The debugging experience in preview was rough; it's meaningfully better now.
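Enhanced autoscaling is switched on in the pipeline's cluster settings rather than in your notebook code. Here is a minimal sketch of what that settings payload can look like, built as a Python dict so it can be dumped to JSON. The pipeline name, notebook path, and worker counts are placeholders I made up for illustration, so check the field names against the pipeline settings reference for your workspace before relying on them.

```python
import json

# Hypothetical pipeline settings: name, path, and worker counts are placeholders.
pipeline_settings = {
    "name": "orders_pipeline",
    "continuous": False,   # triggered runs rather than an always-on pipeline
    "development": False,  # run in production mode
    "libraries": [{"notebook": {"path": "/Repos/example/orders_dlt"}}],
    "clusters": [
        {
            "label": "default",
            "autoscale": {
                "min_workers": 1,
                "max_workers": 5,
                "mode": "ENHANCED",  # opt in to enhanced autoscaling
            },
        }
    ],
}

# Serialize for pasting into the pipeline's JSON settings editor or an API call.
settings_json = json.dumps(pipeline_settings, indent=2)
print(settings_json)
```

The point is only that autoscaling bounds live with the pipeline definition, so they are versioned and reviewed alongside it.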

The Execution Model in Production

DLT pipelines run in their own compute environment — a DLT cluster, separate from your interactive and job clusters. You configure the cluster at the pipeline level, and DLT manages startup, termination, and recovery. The pipeline itself is stateful between runs: DLT maintains checkpoint state for streaming tables, and it tracks which live tables need to be refreshed based on the dependencies declared in your code.

import dlt
from pyspark.sql.functions import col, expr, current_timestamp

# Streaming source — bronze layer
@dlt.table(
    name="orders_bronze",
    table_properties={
        "delta.autoOptimize.optimizeWrite": "true",
        "pipelines.reset.allowed": "false"  # Prevent accidental full reprocess
    }
)
def orders_bronze():
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/schema")
        .load("/mnt/raw/orders/")
    )

# Silver — with quality expectations
@dlt.table(name="orders_silver")
@dlt.expect_or_drop("order_id_not_null", "order_id IS NOT NULL")
@dlt.expect_or_drop("positive_amount", "order_amount > 0")
@dlt.expect("known_status", "status IN ('PENDING', 'CONFIRMED', 'SHIPPED', 'CANCELLED')")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
        .withColumn("order_date", expr("CAST(order_date_str AS DATE)"))
        .withColumn("order_amount", col("order_amount_str").cast("decimal(18,2)"))
        .withColumn("_processed_at", current_timestamp())
        .drop("order_date_str", "order_amount_str")
    )

Expectations in Practice

The three expectation modes cover the main cases:

@dlt.expect — log violations, don't stop the pipeline, don't drop records. Use for warnings.
@dlt.expect_or_drop — drop violating records before they land in the table. Use for rows that would corrupt downstream.
@dlt.expect_or_fail — halt the pipeline entirely if the expectation fails. Use for critical invariants (wrong source format, completely unexpected schema).

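@dlt.table(name="orders_silver_strict")
@dlt.expect_or_fail(
    "schema_version_check",
    "schema_version = '2.0'"  # Halt if we receive old-schema messages
)
@dlt.expect_or_drop("valid_customer_id", "LENGTH(customer_id) = 10")
# Caution: expectation constraints are row-level SQL expressions. A subquery
# against another table may be rejected; if it is, join to dim_customers
# upstream and check the joined key for NULL instead.
@dlt.expect("referential_integrity_warning",
            "customer_id IN (SELECT customer_id FROM LIVE.dim_customers)")
def orders_silver_strict():
    return dlt.read_stream("orders_bronze")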
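To make the difference between the three modes concrete, here is a plain-Python model of what each one does to a batch of rows. This is an illustration of the semantics described above, not how DLT implements them: the function name, the "warn"/"drop"/"fail" labels, and the use of Python lambdas in place of SQL constraints are all my own stand-ins.

```python
class ExpectationFailed(Exception):
    """Raised when a 'fail' rule is violated, mimicking a halted update."""


def apply_expectations(rows, rules):
    """Apply (name, mode, predicate) rules to rows.

    mode is 'warn' (log only), 'drop' (discard the row), or 'fail' (stop).
    Returns the rows that survive plus a violation count per rule.
    """
    violations = {name: 0 for name, _mode, _predicate in rules}
    kept = []
    for row in rows:
        keep = True
        for name, mode, predicate in rules:
            if predicate(row):
                continue
            violations[name] += 1          # every mode records the violation
            if mode == "fail":
                raise ExpectationFailed(name)
            if mode == "drop":
                keep = False               # row is excluded from the output
        if keep:
            kept.append(row)
    return kept, violations


KNOWN_STATUSES = {"PENDING", "CONFIRMED", "SHIPPED", "CANCELLED"}

rules = [
    ("order_id_not_null", "drop", lambda r: r["order_id"] is not None),
    ("positive_amount", "drop", lambda r: r["order_amount"] > 0),
    ("known_status", "warn", lambda r: r["status"] in KNOWN_STATUSES),
]

rows = [
    {"order_id": 1, "order_amount": 25.0, "status": "PENDING"},
    {"order_id": None, "order_amount": 10.0, "status": "SHIPPED"},
    {"order_id": 3, "order_amount": 5.0, "status": "RETURNED"},
]

kept, violations = apply_expectations(rows, rules)
print([r["order_id"] for r in kept])  # the null-id row is dropped
print(violations)                      # the unknown status is counted, not dropped
```

The row with the unknown status still lands in the output; it only shows up in the violation counts. That is the behavior to keep in mind when deciding whether a rule should be a warning or a drop.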

The Live Table Reference Pattern

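Within a DLT pipeline, reference other tables in the same pipeline through DLT itself: the LIVE. prefix in SQL (or spark.table("LIVE.<name>") in Python), or dlt.read() and dlt.read_stream() as in the example below. This is what DLT uses to build the dependency DAG. If you use a regular table reference instead, DLT won't know about the dependency and won't guarantee execution order.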

@dlt.table(name="order_customer_joined")
def order_customer_joined():
    orders = dlt.read("orders_silver")
    customers = dlt.read("dim_customers")
    return orders.join(customers, "customer_id")
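If it helps to see why declared dependencies matter, here is a small sketch of the ordering problem using Python's standard graphlib module. It is a model of the idea, not DLT's scheduler: the dependency map is written out by hand from the tables in this post, and I'm assuming dim_customers has no upstream table in the pipeline.

```python
from graphlib import TopologicalSorter

# Each table maps to the set of tables it reads from.
# dim_customers is assumed to have no upstream dependency in this pipeline.
dependencies = {
    "orders_bronze": set(),
    "dim_customers": set(),
    "orders_silver": {"orders_bronze"},
    "order_customer_joined": {"orders_silver", "dim_customers"},
}

# static_order() yields every table after all of the tables it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Drop the "orders_silver" entry from order_customer_joined's set and nothing forces the join to wait for silver anymore. That is the situation you create when you bypass the LIVE reference.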

Where DLT Still Has Gaps

Local testing is still the main pain point. The dlt module is not available outside a running DLT pipeline context, which means you can't unit test your transformation functions without either mocking the DLT API or restructuring your code to separate the pure transformation logic from the DLT decorators. I do the latter — functions that take DataFrames and return DataFrames, tested independently, then wrapped with decorators for DLT. It works but it requires discipline. As always, I'm here to help.
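For the mocking route, here is a minimal sketch of a stand-in dlt module that lets pipeline code import and run locally. Everything in it is my own scaffolding: make_dlt_stub and registered_tables are hypothetical names, the decorators only pass functions through without enforcing any expectation, and the list-of-dicts transformation is a placeholder for logic that would take and return a DataFrame in a real pipeline.

```python
import sys
import types


def make_dlt_stub():
    """Build a stand-in `dlt` module whose decorators leave functions untouched."""
    stub = types.ModuleType("dlt")
    stub.registered_tables = {}  # hypothetical helper: table name -> function

    def table(_func=None, *, name=None, **_kwargs):
        def wrap(func):
            stub.registered_tables[name or func.__name__] = func
            return func
        # Support both @dlt.table and @dlt.table(name=...)
        return wrap(_func) if callable(_func) else wrap

    def passthrough(*_args, **_kwargs):
        # Expectations are accepted but NOT enforced by this stub.
        return lambda func: func

    stub.table = table
    stub.expect = passthrough
    stub.expect_or_drop = passthrough
    stub.expect_or_fail = passthrough
    return stub


# Install the stub before any pipeline code runs `import dlt`.
sys.modules["dlt"] = make_dlt_stub()

import dlt  # resolves to the stub above


def add_processed_flag(rows):
    """Pure logic, testable on its own (a DataFrame in, DataFrame out in real code)."""
    return [dict(row, processed=True) for row in rows]


@dlt.table(name="orders_silver")
@dlt.expect_or_drop("positive_amount", "order_amount > 0")
def orders_silver():
    return add_processed_flag([{"order_id": 1, "order_amount": 25.0}])


print(orders_silver())
print(sorted(dlt.registered_tables))
```

A stub like this only proves the module imports and the functions run. It tells you nothing about whether your expectations behave correctly, which is part of why I still prefer keeping the transformation logic separate and testing that directly.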
