Adaptive Query Execution in Spark 3.0: What the Optimizer Finally Does Automatically
Spark 3.0 shipped in June, and Databricks Runtime 7.x brought it to production workspaces. The headlining feature — Adaptive Query Execution — has been running in my pipelines for a few months now. Here's what it actually changed versus what I expected.
What AQE Changed in Practice
The biggest visible difference: shuffle partition counts are no longer something I tune manually per query. Before Spark 3.0, I had a spreadsheet of optimal spark.sql.shuffle.partitions values for different pipeline stages — 50 for the daily aggregation, 200 for the large joins, 20 for the small dimension refreshes. With AQE enabled, I set 200 globally and let AQE coalesce down based on actual data sizes.
# Spark 3.0 + AQE: set these once per session
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
# Start high, let AQE reduce
spark.conf.set("spark.sql.shuffle.partitions", "200")For pipelines that process varying data volumes day-to-day (end-of-month loads are 5x normal), AQE adjusts automatically. Monday's job processes 10 partitions of real work; month-end processes all 200. Previously, I'd either over-provision for month-end and waste resources every other day, or tune for the average and run slow on month-end.
The Join Strategy Switching
AQE's dynamic join strategy selection changed one specific query I had been fighting for months. It was a star schema join — a large fact table joined against a dimension table that was "small-ish" (40MB). The Spark 2.x optimizer wouldn't broadcast it because it was over the 10MB threshold, even though 40MB is trivially broadcastable on a 4-worker cluster.
With AQE, Spark now checks actual post-shuffle sizes and switches to broadcast when appropriate, regardless of the pre-shuffle estimate. That query went from a 45-minute sort-merge join to a 6-minute broadcast join without any code changes.
Skew Handling
-- AQE detects and handles this automatically in Spark 3.0
-- No manual skew hints required (though they still work if you need them)
SELECT o.region, COUNT(*) as order_count
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
GROUP BY o.regionIf the "West" region has 10x the data of any other region, AQE detects the skewed partition after the shuffle and splits it for processing. In Spark 2.x, one executor would spend 10 minutes processing West while 19 others finished in 1 minute and sat idle. With AQE, the skewed partition is split and distributed.
What Didn't Change
AQE doesn't fix partitioning strategy. If your Delta table has the wrong partition column for your queries, AQE doesn't help — you still need to partition by something you filter on. Data skipping still requires Z-ordering or well-chosen partition columns. The physical data layout decisions are still yours to make.
Think of AQE as a smart runtime that handles the stuff you shouldn't have to tune manually. The stuff that requires understanding your data and your query patterns is still yours. As always, I'm here to help.