Spark 3.0 Preview: What Adaptive Query Execution Changes for SQL Pros

Shannon Lowder

13 Mar 2020 — 2 min read

Spark 3.0 is in preview and the feature I'm watching most closely is Adaptive Query Execution. If it ships as documented, it addresses one of the most common performance frustrations in Spark 2.x: the query optimizer making bad choices because it doesn't know your data's actual distribution at plan time.

The Problem AQE Solves

In Spark 2.x, the query optimizer makes its decisions (join strategy, partition count, skew handling) based on statistics collected when the table was last analyzed. If you ran ANALYZE TABLE a month ago and your data has changed significantly since then, the optimizer is working from stale numbers.

The most common consequence: a shuffle operation produces 200 output partitions (the default) even though 195 of them have almost no data. Your query then runs 200 tasks when 5 would have been sufficient, wasting cluster time on empty work.

What AQE Changes

Adaptive Query Execution monitors actual shuffle output at runtime and adjusts the plan accordingly. Three specific behaviors:

Dynamic partition coalescing — after a shuffle, instead of running all 200 downstream tasks, AQE looks at the actual partition sizes and merges small partitions together. If 150 of your 200 partitions are under 1MB, AQE coalesces them into a smaller number of reasonably-sized tasks.

# With AQE enabled, this setting becomes a maximum, not a fixed value
spark.conf.set("spark.sql.shuffle.partitions", "200")
# AQE may coalesce those 200 down to 20 based on actual data size

Dynamic join strategy switching — the optimizer may plan a sort-merge join, but AQE can switch to a broadcast join at runtime if it discovers that one side of the join is actually small enough to broadcast. This means you get broadcast join performance without having to know your data size at plan time.

Automatic skew handling — if AQE detects that one partition in a join contains significantly more data than others (the classic data skew problem), it splits that partition and applies the join on the splits. The executor that would have been overwhelmed by the skewed partition now processes it in multiple smaller chunks.

How to Enable It

# Available in Spark 3.0 / Databricks Runtime 7.x
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")

Once Databricks Runtime 7.x ships with Spark 3.0, these will likely be defaults. For now in preview, you enable them explicitly per session or per cluster.

The tuning implication: with AQE, the specific value of spark.sql.shuffle.partitions matters less. You're setting a starting point and AQE adjusts from there. The recommendation is to set it higher than you think you need and let AQE coalesce down, rather than trying to predict the right value yourself. As always, I'm here to help.

The Context Problem Neither Agent Mesh Nor OpenSharing Solves

I wrote recently about Azure Agent Mesh and OpenSharing — two infrastructure layers that between them cover how enterprises register, discover, share, and execute agents. Between them, they address a lot of the plumbing that has been missing from the enterprise agent stack. But there's a gap neither of

Unity AI Gateway and What a Governed Model Access Layer Actually Buys You

Unity AI Gateway, announced at DAIS this week, is the feature I've been waiting for since Agent Bricks shipped last year. It's a centralized governance layer for model access in Databricks — you configure which models are approved for use in your environment, who can call them,

You Don't Need Fable. You Need a Router.

The performance gap between open-weight models and closed frontier models has spent the last year collapsing faster than anyone predicted. Epoch AI's tracking puts open weights at roughly a three-to-four-month lag behind state-of-the-art closed models on average. For coding tasks, the gap has effectively closed — DeepSeek V3.2

DAIS 2026: Genie One and the Context Problem Databricks Is Solving

The central message from DAIS this week, delivered by Ali Ghodsi in the opening keynote, was direct: AI doesn't have an intelligence problem, it has a context problem. If your CFO can't get an AI system to explain why margins changed, that's not a