Year Two on Databricks: What I Would Tell 2018 Me

Shannon Lowder

06 Dec 2019 — 2 min read

Two years ago I wrote a post about moving from SQL Server to Databricks — what skills transfer, where the mental model breaks down, what to learn first. Most of it still holds. But there are things I would change if I were starting that journey today, and things I wish I'd done differently in year one.

What I Got Right

Starting with SQL, not Python. Spark SQL is a real SQL dialect, and it covers roughly 80% of what you need for data engineering. Because the documentation focuses on PySpark DataFrames, the instinct is to learn them immediately; for SQL professionals, that instinct is wrong. Start where you're strong. You can add Python idioms incrementally as you hit the limits of what SQL can express.

Learning partitioning early. The partitioning decision is probably the highest-leverage architectural choice in Databricks. Getting it right at the start of a project saves enormous pain later. I spent a lot of year one getting this right through trial and error; reading more carefully about partition pruning and data skipping up front would have saved months.
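To make partition pruning concrete, here is a toy model in plain Python rather than Spark. It assumes a table partitioned by a hypothetical event_date column, with one directory per partition value; the file names and the prune helper are illustrative, not a Databricks API. The point is only that a filter on the partition column lets the reader skip whole directories without opening them.

```python
# Toy model of partition pruning. Plain Python, not Spark.
# The table layout, file names, and prune() helper are hypothetical.

# A table partitioned by event_date: one directory per partition value.
table_files = {
    "event_date=2019-12-01": ["part-000.parquet", "part-001.parquet"],
    "event_date=2019-12-02": ["part-000.parquet"],
    "event_date=2019-12-03": ["part-000.parquet", "part-001.parquet", "part-002.parquet"],
}


def prune(files_by_partition, column, wanted_values):
    """Return only the files in partitions whose key matches the filter."""
    selected = []
    for partition, files in files_by_partition.items():
        key, _, value = partition.partition("=")
        if key == column and value in wanted_values:
            selected.extend(f"{partition}/{name}" for name in files)
    return selected


all_files = [f"{p}/{n}" for p, names in table_files.items() for n in names]

# Equivalent in spirit to: WHERE event_date = '2019-12-02'
scanned = prune(table_files, "event_date", {"2019-12-02"})

print(f"files in table: {len(all_files)}, files scanned: {len(scanned)}")
```

A filter on any column other than the partition key gets no such shortcut in this model, which is why the choice of partition column matters so much up front.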

Using Delta from the start. I know some teams started with plain Parquet and planned to migrate to Delta later. That migration is annoying. Delta's overhead is negligible and the benefits — ACID writes, time travel, schema enforcement — are valuable from day one. Start with Delta.

What I'd Do Differently

Set up cluster policies earlier. We let developers create clusters without constraints for the first four months. Cloud bills were... higher than they needed to be. Policies are a 30-minute setup that pays for itself in the first billing cycle.
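As a rough sketch of what such a policy can look like, the snippet below builds a policy definition as a Python dict and serializes it to JSON. The attribute names and the range and allowlist rule types follow the Databricks cluster policy definition format as I understand it; the specific limits and node types are placeholder assumptions, and the violations helper is a simplified local check for illustration, not how Databricks enforces policies.

```python
import json

# Hypothetical policy definition. Limits and node types are placeholders;
# adjust them to your own cloud and budget.
policy_definition = {
    "autotermination_minutes": {
        "type": "range",
        "minValue": 10,
        "maxValue": 60,
        "defaultValue": 30,
    },
    "autoscale.max_workers": {"type": "range", "maxValue": 8},
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_DS3_v2", "Standard_DS4_v2"],
    },
}


def violations(request, definition):
    """Simplified local check of a flattened cluster request against the policy."""
    problems = []
    for attr, rule in definition.items():
        value = request.get(attr)
        if rule["type"] == "range":
            low = rule.get("minValue", float("-inf"))
            high = rule.get("maxValue", float("inf"))
            if value is None or not (low <= value <= high):
                problems.append(attr)
        elif rule["type"] == "allowlist":
            if value not in rule["values"]:
                problems.append(attr)
    return problems


# A request of the kind that inflates bills: no auto-termination, 40 workers.
request = {
    "autotermination_minutes": 0,
    "autoscale.max_workers": 40,
    "node_type_id": "Standard_DS3_v2",
}

print(json.dumps(policy_definition, indent=2))
print(violations(request, policy_definition))
```

The two rules doing most of the work here are the auto-termination range and the worker cap; the node allowlist is a secondary guard.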

Build observability into pipelines from the start. We retrofitted MLflow tracking into pipelines that were already in production, and retrofitting is harder than instrumenting a pipeline when you first build it. Every pipeline should log row counts, run times, and data quality metrics from its first production run.
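A minimal sketch of that habit, using only the standard library so it runs anywhere. The tracked_run helper, the sample rows, and the metric names are all hypothetical; in a real pipeline the collected values could be forwarded to MLflow, for example with mlflow.log_metric(key, value), instead of being appended to a list.

```python
import time
from contextlib import contextmanager


@contextmanager
def tracked_run(name, sink):
    """Collect metrics for one pipeline run and append them to `sink`."""
    metrics = {}
    start = time.perf_counter()
    try:
        yield metrics
    finally:
        metrics["run_seconds"] = time.perf_counter() - start
        # In a Databricks pipeline, this is where each value could be sent
        # to MLflow, e.g. mlflow.log_metric(key, value).
        sink.append({"run": name, **metrics})


# Hypothetical input rows standing in for a real DataFrame.
rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 35.5},
]

run_log = []
with tracked_run("orders_daily", run_log) as m:
    valid = [r for r in rows if r["amount"] is not None]
    m["rows_in"] = len(rows)                 # row count before cleaning
    m["rows_out"] = len(valid)               # row count after cleaning
    m["null_amount_ratio"] = (len(rows) - len(valid)) / len(rows)

print(run_log[0])
```

Because the run time is recorded in the finally clause, a failed run still leaves a record behind, which is often the run you most want to see.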

Establish the bronze-silver-gold pattern before letting teams build ad hoc. We had six months of everyone writing data wherever it made sense to them before we standardized on the medallion architecture. Migrating existing tables to the right layers is more work than building into the right structure from the beginning.

What's Changed Since Year One

The Delta Lake open-source release is the biggest shift. It went from "Databricks feature" to "industry standard format." The ecosystem is catching up fast — Hive, Presto, and other query engines are working on Delta compatibility. The format bet paid off.

Databricks itself has matured: better job scheduling, better cluster management, better monitoring. The rough edges from 2018 are smoother. The platform is worth betting on, but now with much clearer evidence than we had two years ago.

Year three starts here. The fundamentals are in place. What comes next is building on top of them. As always, I'm here to help.
