ADF and the Missing Git Story: Two Years of Waiting

Azure Data Factory has been generally available for over a year. It still has no git integration. I want to be precise about what that means, because "no git integration" sounds like a developer convenience feature. It isn't. It's a production operations problem.

What the Gap Actually Costs

Let me tell you about a specific incident from last November.

We had a production pipeline failing because a source table had a new column that was being rejected by the destination schema. Standard schema drift issue. Our on-call engineer logged into the ADF portal, found the affected dataset definition, updated the column mapping, and the pipeline resumed. Crisis averted at 11pm. Good work.

Three weeks later, we deployed a batch of new pipelines from our git repository. The deployment script uses -Force, which overwrites existing definitions. The fix that was applied at 11pm — the one that lived only in the portal and never got exported to git — was overwritten. The pipeline failed again. This time in the middle of a business day.

That's the gap. Not "it would be nice to see history in git." The gap is: changes made under pressure don't make it back to source, and the next deployment erases them.
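A guard against that failure mode can be sketched in a few lines of Python. This is a minimal illustration, not our deployment script: `find_portal_drift` and both of its inputs are hypothetical, and it assumes you can get two snapshots as parsed JSON, one from the commit you last deployed and one from what is live in the factory right now. Anything live that differs from the last deployed copy was edited in the portal and would be erased by a -Force deploy.

```python
import json


def canonical(definition):
    """Serialize with sorted keys so key order alone never counts as a change."""
    return json.dumps(definition, sort_keys=True, separators=(",", ":"))


def find_portal_drift(deployed_defs, live_defs):
    """Return names whose live definition differs from the last deployed copy.

    Both arguments map resource name -> parsed JSON definition (hypothetical
    inputs; how you obtain them is up to your tooling). Resources that exist
    only in the deployed snapshot are ignored: nothing in the portal would be lost.
    """
    drifted = []
    for name, live in sorted(live_defs.items()):
        committed = deployed_defs.get(name)
        if committed is None or canonical(committed) != canonical(live):
            drifted.append(name)
    return drifted


# Example: the 11pm fix added a column that never reached git.
deployed = {"OrdersDataset": {"structure": ["id", "total"]}}
live = {"OrdersDataset": {"structure": ["id", "total", "region"]}}

drift = find_portal_drift(deployed, live)
if drift:
    print("Refusing to deploy; export these first: " + ", ".join(drift))
```

The important design choice is the baseline: compare live against what was last deployed, not against the commit you are about to deploy, or every intentional change will show up as drift.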

The Discipline Tax

The workaround is discipline. Every change to a pipeline must be exported as JSON and committed to git before the session ends. We've added it to our runbooks, our on-call documentation, our new team member onboarding. It's a rule that requires human enforcement on every single change.

Rules that require human enforcement on every change fail at the rate humans fail under pressure. That rate is not zero.

The export process itself is manual: navigate to the pipeline in the portal, find the export button (it has moved twice in two years), download the JSON, commit it. For a dataset change you also export the dataset definition separately. For a linked service change, same. If your pipeline touches three datasets, that's four separate exports.
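To make the "three datasets, four exports" arithmetic concrete, here is a small Python sketch that lists what has to be exported for one pipeline change. It assumes the pipeline JSON keeps its activities under `properties.activities`, each with `inputs` and `outputs` lists of `{"name": ...}` references. Check that against your own exported files before relying on it. The function name and the sample pipeline are made up for illustration.

```python
def exports_for_pipeline(pipeline):
    """List the (kind, name) pairs to export when one pipeline changes.

    Assumes activities carry "inputs"/"outputs" lists of {"name": ...}
    dataset references; linked services are not followed here.
    """
    datasets = set()
    for activity in pipeline.get("properties", {}).get("activities", []):
        for ref in activity.get("inputs", []) + activity.get("outputs", []):
            datasets.add(ref["name"])
    return [("pipeline", pipeline["name"])] + [("dataset", d) for d in sorted(datasets)]


# Hypothetical pipeline touching three datasets.
sample = {
    "name": "LoadSales",
    "properties": {
        "activities": [
            {
                "name": "JoinAndCopy",
                "inputs": [{"name": "Orders"}, {"name": "Customers"}],
                "outputs": [{"name": "SalesFact"}],
            }
        ]
    },
}

for kind, name in exports_for_pipeline(sample):
    print(kind, name)
```

One pipeline plus three datasets comes out as four items, which is four trips through the portal when done by hand.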

We've automated part of this with PowerShell scripts that pull all pipeline and dataset definitions via the ADF REST API and commit them to git on a schedule. It's better. It's still not the same as the tool knowing where its source lives.
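The shape of that scheduled job is simple enough to sketch. Our scripts are PowerShell; the version below is Python purely for illustration, and it deliberately leaves out the REST calls: it assumes some other step has already fetched the definitions into a dictionary keyed by a hypothetical `(kind, name)` pair. What it shows is the part that makes the git history useful, which is writing each definition in a stable format and touching only the files that changed.

```python
import json
from pathlib import Path


def write_snapshot(definitions, repo_dir):
    """Write each definition to <repo_dir>/<kind>/<name>.json.

    `definitions` maps (kind, name) -> parsed JSON dict, for example
    ("pipeline", "LoadSales"). Returns the paths that were created or
    changed, so the caller can skip the commit when the list is empty.
    """
    changed = []
    for (kind, name), definition in sorted(definitions.items()):
        path = Path(repo_dir) / kind / f"{name}.json"
        # Sorted keys and fixed indentation keep diffs limited to real changes.
        text = json.dumps(definition, indent=2, sort_keys=True) + "\n"
        if path.exists() and path.read_text(encoding="utf-8") == text:
            continue
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
        changed.append(path)
    return changed
```

A scheduler would call `write_snapshot` after each fetch and run `git add` and `git commit` only when the returned list is non-empty, so quiet nights produce no commits and a portal edit shows up as a readable diff.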

Compare to What Else Exists

SSIS has been around since 2005. SSIS packages are files — .dtsx files — that live in a Visual Studio project, which lives in a folder, which you put in git from day one. Source control is not a feature of SSIS; it's a consequence of SSIS being file-based. There is no "SSIS portal" where you edit packages. You edit files. You commit files.

Databricks notebooks got git sync much earlier than ADF got anything. The Databricks workspace model has its problems, but at least notebooks can be exported as Python or SQL files and tracked in source control, and more recently Databricks added native git folder sync. A younger product with a more coherent source control story.

SQL Server Data Tools — Microsoft's own product — is built around the dacpac model where your database schema lives in a project in git and deployments are generated from that project. Microsoft knows how to do this. They chose not to do it for ADF.

What Microsoft Apparently Says

ADF v2, which is in preview, has git integration on the roadmap. I've seen this stated in blog posts and in responses to feedback on the Azure feedback portal, where the git integration request has been in the top ten most-voted items for two years.

Here's my position: "on the roadmap" has been the answer since 2014. The product shipped without it and has remained without it through general availability. At some point "on the roadmap" stops being a promise and starts being a pattern. The pattern is: this wasn't important enough to block GA, and it keeps not being important enough to prioritize over other features.

I hope v2 changes this. I'm not holding production pipeline changes waiting for it.

The Practical Advice

Until native git integration ships, here's what works:

Automate the export: use the ADF REST API to pull all pipeline, dataset, and linked service definitions on a schedule and commit to git. This is your safety net, not your workflow.
Enforce the rule in code review: any ADF change goes through a PR that includes the exported JSON update. No JSON, no approval.
Deploy from git, always: never leave a portal change uncommitted. If you made a portal change under pressure, export it and commit it before you close the browser tab.
Treat portal edits as temporary: the portal is for testing and debugging. Git is for production truth.

None of this is as good as the tool understanding where its source lives. But it's what we have. If you're running ADF in production and you haven't built this discipline yet, do it before the incident that makes you wish you had. I'm here to help.

Shannon Lowder
