Setting Up Your First ADF Environment: What the Portal Expects From You

The ADF portal makes setup look deceptively simple. Create a data factory, click around, start building pipelines. What it doesn't tell you upfront: there are prerequisites, gotchas with on-premises connectivity, and a browser-based editor that will cost you work if you don't build a local JSON workflow immediately. Let's do this right the first time.

Azure Prerequisites

Before you touch the portal, make sure you have these:

Azure subscription with at least Contributor role on the resource group where you'll create the factory. Reader won't cut it: you need write access to create the factory and deploy into it.
Storage account for staging and script storage. Standard LRS in the same region as your factory. Putting ADF and the storage account in different regions incurs cross-region data transfer charges.
Resource provider registration: In your subscription, navigate to Resource Providers and confirm Microsoft.DataFactory is registered. New subscriptions may not have it pre-registered. This is the first thing to check when the portal gives you a cryptic "resource not found" error on factory creation.
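
The provider check can also be scripted. A minimal sketch, assuming the AzureRM PowerShell modules are installed and you are already signed in with Login-AzureRmAccount; it registers the provider only when it is missing or not yet registered:

```powershell
# Check whether Microsoft.DataFactory is registered in the current
# subscription, and register it if it is not.
$Namespace = "Microsoft.DataFactory"

$Provider = Get-AzureRmResourceProvider -ProviderNamespace $Namespace

# One entry comes back per resource type; flag any that are not registered.
$Unregistered = $Provider | Where-Object { $_.RegistrationState -ne "Registered" }

if (-not $Provider -or $Unregistered) {
    Register-AzureRmResourceProvider -ProviderNamespace $Namespace
} else {
    Write-Output "$Namespace is already registered."
}
```

Registration is asynchronous, so allow a few minutes and re-run the check before creating the factory.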

Creating the Data Factory

Portal path: New > Data + Analytics > Data Factory. Name it something environment-specific — myproject-dev-adf rather than myproject-adf. You cannot rename a data factory after creation, and you will want separate dev/prod factories. Region matters for data residency and latency — pick the region where your data sources live.

After creation, you land on the factory's Overview blade. The Author and Deploy section is where all the JSON editing happens. Resist the urge to start clicking — read the rest of this post first.
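
The same factory can also be created from PowerShell, which fits the local workflow recommended later in this post. A minimal sketch, assuming the resource group already exists and you are signed in; the location value is a placeholder, not a recommendation:

```powershell
# Names reuse the dev naming convention from this post.
$ResourceGroup   = "myproject-dev-rg"
$DataFactoryName = "myproject-dev-adf"
$Location        = "West US"   # assumption: substitute a region where ADF is offered

New-AzureRmDataFactory `
  -ResourceGroupName $ResourceGroup `
  -Name $DataFactoryName `
  -Location $Location
```

Factory names must be globally unique, so expect a name collision error if someone else already holds the name you picked.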

Your First Linked Service

Azure Blob Storage is the easiest starting point. In Author and Deploy, click New data store > Azure Storage. The portal generates a JSON template. Fill in your account name and key:

{
  "name": "AzureStorageLinkedService",
  "properties": {
    "type": "AzureStorage",
    "description": "Primary storage account for pipeline staging",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=mystorageacct;AccountKey=YOUR_KEY_HERE"
    }
  }
}

Click Deploy. The linked service is now live in your factory. No confirmation step, no staging environment: the change takes effect immediately in whichever factory you are editing. This is the first sign that the portal's authoring model is not designed for teams.

Data Management Gateway: On-Premises Connectivity

If any of your sources are on-premises (SQL Server, Oracle, file shares), you need the Data Management Gateway. This is a lightweight Windows agent that installs on a machine in your network and creates an outbound HTTPS tunnel to ADF. No inbound firewall rules required — the gateway initiates the connection.

Installation steps:

In the portal, go to Author and Deploy > New data store > select an on-premises type (e.g., On-Premises SQL Server)
ADF prompts you to create a gateway. Name it something meaningful: prod-gateway-01
Copy the registration key from the portal
Download the gateway installer from Microsoft's download center and run it on your gateway machine
During installation, paste the registration key when prompted
The gateway registers with ADF and shows as Connected in the portal within a minute or two
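
If you would rather confirm the registration from a script than from the portal, the gateway can be queried with PowerShell. A minimal sketch, assuming the gateway name from the steps above and the dev factory names used elsewhere in this post; the exact status strings the cmdlet reports are not listed here, so check them in your own environment:

```powershell
# Assumed names; adjust to your resource group, factory, and gateway.
$ResourceGroup   = "myproject-dev-rg"
$DataFactoryName = "myproject-dev-adf"
$GatewayName     = "prod-gateway-01"

$Gateway = Get-AzureRmDataFactoryGateway `
  -ResourceGroupName $ResourceGroup `
  -DataFactoryName $DataFactoryName `
  -Name $GatewayName

# Print every field returned for the gateway, including its current status.
$Gateway | Format-List
```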

Gateway machine requirements: Windows Server 2008 R2 or later, .NET 4.5+, 2 GB RAM minimum (4 GB recommended for production), outbound HTTPS (port 443) to *.servicebus.windows.net and *.core.windows.net. If your network has a proxy, configure it in the gateway manager before registering.
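
Before registering, it can save time to verify outbound 443 from the gateway machine itself. A minimal sketch using a plain TCP connect so it also runs on older PowerShell versions. Both host names are placeholders: the storage host reuses the sample account name from this post, and the Service Bus host is hypothetical.

```powershell
# Hosts to probe on port 443 (placeholders, replace with real endpoints).
$Hosts = @(
    "mystorageacct.blob.core.windows.net",
    "example-namespace.servicebus.windows.net"
)

foreach ($HostName in $Hosts) {
    $Client = New-Object System.Net.Sockets.TcpClient
    try {
        # Throws if DNS resolution or the TCP handshake fails.
        $Client.Connect($HostName, 443)
        Write-Output "$HostName : reachable on 443"
    } catch {
        Write-Output "$HostName : FAILED ($($_.Exception.Message))"
    } finally {
        $Client.Close()
    }
}
```

This is a direct connection test and does not go through a proxy. On a network where all outbound traffic must use a proxy, a failure here does not necessarily mean the gateway will fail once its proxy settings are configured.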

The Browser Editor Problem

Here is the honest assessment of ADF's browser-based JSON editor: it is a trap for anyone planning to run ADF in production beyond a proof of concept.

There is no autosave. Close the tab while editing and your changes are gone. There is no version history — deploy a change and the previous version is overwritten silently. There is no diff — you cannot see what changed between two deploys. There is no environment promotion — changes made in the portal go directly to the factory, dev or prod.

My recommendation from day one: treat the portal as a read-only monitoring interface. Do all JSON authoring locally.

The workflow that actually works:

Create a git repository (local or hosted) for your factory's JSON files
Write and edit JSON in VS Code or any text editor with JSON schema support
Deploy via PowerShell using the AzureRM.DataFactories module (or Azure CLI)
Use the portal only for monitoring pipeline runs and diagnosing failures

# Deploy a linked service via PowerShell
# Assumes you are signed in (Login-AzureRmAccount) with the target subscription selected.
$ResourceGroup = "myproject-dev-rg"
$DataFactoryName = "myproject-dev-adf"

# The linked service name is read from the JSON file.
New-AzureRmDataFactoryLinkedService `
  -ResourceGroupName $ResourceGroup `
  -DataFactoryName $DataFactoryName `
  -File ".\linkedservices\AzureStorageLinkedService.json" `
  -Force

The -Force flag overwrites an existing linked service with the same name without asking. Without it, the cmdlet asks for confirmation before replacing, which stalls or fails an unattended deploy. Use -Force deliberately, knowing that it replaces the existing definition silently.
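
Once the repository holds more than one file, the same call can be looped over a folder. A minimal sketch, assuming every linked service lives as a .json file under .\linkedservices. Each file is parsed locally first so a malformed file stops the run before anything reaches the factory.

```powershell
$ResourceGroup   = "myproject-dev-rg"
$DataFactoryName = "myproject-dev-adf"

foreach ($File in Get-ChildItem -Path ".\linkedservices" -Filter "*.json") {
    # Parse the file locally; abort on invalid JSON.
    $Json = Get-Content -Path $File.FullName -Raw
    try {
        $null = ConvertFrom-Json -InputObject $Json -ErrorAction Stop
    } catch {
        throw "Invalid JSON in $($File.FullName): $($_.Exception.Message)"
    }

    # Deploy, replacing any existing linked service of the same name.
    New-AzureRmDataFactoryLinkedService `
      -ResourceGroupName $ResourceGroup `
      -DataFactoryName $DataFactoryName `
      -File $File.FullName `
      -Force
}
```

The parse step only catches syntax errors. It does not validate the file against the ADF schema, so a well-formed file with a wrong property name will still fail at deploy time.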

Environment Separation

Create separate data factories for dev and prod. The JSON files are identical except for the linked service definitions (dev storage account vs. prod storage account, dev database vs. prod database). Keep linked service JSON files in environment-specific folders and parameterize the account names and keys at deploy time via PowerShell variables or a config file that is not committed to git.

This pattern — JSON in git, secrets injected at deploy time — is the foundation you need before ADF gets its own secrets management story. Right now there is no Key Vault integration. Connection strings go in the JSON. Keep them out of source control.
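
One way to inject secrets at deploy time is plain token substitution. A minimal sketch under stated assumptions: the committed linked service JSON holds the placeholder tokens __ACCOUNT_NAME__ and __ACCOUNT_KEY__ instead of real values, and an untracked file supplies the real ones. The token names, the config path, and the AccountName and AccountKey property names are all hypothetical conventions, not anything ADF defines.

```powershell
$ResourceGroup   = "myproject-dev-rg"
$DataFactoryName = "myproject-dev-adf"

# Hypothetical paths: a committed template with tokens, and a git-ignored
# file shaped like { "AccountName": "...", "AccountKey": "..." }.
$TemplatePath = ".\linkedservices\AzureStorageLinkedService.json"
$SecretsPath  = ".\config\dev.secrets.json"

$Secrets = Get-Content -Path $SecretsPath -Raw | ConvertFrom-Json

# Literal (non-regex) replacement, so characters in the key need no escaping.
$Rendered = Get-Content -Path $TemplatePath -Raw
$Rendered = $Rendered.Replace("__ACCOUNT_NAME__", $Secrets.AccountName)
$Rendered = $Rendered.Replace("__ACCOUNT_KEY__", $Secrets.AccountKey)

# Write the rendered JSON to a temp file, deploy it, then always delete it.
$TempFile = [System.IO.Path]::GetTempFileName()
try {
    Set-Content -Path $TempFile -Value $Rendered
    New-AzureRmDataFactoryLinkedService `
      -ResourceGroupName $ResourceGroup `
      -DataFactoryName $DataFactoryName `
      -File $TempFile `
      -Force
} finally {
    Remove-Item -Path $TempFile -Force
}
```

Switching environments then means pointing $SecretsPath and the factory variables at prod while the template stays identical. As with the deploy example earlier, this relies on the cmdlet taking the linked service name from the JSON rather than from the file name.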

Next post: a deep dive on linked services and the connector landscape. If you hit a wall during setup, I'm here to help.

Shannon Lowder
