The number one question I get from engineers new to ADF is some variation of: "I set up the pipeline, it says it's running, but no data moved. What's happening?" Nine times out of ten, the answer involves the slice model. Nine times out of ten, the engineer had no idea the slice model existed until that moment.
The ADF v1 scheduling model is genuinely different from anything most data engineers have worked with before. It's not SQL Agent. It's not a cron job. It's not "run this pipeline at this time." Understanding it properly takes about thirty minutes. Not understanding it properly costs you hours of debugging and a persistent vague discomfort with a tool you think you should have figured out by now.
Let's dig into this.
The Core Concept: Slices
In ADF v1, everything centers on datasets, and datasets produce or consume data in time slices. A dataset has an availability section that defines how frequently slices are generated:
"availability": {
    "frequency": "Hour",
    "interval": 1
}
This means the dataset generates one slice per hour. Each slice represents one hour's worth of data. The slice has a start time and an end time. A daily dataset has 24-hour slices. A 15-minute dataset has slices every quarter hour.
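To make the slice arithmetic concrete, here is a minimal Python sketch that enumerates slice windows for a given frequency and interval. The slices helper and the UNIT table are my own illustration, not anything ADF exposes, and it only covers fixed-length frequencies.

```python
from datetime import datetime, timedelta

# Hypothetical lookup from an ADF v1 "frequency" value to a time unit.
# "Month" is left out on purpose because months are not fixed-length.
UNIT = {
    "Minute": timedelta(minutes=1),
    "Hour": timedelta(hours=1),
    "Day": timedelta(days=1),
    "Week": timedelta(weeks=1),
}

def slices(frequency, interval, start, end):
    """Yield (slice_start, slice_end) windows that fit inside [start, end)."""
    step = UNIT[frequency] * interval
    current = start
    while current + step <= end:
        yield current, current + step
        current += step

day_start = datetime(2016, 1, 1)
day_end = datetime(2016, 1, 2)

hourly = list(slices("Hour", 1, day_start, day_end))
quarter_hourly = list(slices("Minute", 15, day_start, day_end))

print(len(hourly))          # hourly slices in one day
print(len(quarter_hourly))  # 15-minute slices in one day
print(hourly[0])            # first window: midnight to 1am
```

The point is that a slice is a window with a start and an end, not a point in time. Everything else in the scheduling model hangs off those windows.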
A pipeline's activity reads from input dataset slices and writes to output dataset slices. The activity runs when its input slices are ready. "Ready" means the input dataset has data available for that time window.
Simple, right? Here's where it gets interesting.
The Pipeline Active Period
Pipelines have a start date and end date (their "active period"). The pipeline only processes slices that fall within that active period. If your dataset generates hourly slices and your pipeline active period starts at 2016-01-01T00:00:00Z, ADF will try to process every hourly slice from that point forward.
This is a common trap: you set your pipeline active period to a date in the past while testing, and ADF immediately tries to run every slice from that date to now, potentially hundreds of slices queued up simultaneously. Your concurrency settings determine how many run in parallel. If concurrency is set to 1 (the default), those slices run one at a time, and you spend the next several hours wondering why your pipeline appears to be running but never makes forward progress.
"policy": {
    "concurrency": 4,
    "executionPriorityOrder": "NewestFirst",
    "retry": 3,
    "timeout": "01:00:00"
}
Set executionPriorityOrder to NewestFirst if you need recent slices processed before backfill slices. Set it to OldestFirst (the default) if your pipeline has dependencies that require historical data to be processed in order.
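A rough way to see why a backdated active period hurts is to count the backlog and estimate how long it takes to drain. This is a back-of-the-envelope Python sketch, not a model of ADF's scheduler: the function names are mine, and the five minutes per slice is an assumed figure.

```python
import math
from datetime import datetime, timedelta

def backlog(active_start, now, slice_length):
    """Start times of every complete slice between active_start and now."""
    count = (now - active_start) // slice_length
    return [active_start + i * slice_length for i in range(count)]

def execution_order(pending, priority="OldestFirst"):
    """Order in which pending slices get picked up under each setting."""
    return sorted(pending, reverse=(priority == "NewestFirst"))

def drain_hours(pending_count, concurrency, minutes_per_slice):
    """Rough hours to clear the backlog if every slice takes the same time."""
    waves = math.ceil(pending_count / concurrency)
    return waves * minutes_per_slice / 60

# Active period backdated two weeks on an hourly dataset.
pending = backlog(datetime(2016, 1, 1), datetime(2016, 1, 15), timedelta(hours=1))

print(len(pending))
print(execution_order(pending, "OldestFirst")[0])
print(execution_order(pending, "NewestFirst")[0])
print(drain_hours(len(pending), concurrency=1, minutes_per_slice=5))
print(drain_hours(len(pending), concurrency=4, minutes_per_slice=5))
```

Under those assumptions, two weeks of hourly slices is 336 slices. At concurrency 1 and five minutes each, that is 28 hours before the current slice runs with OldestFirst; at concurrency 4 it drops to 7 hours.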
Retry Behavior When Slices Fail
When a slice fails, ADF schedules a retry if the policy allows one. The number of retries is configured in the policy section; a pause between rounds of retries comes from the longRetry settings. After exhausting retries, the slice stays in Failed state. Failed slices don't block subsequent slices from running by default, because ADF queues slices independently.
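There's also a longRetry setting for slices that fail due to external system unavailability:
"policy": {
    "retry": 3,
    "timeout": "01:00:00",
    "longRetry": 2,
    "longRetryInterval": "01:00:00"
}
longRetry and longRetryInterval control what happens after a round of normal retries is exhausted. Per the v1 policy documentation, each long retry round includes the normal retry attempts, so the maximum number of attempts is retry × longRetry, and the default longRetry of 1 means no extra round. With the above: fail 3 times, wait 1 hour, then make another set of 3 attempts before the slice is marked Failed. Useful for pipelines that depend on source systems with known maintenance windows.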
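Here is a small Python sketch of the attempt timeline. It assumes the reading of the v1 policy documentation in which each long retry round repeats the normal retry attempts, so the ceiling is retry × longRetry. The attempt_offsets function and the five-minute attempt duration are hypothetical, purely for illustration.

```python
from datetime import timedelta

def attempt_offsets(retry, long_retry, long_retry_interval, attempt_duration):
    """Offsets from the first attempt at which each attempt starts,
    assuming every attempt fails after attempt_duration."""
    offsets = []
    clock = timedelta(0)
    for round_index in range(long_retry):
        if round_index > 0:
            clock += long_retry_interval  # pause between rounds
        for _ in range(retry):
            offsets.append(clock)
            clock += attempt_duration     # attempts in a round run back to back
    return offsets

offsets = attempt_offsets(
    retry=3,
    long_retry=2,
    long_retry_interval=timedelta(hours=1),
    attempt_duration=timedelta(minutes=5),  # assumed time for an attempt to fail
)

for offset in offsets:
    print(offset)
```

With retry 3 and longRetry 2 this gives six attempts in two bursts, with the second burst starting one longRetryInterval after the first burst finishes.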
External Data Availability Policies
For source datasets you don't control (an FTP drop from a vendor, an external API, a file that a partner system deposits), you can set the external flag on the dataset and configure an externalData policy with retryInterval, retryTimeout, and maximumRetry, plus an optional dataDelay if you want to postpone the first availability check:
"availability": {
    "frequency": "Day",
    "interval": 1
},
"external": true,
"policy": {
    "externalData": {
        "retryInterval": "00:01:00",
        "retryTimeout": "00:10:00",
        "maximumRetry": 3
    }
}
ADF polls the external dataset to check for data availability. When the data arrives, the slice becomes Ready and the pipeline runs. This is how you handle "run when the file shows up" without polling in your own code.
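To show how retryInterval and maximumRetry bound the polling window, here is a toy Python model. It assumes maximumRetry is the total number of availability checks and that checks are spaced exactly retryInterval apart; ADF's real scheduling is not modeled, and first_successful_check is a name I made up.

```python
from datetime import datetime, timedelta

def first_successful_check(first_check, arrival, retry_interval, maximum_retry):
    """Return the time of the check that finds the data, or None if every
    check happens before the data arrives (or the data never arrives)."""
    for attempt in range(maximum_retry):
        check_time = first_check + attempt * retry_interval
        if arrival is not None and arrival <= check_time:
            return check_time
    return None

first_check = datetime(2016, 1, 1, 6, 0)
interval = timedelta(minutes=1)

# File lands two minutes after the first check: the third check finds it.
on_time = first_successful_check(first_check, datetime(2016, 1, 1, 6, 2), interval, 3)

# File lands at noon: all three checks are long over by then.
late = first_successful_check(first_check, datetime(2016, 1, 1, 12, 0), interval, 3)

print(on_time)
print(late)
```

With a one-minute interval and three checks, the polling window is only a couple of minutes wide. Anything later than that has to be covered by the retry settings or by hand.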
A Concrete Scenario
Daily ingest pipeline that reads from a vendor FTP server. The vendor deposits a file by 6am each morning. Sometimes the vendor is late. Occasionally the vendor misses a day entirely.
Configuration: source dataset is external with daily frequency. Pipeline active period starts at a fixed historical date. Policy has retry 3, timeout 2 hours, longRetry 2, longRetryInterval 4 hours.
Normal day: file arrives at 5:45am, ADF detects it, pipeline runs, slice completes by 6:30am.
Late file day: file doesn't arrive until 9:30am. ADF checks for the slice on schedule, doesn't find data, marks the slice as waiting. Retries three times over 10 minutes. Falls into long retry and waits 4 hours. The file is there when the next round of attempts starts shortly after 10am. Slice runs. Completes by about 10:45am. A file that shows up after that last round has started and failed would not be picked up; the slice ends up Failed, just like a missed day.
Missed day: vendor doesn't deposit the file at all. ADF retries according to policy, exhausts retries, marks slice as Failed. The next day's slice runs normally — ADF doesn't block day 2 because day 1 failed. You have a failed slice in the monitoring view that you need to handle manually when the vendor eventually provides the missing data.
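This is the default behavior. If you need day 2 to wait for day 1 (sequential dependency), you have to model it as an explicit dependency, for example by listing the previous day's output slice as an additional input to the activity. waitOnExternal is not the tool for this; it is the older syntax for marking a dataset as external. Most of the time you don't want sequential blocking anyway: you want forward progress even when historical slices are outstanding.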
Comparing to What's Coming in v2
ADF v2 replaces the slice model with explicit triggers. A schedule trigger says "run this pipeline at 6am every day." A tumbling window trigger says "run this pipeline for each 1-hour window, carrying the window start and end times as parameters." An event trigger says "run this pipeline when a blob lands in this storage container."
The v2 model is more explicit and easier to explain to people who've never used ADF before. The v1 slice model is more powerful for certain dependency patterns — it naturally handles the "run when input is ready, regardless of when that is" case without custom code. Whether v2 triggers cover all those cases gracefully, I'll find out as I get more time with the preview.
For now: if you're running v1 and the scheduling behavior confuses you, internalize the slice model. It's worth the thirty minutes. Once it clicks, it's elegant. It just doesn't click without explanation. If you're starting fresh with ADF, start with v2: the trigger model is more intuitive for most workloads.