ForEach is here.
I've been waiting for this activity since I first tried to build a metadata-driven framework in ADF in 2015. Parameterization arrived in v2 last year and gave us half the framework. ForEach gives us the other half: the ability to iterate over a set of items and execute inner activities once per item. With Lookup + ForEach + parameterized inner pipelines, you can build a proper metadata-driven ingest framework entirely within ADF v2.
Let me walk through exactly how it works and what the complete framework looks like.
What ForEach Does
The ForEach activity takes an array of items and executes a set of inner activities for each item. The array comes from a pipeline parameter, from a Lookup Activity output, or from an expression that produces an array. Inner activities reference the current item via @item().
{
  "name": "ForEachTable",
  "type": "ForEach",
  "dependsOn": [{ "activity": "LookupConfigTable", "dependencyConditions": ["Succeeded"] }],
  "typeProperties": {
    "items": {
      "value": "@activity('LookupConfigTable').output.value",
      "type": "Expression"
    },
    "isSequential": false,
    "batchCount": 10,
    "activities": [
      {
        "name": "CopyData",
        "type": "Copy",
        "typeProperties": {
          "source": {
            "type": "SqlSource",
            "sqlReaderQuery": {
              "value": "@concat('SELECT * FROM [dbo].[', item().SourceTable, '] WHERE [', item().WatermarkColumn, '] > ''', item().LastWatermark, '''')",
              "type": "Expression"
            }
          },
          "sink": {
            "type": "AzureDataLakeStoreSink",
            "folderPath": {
              "value": "@concat('raw/', item().SourceSystem, '/', formatDateTime(utcnow(), 'yyyy/MM/dd'))",
              "type": "Expression"
            },
            "fileName": {
              "value": "@concat(item().SourceTable, '_', formatDateTime(utcnow(), 'yyyyMMddHHmm'), '.parquet')",
              "type": "Expression"
            }
          }
        }
      }
    ]
  }
}
The Lookup Activity that feeds ForEach reads the config table and returns all rows as an array:
{
  "name": "LookupConfigTable",
  "type": "Lookup",
  "typeProperties": {
    "source": {
      "type": "SqlSource",
      "sqlReaderQuery": "SELECT SourceSystem, SourceTable, WatermarkColumn, LastWatermark FROM dbo.IngestConfig WHERE IsActive = 1"
    },
    "firstRowOnly": false
  }
}
Note "firstRowOnly": false. It's required to get all rows back as an array rather than just the first row, and because the property defaults to true, leaving it out silently gives you a single row. Easy to miss, common mistake.
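For reference, here's roughly what the Lookup output looks like with firstRowOnly set to false. The row values are made up for illustration, and the real output carries a few extra runtime fields that I've left out:

```json
{
  "count": 2,
  "value": [
    {
      "SourceSystem": "erp",
      "SourceTable": "Orders",
      "WatermarkColumn": "ModifiedDate",
      "LastWatermark": "2018-03-01T00:00:00Z"
    },
    {
      "SourceSystem": "erp",
      "SourceTable": "Customers",
      "WatermarkColumn": "ModifiedDate",
      "LastWatermark": "2018-03-01T00:00:00Z"
    }
  ]
}
```

ForEach iterates over output.value, and each @item() is one of those objects, so item().SourceTable is "Orders" on the first iteration. With firstRowOnly set to true instead, the output holds a single firstRow object and no value array, so the items expression has nothing to iterate over.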
The Complete Framework
Here's the full metadata-driven ingest pipeline with all activities wired together:
Activity 1 — LookupConfigTable: reads all active rows from the config table. Returns an array of objects with SourceSystem, SourceTable, WatermarkColumn, LastWatermark fields.
Activity 2 — ForEachTable: iterates over the config rows. For each row, executes Activities 2a and 2b in sequence.
Activity 2a — CopyData: copies incremental data from the source table to ADLS using expressions built from @item() fields. Source query filters by watermark column and last watermark value. Sink path is partitioned by date.
Activity 2b — UpdateWatermark: on success of CopyData, updates the LastWatermark value in the config table for this row. The new watermark value comes from @activity('CopyData').output.rowsCopied or from the window end time depending on whether you're using row-count or timestamp watermarking.
{
  "name": "UpdateWatermark",
  "type": "SqlServerStoredProcedure",
  "dependsOn": [{ "activity": "CopyData", "dependencyConditions": ["Succeeded"] }],
  "typeProperties": {
    "storedProcedureName": "usp_UpdateIngestWatermark",
    "storedProcedureParameters": {
      "SourceTable": { "value": "@item().SourceTable", "type": "String" },
      "NewWatermark": { "value": "@formatDateTime(trigger().scheduledTime, 'yyyy-MM-dd HH:mm:ss')", "type": "String" }
    }
  }
}
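To make the wiring explicit, here's a skeleton of the whole pipeline showing only the nesting and the dependsOn chain. Treat it as a sketch: the pipeline name MetadataDrivenIngest is a placeholder I made up, and I've left out every activity's typeProperties, datasets and linked services, so this is not deployable as written.

```json
{
  "name": "MetadataDrivenIngest",
  "properties": {
    "activities": [
      {
        "name": "LookupConfigTable",
        "type": "Lookup"
      },
      {
        "name": "ForEachTable",
        "type": "ForEach",
        "dependsOn": [{ "activity": "LookupConfigTable", "dependencyConditions": ["Succeeded"] }],
        "typeProperties": {
          "items": {
            "value": "@activity('LookupConfigTable').output.value",
            "type": "Expression"
          },
          "activities": [
            {
              "name": "CopyData",
              "type": "Copy"
            },
            {
              "name": "UpdateWatermark",
              "type": "SqlServerStoredProcedure",
              "dependsOn": [{ "activity": "CopyData", "dependencyConditions": ["Succeeded"] }]
            }
          ]
        }
      }
    ]
  }
}
```

The two dependsOn entries work at different levels. The outer one makes ForEach wait for the Lookup. The inner one is evaluated separately for each item, so a table's watermark only moves when that table's copy succeeded.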
The Production Constraints You Need to Know
Default parallel execution limit is 20. ForEach runs inner activities in parallel by default (isSequential is false), with at most batchCount iterations in flight at once. batchCount defaults to 20 and its maximum is 50. If your config table has 200 tables and you set batchCount to 50, no more than 50 copies run at any one time until all 200 are done. This is a hard limit: you can't set batchCount above 50. If you need more than 50 concurrent executions, you need to split your workload across multiple ForEach activities or multiple pipelines.
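One way to do that split, sketched under assumptions: add a BatchGroup column to the config table and a BatchGroup pipeline parameter, then filter the Lookup on it. Both names are hypothetical; neither is part of the config schema I've shown so far.

```json
{
  "name": "LookupConfigTable",
  "type": "Lookup",
  "typeProperties": {
    "source": {
      "type": "SqlSource",
      "sqlReaderQuery": {
        "value": "@concat('SELECT SourceSystem, SourceTable, WatermarkColumn, LastWatermark FROM dbo.IngestConfig WHERE IsActive = 1 AND BatchGroup = ', string(pipeline().parameters.BatchGroup))",
        "type": "Expression"
      }
    },
    "firstRowOnly": false
  }
}
```

Trigger the same pipeline once per group (BatchGroup = 1, BatchGroup = 2, and so on) and each run gets its own ForEach with its own batchCount ceiling. Whether your source system can sustain that many concurrent reads is a separate question.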
No nested ForEach. You cannot put a ForEach inside a ForEach. If your framework has a two-level hierarchy (databases → tables), you need to flatten it before passing to ForEach, or use Execute Pipeline activity inside ForEach to call a child pipeline that contains its own ForEach.
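Here's a minimal sketch of the Execute Pipeline variant. The names LookupDatabases, IngestTablesForDatabase and DatabaseName are placeholders for illustration, and I'm assuming the outer Lookup returns one row per database with a DatabaseName column.

```json
{
  "name": "ForEachDatabase",
  "type": "ForEach",
  "dependsOn": [{ "activity": "LookupDatabases", "dependencyConditions": ["Succeeded"] }],
  "typeProperties": {
    "items": {
      "value": "@activity('LookupDatabases').output.value",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "RunTablesForDatabase",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": {
            "referenceName": "IngestTablesForDatabase",
            "type": "PipelineReference"
          },
          "parameters": {
            "DatabaseName": { "value": "@item().DatabaseName", "type": "Expression" }
          },
          "waitOnCompletion": true
        }
      }
    ]
  }
}
```

The child pipeline declares a DatabaseName parameter, uses it to filter its own Lookup, and runs its own ForEach over the tables. With waitOnCompletion set to true, each outer iteration waits for its child run and reflects the child's result. Keep in mind that the two levels of parallelism multiply, so you may want a small batchCount on the outer loop.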
One failed item doesn't stop the others. If one ForEach item fails, the other items continue executing. The failed item's activities report failure; the successful items' activities report success. The ForEach activity itself reports failure if any item fails. This is the right behavior for most ingest scenarios: you don't want one table's failure to block the remaining 199 tables.
If you need all-or-nothing behavior (stop or roll back the whole run as soon as any item fails), ForEach won't do that for you. You have to react to the ForEach result in a subsequent activity and handle it there. But for ingest pipelines, continue-on-item-failure is almost always what you want.
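As a sketch of what "handle it in a subsequent activity" can look like, this activity runs only when ForEachTable reports failure. The stored procedure usp_LogIngestFailure and its PipelineRunId parameter are hypothetical; swap in whatever alerting or cleanup you actually need. As with my other snippets, the linked service reference is omitted.

```json
{
  "name": "LogIngestFailure",
  "type": "SqlServerStoredProcedure",
  "dependsOn": [{ "activity": "ForEachTable", "dependencyConditions": ["Failed"] }],
  "typeProperties": {
    "storedProcedureName": "usp_LogIngestFailure",
    "storedProcedureParameters": {
      "PipelineRunId": { "value": "@pipeline().RunId", "type": "String" }
    }
  }
}
```

By the time this fires, every item has already been attempted, so it's a place to log, alert or compensate, not a way to stop the remaining tables.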
The Framework I've Been Waiting to Build
In 2015 I wrote about trying to build a metadata-driven framework in ADF v1 and hitting the parameterization wall. In late 2016 I wrote about parameterization arriving in v2 but ForEach being missing. Now ForEach is here, and the framework I wanted to build three years ago is fully expressible within ADF v2 without workarounds, without stored procedure loops, without external API calls.
One pipeline. A config table with one row per source table. Lookup reads the config. ForEach drives the copy and watermark update for each table. New table? Add a row to the config table. Remove a table? Set IsActive = 0. Schema change? Update the config row. The pipeline doesn't change.
Is it perfect? No. There's still no git integration. The batchCount limit of 50 is an artificial constraint that I'd rather not have. The monitoring of individual ForEach items requires digging into the ForEach activity detail. But the core pattern works, it's maintainable, and it's the right architecture for this problem.
Three years. We got there. If you're building out this pattern and want to compare notes on the config table schema, watermark strategy, or error handling design, I'm here to help.