ADF has 90+ connectors. Ninety connectors and you still occasionally hit a source that isn't covered, an API with authentication too complex for the HTTP connector, or transformation logic that doesn't fit any built-in activity. For those cases, ADF provides the Custom Activity.
The Custom Activity runs arbitrary code on Azure Batch. It's the escape hatch of last resort. Here's how it works, when to use it, and when to use something else instead.
How the Custom Activity Works
The Custom Activity requires an Azure Batch account. ADF submits the activity to Batch as a task on the pool named in the Batch linked service. Batch runs your code on that pool's VMs (starting nodes first if none are available) and returns the results. You pay for the Batch VMs while they're running.
In v1, your code was a .NET class implementing the IDotNetActivity interface, packaged as a zip file. In v2, it is any command the Batch node can run; the example here is a .NET console application. The code receives a dictionary of extended properties (parameters you pass from the pipeline), has access to any linked service credentials the pipeline makes available, and writes output to a location that subsequent activities can reference.
// Custom activity entry point structure for ADF v2
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Program
{
    static void Main(string[] args)
    {
        // ADF stages activity.json in the task's working directory on the Batch node.
        // Extended properties sit under typeProperties.extendedProperties.
        JObject activity = JObject.Parse(File.ReadAllText("activity.json"));
        JToken props = activity["typeProperties"]["extendedProperties"];
        string apiEndpoint = (string)props["ApiEndpoint"];
        string apiKey = (string)props["ApiKey"];
        string targetBlobPath = (string)props["TargetBlobPath"];

        // Do the work (CallComplexApi and WriteToBlob are your own helpers, not shown)
        var data = CallComplexApi(apiEndpoint, apiKey);
        WriteToBlob(data, targetBlobPath);

        // Output for subsequent activities: ADF picks up outputs.json from the
        // working directory and exposes it as the activity's customOutput
        var output = new { RowCount = data.Count, FilePath = targetBlobPath };
        File.WriteAllText("outputs.json", JsonConvert.SerializeObject(output));
    }
}
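Upload the compiled assembly and its dependencies to a folder in Azure Blob Storage, then reference that folder in the Custom Activity definition. In v2, ADF copies the files in folderPath to the Batch node, so the command can run them directly.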
{
    "name": "CallComplexAPI",
    "type": "Custom",
    "linkedServiceName": {
        "referenceName": "AzureBatchLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "command": "dotnet MyCustomActivity.dll",
        "resourceLinkedService": {
            "referenceName": "StorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "folderPath": "custom-activities/complex-api-activity",
        "extendedProperties": {
            "ApiEndpoint": "@pipeline().parameters.ApiEndpoint",
            "ApiKey": "@pipeline().parameters.ApiKey",
            "TargetBlobPath": {
                "value": "@concat('/landing/api-data/', formatDateTime(utcnow(), 'yyyyMMdd'), '/output.json')",
                "type": "Expression"
            }
        }
    }
}
Real Use Cases
OAuth 2.0 APIs with token refresh: The ADF HTTP connector handles simple bearer token auth. It doesn't handle OAuth flows that require token refresh. A Custom Activity can manage the full OAuth lifecycle — request a token, cache it, refresh when expired, make the API calls. I've built this for a client integrating with a SaaS platform that has an unusual OAuth implementation.
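The "request a token, cache it, refresh when expired" lifecycle can be sketched in a few lines. This is a minimal illustration in Python, not the client implementation mentioned above. TokenCache, fetch_token, and skew_seconds are hypothetical names, and fetch_token stands in for whatever HTTP call your identity provider actually requires.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and fetches a new one shortly before expiry."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
        skew_seconds: float = 60.0,
    ) -> None:
        # fetch_token is a hypothetical callable that returns
        # (access_token, lifetime_in_seconds) from the token endpoint.
        self._fetch_token = fetch_token
        self._clock = clock
        self._skew = skew_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, refreshing if it is missing or near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Every API call asks the cache for a token rather than holding one directly, so a long-running activity keeps working across token expiry. Injecting the clock is a design choice that makes the refresh logic testable without waiting an hour.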
REST APIs with cursor-based pagination: Many APIs return paginated results with a cursor or next-page token in the response. The ADF HTTP connector doesn't loop through pages. A Custom Activity can handle the pagination loop, accumulate results, and write the complete dataset to Blob.
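The pagination loop itself is small. Here is a rough Python sketch, assuming the API returns a JSON object with an "items" list and a "next_cursor" field that is empty or missing on the last page. Both field names are assumptions, as are iter_pages and fetch_page; real APIs vary.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_pages(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    max_pages: int = 10000,
) -> Iterator[Any]:
    """Yield every item from a cursor-paginated API.

    fetch_page is a hypothetical callable: given a cursor (None for the
    first page), it returns the parsed response for that page.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            return  # last page reached
    # Guard against an API that never stops handing out cursors.
    raise RuntimeError("pagination did not terminate within max_pages")
```

Because it is a generator, the activity can stream items into the output file instead of holding the whole dataset in memory. The max_pages guard keeps a misbehaving API from running the Batch VM (and the bill) indefinitely.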
Custom transformation logic: Occasionally there's transformation logic that doesn't fit a stored procedure (too complex, requires external library calls) and doesn't fit a Databricks notebook (team doesn't have Databricks). A Custom Activity running C# or Python (via command) can handle it. Note: for Python, you're running a command on the Batch VM, not a first-class Python activity. The Batch VM needs Python installed, which requires a custom VM image or a startup task.
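For the Python route, the script follows the same shape as the .NET entry point. The sketch below assumes ADF v2 stages activity.json in the task's working directory and reads outputs.json back from it, as the Custom Activity documentation describes. The function names and the placeholder work step are mine, not part of any ADF SDK.

```python
import json
from pathlib import Path
from typing import Any, Dict


def read_extended_properties(workdir: str = ".") -> Dict[str, Any]:
    """Load the extended properties ADF passed to this activity run."""
    text = Path(workdir, "activity.json").read_text(encoding="utf-8")
    return json.loads(text)["typeProperties"]["extendedProperties"]


def write_outputs(values: Dict[str, Any], workdir: str = ".") -> None:
    """Write values for downstream activities (surfaced by ADF as customOutput)."""
    Path(workdir, "outputs.json").write_text(json.dumps(values), encoding="utf-8")


def main(workdir: str = ".") -> Dict[str, Any]:
    props = read_extended_properties(workdir)
    # Placeholder for the real transformation; RowCount is hard-coded here.
    result = {"RowCount": 0, "FilePath": props["TargetBlobPath"]}
    write_outputs(result, workdir)
    return result
```

In a real script you would call main() at the bottom and set the activity's command to something like "python main.py", assuming Python is already on the Batch node.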
The Ceremony Cost
I want to be honest about how much overhead the Custom Activity adds. Compared to a built-in activity:
- You need an Azure Batch account: another resource to provision, manage, and monitor
- You need a storage account location for your activity files
- You need to build, package, and deploy those files as part of your CI/CD pipeline
- Batch VM start time adds latency (1-3 minutes when nodes have to start from scratch, seconds if using a warm pool)
- Debugging is harder: logs go to Batch task output, not ADF monitoring
- Cost: you're paying for Batch VMs on top of ADF activity execution costs
For a handful of lines of code — a simple REST call with custom headers, a one-off transformation — that's a lot of ceremony. For a complex API integration or a substantial transformation, it's justified.
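To put a number on the latency point, here is a back-of-the-envelope calculation in Python. The 2-minute startup is an assumed mid-point of the 1-3 minute range above, and the job durations are made up for illustration.

```python
def startup_overhead_share(startup_minutes: float, run_minutes: float) -> float:
    """Fraction of total wall-clock time spent waiting for the VM to start."""
    total = startup_minutes + run_minutes
    if total <= 0:
        raise ValueError("total duration must be positive")
    return startup_minutes / total


# Assumed figures: 2 minutes of cold-start latency in both cases.
short_job = startup_overhead_share(2.0, 0.5)   # 30-second REST call
long_job = startup_overhead_share(2.0, 58.0)   # hour-scale integration
print(f"short job: {short_job:.0%} overhead, long job: {long_job:.1%} overhead")
```

With those assumptions, the 30-second job spends most of its wall-clock time waiting for a VM, while the hour-scale job barely notices.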
When to Use Something Else
Azure Functions: For simple code that runs fast (seconds to minutes), an Azure Function is much lighter than Custom Activity. ADF v2 has a native Azure Function activity — pass parameters, call the function, get the response. No Batch account, no VM provisioning latency, consumption pricing. If the code runs in under 10 minutes and doesn't need the Batch VM environment, use Azure Functions.
Databricks notebook: ADF v2 has a Databricks notebook activity in preview. For Python-heavy processing, complex transformations, or work that benefits from Databricks' distributed compute, trigger a Databricks notebook from ADF rather than running Python in a Batch VM. Better tooling, better debugging, better scalability.
Stored Procedure Activity: For transformation logic that lives naturally in SQL Server, the stored procedure activity is simpler, faster, and cheaper than Custom Activity. If the logic can be expressed in T-SQL, put it in a stored procedure.
The Rule
Use Custom Activity when: the logic is genuinely complex (multi-step API interaction, stateful processing, external library requirements), the code runs long enough to justify Batch VM provisioning overhead, and none of the lighter alternatives (Azure Functions, Databricks, Stored Procedure) fit the scenario.
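If it helps to see the rule as a checklist, here is one way to write it down in Python. It is a deliberately simplified sketch: the Workload fields, the order of the checks, and the 10-minute cutoff (taken from the Azure Functions guidance above) are my own encoding of the rule, not an official decision procedure.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a piece of pipeline logic."""
    expressible_in_tsql: bool = False
    benefits_from_databricks: bool = False
    team_has_databricks: bool = False
    runtime_minutes: float = 1.0
    needs_batch_environment: bool = False


def choose_activity(w: Workload) -> str:
    """Try the lighter alternatives first; Custom Activity is the fallback."""
    if w.expressible_in_tsql:
        return "Stored Procedure Activity"
    if w.benefits_from_databricks and w.team_has_databricks:
        return "Databricks notebook"
    if w.runtime_minutes < 10 and not w.needs_batch_environment:
        return "Azure Function"
    return "Custom Activity"
```

For example, choose_activity(Workload(runtime_minutes=45, needs_batch_environment=True)) falls through every lighter option and lands on Custom Activity, which is exactly where it should be: last.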
Use Custom Activity as a last resort, not a first tool. If you reach for it often, that's a signal that your pipeline logic is trying to do too much — consider whether the processing belongs in ADF at all or whether it belongs in a compute layer that ADF orchestrates. I'm here to help think through the architecture if you're evaluating whether Custom Activity is the right answer for your scenario.