ADF Monitoring in 2017: What Improved and What Didn't

ADF monitoring has been a persistent complaint of mine since 2014. The v1 monitoring view showed slice status — Ready, Failed, In Progress — and not much else. When a pipeline failed, you got an error message. When a pipeline was slow, you got nothing useful. You learned to tolerate that.

ADF v2 has made genuine improvements. It's also still missing things it shouldn't be missing in 2017. Let me give you the accurate picture.

What Improved in v2

Activity Run Input and Output JSON

This is the biggest improvement and it's genuinely useful. In v2, every activity run in the monitoring view shows the input JSON (what parameters and configuration the activity received) and the output JSON (what the activity produced). For debugging parameterized pipelines, this is essential.

When a pipeline fails and you need to know what parameter values were passed, you click the activity run, expand the input, and see the exact SQL query that was built from your expressions, the exact file path that was constructed, the exact parameter values that flowed in. In v1, you were guessing. In v2, you have the evidence.

// Example activity run output JSON
{
  "dataRead": 1073741824,
  "dataWritten": 524288000,
  "rowsCopied": 2847291,
  "copyDuration": 187,
  "throughput": 5737.68,
  "errors": [],
  "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US)",
  "usedDataIntegrationUnits": 4,
  "usedParallelCopies": 4
}

Rows copied, throughput, DIUs used, parallel copy count, integration runtime — all in the output. After a completed run, this tells you almost everything you need for performance analysis.
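If you want those figures in a form you can compare across runs, a minimal sketch follows. It parses the example output above and derives a few per-run numbers. The function name and the derived field names are mine, not part of ADF, and I'm assuming copyDuration is in seconds.

```python
import json

# Sample Copy Activity output, copied from the example above.
output_json = """
{
  "dataRead": 1073741824,
  "dataWritten": 524288000,
  "rowsCopied": 2847291,
  "copyDuration": 187,
  "throughput": 5737.68,
  "errors": [],
  "effectiveIntegrationRuntime": "DefaultIntegrationRuntime (East US)",
  "usedDataIntegrationUnits": 4,
  "usedParallelCopies": 4
}
"""


def summarize_copy_output(raw: str) -> dict:
    """Derive a few per-run figures from a Copy Activity output document."""
    out = json.loads(raw)
    duration = out["copyDuration"]  # assumed to be seconds
    rows = out["rowsCopied"]
    return {
        "mib_read": out["dataRead"] / (1024 * 1024),
        "mib_written": out["dataWritten"] / (1024 * 1024),
        # Guard against zero so an empty or instant copy doesn't raise.
        "rows_per_second": rows / duration if duration else None,
        "bytes_read_per_row": out["dataRead"] / rows if rows else None,
        "had_errors": bool(out["errors"]),
    }


summary = summarize_copy_output(output_json)
for key, value in summary.items():
    print(key, value)
```

Store these per run and you can spot a copy whose rows per second has drifted, which the monitoring view won't show you.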

Better Error Messages

V2 error messages include correlation IDs that link the ADF error to underlying service errors. When a Copy Activity fails with a SQL connection issue, the error message now includes enough context to find the corresponding error in SQL Database diagnostic logs. In v1, the error messages were frequently generic ("Activity failed") with no correlation ID and no path forward.

Correlation IDs aren't glamorous, but they save real time when you're debugging a 2am incident. "Activity failed" vs. "Activity failed, correlation ID: abc123, look in SQL Database logs for this ID" — the latter is the version you want.

Rerun from Monitor View

In v2, you can select a failed pipeline run from the monitoring view and rerun it directly — same parameters, same trigger context, new run ID. In v1, rerunning a failed slice required navigating multiple screens and understanding the slice model well enough to know which slice to re-enable.

For operations teams, this is meaningful. When a pipeline fails due to a transient issue (source database briefly unavailable, network hiccup), the recovery action is: wait for the transient condition to clear, go to monitor, click rerun. Under a minute of operational work instead of five minutes of navigation.

Trigger Run History

V2 maintains separate history views for trigger runs and pipeline runs. The trigger run view shows when each trigger fired, whether it successfully kicked off a pipeline, and if not, why. This separation is useful when debugging scheduling issues — you can verify that the trigger fired correctly and separately verify the pipeline run, rather than conflating the two.

What Didn't Improve

In-Progress Monitoring

This is the gap that costs me the most time. ADF still doesn't show row counts or byte counts for an active Copy Activity run. A copy that takes 30 minutes shows a spinner. That's it. You don't know if it's at 10% or 90%. You don't know if throughput has dropped. You don't know if it's about to fail or about to succeed.

The workaround I've been using: enable Azure Storage diagnostic logging and watch file creation events in Log Analytics for copies that go to Blob or ADLS. For SQL sink copies, I have a SQL query in a terminal window counting rows in the target table on a refresh loop. Both are manual and should not be necessary in 2017.
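The SQL-sink refresh loop can be sketched roughly like this. It's an illustration, not the exact script I run: watch_row_count and its parameters are hypothetical names, and count_rows stands in for whatever callable runs a SELECT COUNT(*) against your target table through your own database driver.

```python
import time
from typing import Callable, Iterator, Tuple


def watch_row_count(
    count_rows: Callable[[], int],
    interval_seconds: float = 30.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[int, int, float]]:
    """Yield (poll_number, row_count, rows_per_second) after each wait."""
    previous = count_rows()
    for poll in range(1, max_polls + 1):
        sleep(interval_seconds)
        current = count_rows()
        # Rate uses the nominal interval, not measured wall-clock time.
        rate = (current - previous) / interval_seconds
        yield poll, current, rate
        previous = current


# Stand-in for a real query against the sink table; values are made up.
fake_counts = iter([0, 3000, 6000, 6000])
readings = list(
    watch_row_count(
        lambda: next(fake_counts),
        interval_seconds=30.0,
        max_polls=3,
        sleep=lambda _seconds: None,  # skip real waiting in this demo
    )
)
for poll, rows, rate in readings:
    print(f"poll {poll}: {rows} rows, {rate:.1f} rows/s")
```

A rate that drops to zero while the activity still shows In Progress is the signal worth watching for. The count query itself puts load on the sink, so keep the interval generous on large tables.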

No Native Alerting

ADF has no built-in alerting. Pipeline failures don't send you an email. They don't send a Slack notification. Nothing. You either check the monitoring view manually or you build alerting yourself using Azure Monitor.

Azure Monitor integration works. Here's the pattern:

// Azure Monitor alert rule for ADF pipeline failures
// Resource: your ADF instance
// Signal: PipelineFailedRuns (metric)
// Condition: Count > 0, evaluation period 5 minutes
// Action group: email + webhook to Slack

The Azure Monitor setup is documented and works reliably. But it requires creating a Log Analytics workspace, configuring diagnostic settings on your ADF instance, creating alert rules, and configuring action groups. That's four to six steps that every ADF user who cares about production reliability needs to do, and none of it is integrated into the ADF portal. It should be.

Run History Retention

ADF retains pipeline run history for 45 days. After 45 days, the run is gone from the monitoring view. For most operational purposes, 45 days is sufficient. For audit purposes — demonstrating that a pipeline ran successfully on a specific date three months ago — it's not.

The solution I've built for clients with audit requirements: a stored procedure activity at the end of every pipeline logs run metadata to an Azure SQL table.

CREATE TABLE dbo.PipelineRunLog (
    LogId           INT IDENTITY(1,1) PRIMARY KEY,
    PipelineName    NVARCHAR(255),
    RunId           NVARCHAR(100),
    TriggerName     NVARCHAR(255),
    WindowStart     DATETIME2,
    WindowEnd       DATETIME2,
    RowsProcessed   BIGINT,
    Status          NVARCHAR(50),
    RunDate         DATETIME2 DEFAULT GETUTCDATE()
);

Pass the pipeline and trigger metadata as parameters and call the stored procedure at the end of the pipeline, using dependsOn with the Succeeded and Failed dependency conditions so that both outcomes get logged, and you have a permanent audit log. It's extra work, but it's a pattern that belongs in every production ADF implementation.
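As a rough sketch of what those logging activities might look like, the snippet below builds two Stored Procedure activity definitions, one per dependency condition, and prints them as JSON. Everything named here is an assumption for illustration: the upstream activity CopyData, the linked service AuditSqlDb, and the procedure dbo.usp_LogPipelineRun are hypothetical, and the property layout reflects my understanding of the v2 activity JSON, so check it against the current schema before using it. WindowStart and WindowEnd are left out because they depend on how your trigger passes its window into pipeline parameters.

```python
import json


def audit_log_activity(upstream: str, condition: str, status: str) -> dict:
    """Build a Stored Procedure activity that logs one outcome of `upstream`."""
    params = {
        "PipelineName": {"value": "@pipeline().Pipeline", "type": "String"},
        "RunId": {"value": "@pipeline().RunId", "type": "String"},
        "TriggerName": {"value": "@pipeline().TriggerName", "type": "String"},
        "Status": {"value": status, "type": "String"},
    }
    if condition == "Succeeded":
        # Row count is only read from the copy output on the success path.
        params["RowsProcessed"] = {
            "value": f"@activity('{upstream}').output.rowsCopied",
            "type": "Int64",
        }
    return {
        "name": f"LogRun{status}",
        "type": "SqlServerStoredProcedure",
        "dependsOn": [
            {"activity": upstream, "dependencyConditions": [condition]}
        ],
        # Hypothetical linked service pointing at the audit database.
        "linkedServiceName": {
            "referenceName": "AuditSqlDb",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            # Hypothetical procedure that inserts into dbo.PipelineRunLog.
            "storedProcedureName": "dbo.usp_LogPipelineRun",
            "storedProcedureParameters": params,
        },
    }


activities = [
    audit_log_activity("CopyData", "Succeeded", "Succeeded"),
    audit_log_activity("CopyData", "Failed", "Failed"),
]
print(json.dumps(activities, indent=2))
```

Using one activity per condition keeps the Status value a plain literal, at the cost of a second activity in every pipeline.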

The Monitoring Stack I Actually Use

In 2017, my production ADF monitoring setup:

  1. Azure Monitor alerts for pipeline failures → email + Slack webhook
  2. Custom audit log table in Azure SQL (stored procedure activity)
  3. Log Analytics workspace ingesting ADF diagnostic logs for correlation ID lookups
  4. A Power BI report querying the audit log table for operations dashboards

Items 2-4 are custom work that every production ADF customer is independently reinventing. Microsoft should build this into the platform. Until they do, build it yourself — the pattern is straightforward and the operational value is immediate. I'm here to help if you want the implementation details.
