ADF Connector Update 2016: What You Can Actually Connect To Now

When ADF shipped in 2014, "connector library" was a charitable way to describe what it offered. There were enough sources to make a convincing demo. There weren't enough to handle real enterprise integration scenarios without falling back to custom code on a significant fraction of your workloads.

Two years later, that has changed. Let me walk through where the connector library stands in 2016, what I actually use in production, and what's still missing.

Sources I Use Regularly

Azure Blob Storage

The foundational connector. Reads delimited files, JSON, ORC, Parquet, Avro. Handles compression (gzip, bzip2, zip). Supports wildcard path patterns for reading partitioned data. This is the connector you use constantly — for landing zone patterns (dump files here, ADF picks them up), for staging (write intermediate results to Blob, read them in the next stage), and for any file-based source.

Azure Data Lake Store

ADLS Gen1 is increasingly the preferred landing zone for analytics workloads. The ADF connector handles it cleanly. Service principal authentication works and is the right approach; user credential (OAuth) auth is convenient for development and wrong for production.

Azure SQL Database and Azure SQL Data Warehouse

These work well. The DW connector supports PolyBase for bulk loads, which is the right approach for any significant data volume. Direct INSERT loading into DW is slow at scale, so use PolyBase. The connector handles the staging automatically: it writes the data to Blob storage, then loads it into DW via PolyBase. Configure it with allowPolyBase: true in the copy activity sink and it does the right thing.
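To make the PolyBase configuration concrete, here is a minimal sketch that builds the relevant part of a copy activity as a Python dict and prints it as JSON. The activity name and the staging linked service name are hypothetical, the inputs, outputs, and policy sections are left out, and the property names reflect my understanding of the ADF v1 copy activity schema, so check them against the current documentation before relying on them.

```python
import json


def polybase_copy_activity(name, staging_linked_service):
    """Build the core of a copy activity that loads SQL DW via PolyBase.

    Both arguments are hypothetical names chosen by the caller.
    """
    return {
        "name": name,
        "type": "Copy",
        "typeProperties": {
            "source": {"type": "SqlSource"},
            # allowPolyBase is the sink switch discussed in the text.
            "sink": {"type": "SqlDWSink", "allowPolyBase": True},
            # Staged copy: land the data in Blob first, then load it
            # into DW with PolyBase.
            "enableStaging": True,
            "stagingSettings": {"linkedServiceName": staging_linked_service},
        },
    }


activity = polybase_copy_activity("LoadFactSales", "StagingBlobLinkedService")
print(json.dumps(activity, indent=2))
```

Generating the JSON from a small function like this is also a cheap way to keep a dozen near-identical load activities consistent.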

SQL Server via Self-Hosted IR (Data Management Gateway)

Still the workhorse for on-premises SQL Server integration. After the initial setup pain (installing the gateway on a Windows Server VM, registering it with ADF, configuring the linked service), it's reliable. Use a SQL query source with a WHERE clause instead of a table source, so the filtering happens on the server and you avoid pulling the entire table across the wire.
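As an illustration of the query-source approach, the sketch below builds a source definition whose query only selects rows inside the current slice window. The table, column names, and helper function are hypothetical, and the $$Text.Format expression with WindowStart and WindowEnd is written from my recollection of the ADF v1 expression syntax, so treat it as an assumption to verify.

```python
def windowed_sql_source(table, columns, ts_column):
    """Return a SqlSource dict that filters on the slice window.

    The query is wrapped in an ADF-style $$Text.Format expression so the
    service substitutes WindowStart and WindowEnd at run time.
    """
    select = (
        "select {cols} from {table} "
        "where {ts} >= \\'{{0:yyyy-MM-dd HH:mm}}\\' "
        "and {ts} < \\'{{1:yyyy-MM-dd HH:mm}}\\'"
    ).format(cols=", ".join(columns), table=table, ts=ts_column)
    return {
        "type": "SqlSource",
        "sqlReaderQuery": "$$Text.Format('" + select + "', WindowStart, WindowEnd)",
    }


source = windowed_sql_source("dbo.Orders", ["OrderId", "Amount"], "ModifiedAt")
print(source["sqlReaderQuery"])
```

The point of the pattern is that SQL Server does the filtering, so only the rows for one slice ever leave the building.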

Oracle via DMG

Works. Requires the Oracle Data Access Client installed on the gateway VM. The dependency management is manual and underdocumented — you find out about the ODAC requirement by failing, not by reading the docs. Once installed, it's stable. Performance depends heavily on the query you send — Oracle query optimization matters before the data hits the network.

Salesforce

The Salesforce connector uses SOQL (Salesforce Object Query Language) as the query language. You write SOQL, ADF executes it against the Salesforce API, you get data back. Works for standard objects and custom objects. API rate limiting is a real concern for large extracts — ADF doesn't handle Salesforce's rate limits gracefully, so you need to be deliberate about concurrency and slice sizing.
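One way to be deliberate about slice sizing is to break a large extract into bounded time windows, one SOQL query per window. The sketch below shows that idea only; the object, fields, choice of LastModifiedDate as the filter column, and the one-day step are all assumptions for illustration, and it does not call Salesforce or ADF.

```python
from datetime import datetime, timedelta


def soql_windows(sobject, fields, start, end, step):
    """Yield one SOQL query per half-open [lower, upper) time window."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # SOQL dateTime literals are unquoted
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield (
            "SELECT {f} FROM {o} "
            "WHERE LastModifiedDate >= {lo} AND LastModifiedDate < {hi}"
        ).format(
            f=", ".join(fields),
            o=sobject,
            lo=cursor.strftime(fmt),
            hi=upper.strftime(fmt),
        )
        cursor = upper


queries = list(
    soql_windows(
        "Account",
        ["Id", "Name"],
        datetime(2016, 1, 1),
        datetime(2016, 1, 4),
        timedelta(days=1),
    )
)
for q in queries:
    print(q)
```

Smaller windows mean more slices but fewer records per call, and running those slices one at a time rather than in parallel is what keeps you under the API limits.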

FTP and SFTP

File transfer protocol connectors. They work. SFTP supports key-based authentication. For vendor file drops, these are often the right answer. The main limitation: you're reading files as-is — ADF doesn't do any parsing on the FTP server side. The file comes across the wire, then ADF reads it as a blob source.

New Additions Worth Knowing

Amazon S3

Cross-cloud is interesting. ADF can now read from S3 and write to Azure storage. For customers migrating from AWS to Azure, or for organizations running hybrid multi-cloud workloads, this closes a gap that previously required custom code or a third-party tool. The connector uses S3 access key authentication. It's not as feature-rich as the native Azure Blob connector, but for straightforward file copy scenarios it works.

PostgreSQL

Via the DMG. Requires the Npgsql driver on the gateway VM. Same pattern as the Oracle connector — install the driver, configure the linked service, write queries. PostgreSQL is increasingly common in the clients I work with, particularly in organizations that have standardized on open-source databases. This connector was a real gap in 2014.

DB2 and Sybase

Enterprise database connectors that matter for clients with legacy mainframe-adjacent systems. DB2 shows up constantly in financial services and insurance clients. The connector works via the DMG with the appropriate IBM driver installed. Not glamorous, but closing these gaps means ADF is a viable integration platform for the systems that actually run enterprise operations.

Sinks: Where Data Can Land

The sink story mirrors the source story. The supported sinks are Azure Blob, ADLS, Azure SQL, Azure SQL Data Warehouse (via PolyBase, and you should use it), SQL Server via DMG, and DocumentDB (now Cosmos DB). The key addition in this category: Azure SQL Data Warehouse via PolyBase is now the documented best practice and the connector supports it directly. Loading 100M rows into DW via PolyBase takes minutes. Via direct INSERT it takes hours. Use PolyBase.

What's Still Missing

SAP. The SAP ecosystem — SAP HANA, SAP BW, SAP ERP — is not natively connected in ADF v1. For clients with SAP systems, which includes most large manufacturers, retailers, and utilities I work with, this is a significant gap. You work around it by extracting data from SAP via a different mechanism (SAP Data Services, custom ABAP extraction, RFC calls) and landing it in a location ADF can reach. It works. It's not clean.

Complex REST APIs. ADF has an HTTP connector for simple REST endpoints. For APIs that require OAuth flows, custom header construction, pagination, or response transformation, you're writing a Custom Activity. The connector handles simple GET-and-parse scenarios; anything more complex requires code.
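To show the kind of logic that ends up in custom code, here is a minimal sketch of a pagination loop. ADF v1 Custom Activities are written in .NET, so this Python version illustrates the control flow only; the "items" and "next" response fields, the example URLs, and the injected fetch_page function are all hypothetical.

```python
def fetch_all_pages(fetch_page, first_url, max_pages=1000):
    """Follow 'next' links until the API stops returning one.

    fetch_page is any callable that takes a URL and returns a dict with
    an 'items' list and an optional 'next' URL (an assumed response shape).
    max_pages guards against an API that never terminates.
    """
    records = []
    url = first_url
    pages = 0
    while url is not None and pages < max_pages:
        page = fetch_page(url)
        records.extend(page.get("items", []))
        url = page.get("next")
        pages += 1
    return records


# A stand-in for a real HTTP client, so the sketch runs without a network.
FAKE_API = {
    "https://api.example.com/orders?page=1": {
        "items": [1, 2],
        "next": "https://api.example.com/orders?page=2",
    },
    "https://api.example.com/orders?page=2": {"items": [3], "next": None},
}

all_records = fetch_all_pages(FAKE_API.__getitem__, "https://api.example.com/orders?page=1")
print(all_records)
```

In a real activity the fetch function would also carry the OAuth token and custom headers, which is exactly the part the HTTP connector cannot do for you.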

Network file shares. Reading from \\server\share\path\file.csv requires the DMG installed on a machine that can see the share. This works, but it means you can't use the cloud-hosted Azure IR for these sources; you're always touching on-premises infrastructure.

Bottom Line

The ADF connector library in 2016 is legitimately good for most enterprise integration scenarios. Two years ago the connector gaps were a limiting factor. Today, for most clients I work with, ADF can connect to every source and sink in scope without custom code. SAP is the notable exception.

The connector library is no longer the reason not to use ADF. The reasons not to use ADF in 2016 are the parameterization gap, the monitoring gaps, and the git story. Those are architecture and platform maturity issues, not connector issues. I've been writing about them for two years and I'll keep writing about them until they're fixed. If you're evaluating ADF for a new project and want to talk through the connector story for your specific sources, I'm here to help.