Building a Metadata-Driven SSIS Generator: The Architecture

After inheriting the 200-package SSIS environment, the solution was clear: generate the packages programmatically from a metadata store rather than writing them by hand. The resulting system was one of the highest-leverage things I built during this period — adding a new source to the warehouse went from a half-day manual task to a metadata entry and a generation run. Here's how the architecture worked.

The Metadata Store Design

The metadata store was a SQL Server database (on Azure SQL by this point) with tables describing the data sources, transformations, and destinations. The core tables:

-- Source connection details
CREATE TABLE meta.DataSources (
    DataSourceId     INT           NOT NULL IDENTITY PRIMARY KEY,
    DataSourceName   NVARCHAR(100) NOT NULL,
    ConnectionString NVARCHAR(500) NOT NULL,
    DatabaseName     NVARCHAR(100) NOT NULL,
    IsActive         BIT           NOT NULL DEFAULT 1
);

-- Tables to extract from each source
CREATE TABLE meta.SourceTables (
    SourceTableId     INT           NOT NULL IDENTITY PRIMARY KEY,
    DataSourceId      INT           NOT NULL REFERENCES meta.DataSources,
    SchemaName        NVARCHAR(100) NOT NULL,
    TableName         NVARCHAR(100) NOT NULL,
    IncrementalColumn NVARCHAR(100) NULL,
    LoadType          NVARCHAR(20)  NOT NULL  -- 'Full' or 'Incremental'
);

-- Column-level mappings and transformations
CREATE TABLE meta.ColumnMappings (
    ColumnMappingId     INT           NOT NULL IDENTITY PRIMARY KEY,
    SourceTableId       INT           NOT NULL REFERENCES meta.SourceTables,
    SourceColumnName    NVARCHAR(100) NOT NULL,
    DestColumnName      NVARCHAR(100) NOT NULL,
    DataType            NVARCHAR(50)  NOT NULL,
    IsNullable          BIT           NOT NULL DEFAULT 1,
    TransformExpression NVARCHAR(500) NULL
);

The Generation Approach

SSIS packages are DTSX files — which are XML. The Microsoft.SqlServer.ManagedDTS library exposes a managed API for creating SSIS packages in code without touching XML directly.

The generator was a C# console application that queried the metadata store for all active source tables, then for each source table instantiated a Package object, applied standard control flow elements from a template, added the data flow task configured from the metadata, and saved the resulting DTSX file.

The Standard Control Flow Template

The key insight was the template object. Rather than writing code to add logging, error handling, and standard parameters to every generated package individually, a method applied a standard control flow template to any package:

private static void ApplyStandardTemplate(Package package, string packageName,
    string logConnectionString)
{
    // Standard package logging
    var logProvider = package.LogProviders.Add("DTS.LogProviderSQLServer.3");
    logProvider.ConfigString = logConnectionString;
    package.LoggingMode = DTSLoggingMode.Enabled;

    // Package-level variables for audit tracking
    package.Variables.Add("PackageName", false, "User", packageName);
    package.Variables.Add("LoadStartTime", false, "User", DateTime.MinValue);
    package.Variables.Add("RowsExtracted", false, "User", 0);
    package.Variables.Add("RowsLoaded", false, "User", 0);

    // Standard OnError event handler
    var errorHandler = package.EventHandlers.Add("OnError");
    // ... add email task to errorHandler
}

Every generated package got this template applied. Logging: consistent. Error handling: consistent. Audit variables: consistent. When we needed to change the logging schema six months later, we changed the template and regenerated — not 200 individual packages.

What Regeneration Meant

The packages were generated artifacts. They lived in source control but were generated from the metadata store, not authored by hand. When requirements changed — add the audit column, change the merge logic, update the error handler email address — the change was made once (in the metadata store or in the generator's template), and all affected packages were regenerated in a single run.

Adding a new source table: add a row to meta.SourceTables, add the column mappings to meta.ColumnMappings, run the generator. One new package, consistent with the other 200. That's the payoff. As always, I'm here to help.

Read more