The ADLA Job That Did Nothing for Six Hours: Understanding U-SQL Execution Plans

Shannon Lowder

15 May 2015 — 3 min read

I need to tell you about the geocoding job. This is the story of a U-SQL job that ran all its vertices, consumed 200 Analytics Units for six hours, reported 100% completion, and produced a output file with zero rows. Not an error. Not a failure status. Success — and nothing in the output.

Understanding how this happened is one of the most useful things I can tell you about ADLA, because it reveals exactly how U-SQL execution works under the hood and why the monitoring story was inadequate in 2015.

The Job: Geocoding Storm Event Locations

The NOAA storm event data had county and state information but no lat/long coordinates. The goal was to geocode each storm event to a geographic point — merge the storm events with a county-level geocoding reference table to get coordinates for spatial analysis.

The U-SQL script looked straightforward:

@stormEvents =
    EXTRACT EventId int, State string, CountyFips string, ...
    FROM "/raw/noaa/storm-events/*.csv"
    USING Extractors.Csv(skipFirstNRows: 1, quoting: true);

@geocodes =
    EXTRACT CountyFips string, Latitude decimal, Longitude decimal
    FROM "/raw/reference/county-geocodes.csv"
    USING Extractors.Csv(skipFirstNRows: 1);

@geocoded =
    SELECT e.EventId,
           e.State,
           g.Latitude,
           g.Longitude
    FROM @stormEvents AS e
    JOIN @geocodes AS g
    ON e.CountyFips == g.CountyFips;

OUTPUT @geocoded
    TO "/curated/analytics/storm-events-geocoded.csv"
    USING Outputters.Csv(outputHeader: true);

Submitted. Ran for six hours. Output file: 0 bytes (just the header row).

The Actual Problem: JOIN Type Mismatch

The storm event data had CountyFips stored as a string. The geocode reference had CountyFips stored as a string. They looked like "37063" in the storm events and... also like "37063" in the geocodes? Run the job. Zero rows.

After hours of debugging, the issue: the storm events file had FIPS codes with leading zeros (01001). The geocode reference had them without leading zeros (1001). The join matched nothing. U-SQL's inner join produced zero rows, and since U-SQL's default join is inner, nothing was flagged as an error or warning. The job completed successfully because it did exactly what I told it to do.

The Monitoring Gap

This is the ADLA monitoring problem. A zero-row output on an inner join is a valid result — maybe every row legitimately had no match. ADLA had no way to know that was unexpected. The job graph showed all vertices green. Input bytes per vertex looked correct (the data was being read). Output bytes were zero — but zero is a valid output size if no rows match.

What was missing: row counts visible in the job monitoring UI. ADLA showed byte counts at each vertex but not row counts. A vertex that read 2GB of input and produced 0 bytes of output was invisible as anomalous until you already knew the output was wrong.

How to Catch It

Add explicit diagnostics to U-SQL scripts that might have join cardinality problems:

// Validate join cardinality before expensive downstream processing
@matchCount =
    SELECT COUNT(*) AS Matched
    FROM @stormEvents AS e
    JOIN @geocodes AS g
    ON e.CountyFips == g.CountyFips;

// Write diagnostic to a separate file — check this before proceeding
OUTPUT @matchCount
    TO "/staging/diagnostics/geocode-match-count.txt"
    USING Outputters.Tsv();

// Also sample unmatched to diagnose the mismatch
@unmatched =
    SELECT TOP 100 e.EventId, e.State, e.CountyFips
    FROM @stormEvents AS e
    LEFT OUTER JOIN @geocodes AS g
    ON e.CountyFips == g.CountyFips
    WHERE g.CountyFips == null;

OUTPUT @unmatched
    TO "/staging/diagnostics/geocode-unmatched-sample.csv"
    USING Outputters.Csv(outputHeader: true);

Run this diagnostic version first. Check the match count. If it's zero or unexpectedly low, look at the unmatched sample to see what the actual FIPS codes look like. Fix the normalization issue (CountyFips.PadLeft(5, '0') in this case), then run the full job.

The Lesson: Validate Before Scale

The expensive mistake was running the full six-hour job with bad join logic. The cheap version — the diagnostic script above — would have taken 20 minutes and told me the join was producing nothing before I committed to the full execution.

For any large U-SQL job with a JOIN, run the diagnostic pass first. Write the row counts and a sample of unmatched records before the main output. It adds 20 minutes to the development cycle and saves hours of waiting on broken jobs. That tradeoff is easy math. As always, I'm here to help.

The ADLA Job That Did Nothing for Six Hours: Understanding U-SQL Execution Plans

Shannon Lowder

The Job: Geocoding Storm Event Locations

The Actual Problem: JOIN Type Mismatch

The Monitoring Gap

How to Catch It

The Lesson: Validate Before Scale

Read more

The Context Problem Neither Agent Mesh Nor OpenSharing Solves

Unity AI Gateway and What a Governed Model Access Layer Actually Buys You

You Don't Need Fable. You Need a Router.

DAIS 2026: Genie One and the Context Problem Databricks Is Solving