Data Quality vs. Data Testing: Understanding What Great Expectations Actually Solves

There's a distinction worth making explicit before going further with Great Expectations: data quality monitoring and data testing are related but not the same thing. Conflating them leads to setting up the wrong tool for the job — or expecting too much from the right tool.

Data Quality Monitoring

Data quality monitoring is an operational concern. It asks: over time, is the data in my systems staying within acceptable parameters? Are null rates trending up? Is row count drifting? Are value distributions shifting in unexpected ways?

Monitoring is usually continuous and statistical. You're tracking metrics over time and alerting when they drift outside a threshold. Tools purpose-built for monitoring keep history, visualize trends, and alert on anomalies.
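As a rough illustration of the monitoring mindset, here is a minimal sketch that tracks one metric (null rate) across a series of batches and flags drift against a trailing baseline. The function names, the window of 3, the 0.05 tolerance, and the sample batches are all assumptions chosen for the example, not something a particular monitoring tool prescribes.

```python
from statistics import mean

def null_rate(values):
    """Fraction of entries that are None in one batch of a column."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)

def drifted(history, current, window=3, tolerance=0.05):
    """True if `current` exceeds the mean of the last `window` readings by more than `tolerance`."""
    if len(history) < window:
        return False  # not enough history to form a baseline yet
    baseline = mean(history[-window:])
    return current - baseline > tolerance

# Hypothetical daily batches of one column, 10 values each,
# containing 0, 1, 0, 0 and 4 nulls respectively.
batches = [[None] * n + [1.0] * (10 - n) for n in (0, 1, 0, 0, 4)]

history = []
alerts = []
for batch in batches:
    rate = null_rate(batch)
    alerts.append(drifted(history, rate))
    history.append(rate)  # the metric is kept, which is what makes this monitoring
```

The point of the sketch is the `history` list: the decision for each batch depends on what came before it, which is exactly what a point-in-time test does not need.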

Data Testing

Data testing is a development and pipeline concern. It asks: at this specific moment, when this specific dataset arrives or leaves my pipeline, does it conform to the contract I've defined? Did something go wrong between when the data was produced and when I'm about to use it?

Testing is point-in-time and binary: the dataset either meets the expectations or it doesn't. You run the test, you get a result, you decide whether to proceed.
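To make the contrast concrete, here is a minimal sketch of a point-in-time check in plain Python, with no Great Expectations involved. The contract (a required `event_id` and a non-negative `wind_speed`) and the field names are hypothetical.

```python
def meets_contract(rows):
    """Return True only if every row satisfies every rule; no history, no trend."""
    for row in rows:
        if row.get("event_id") is None:
            return False
        speed = row.get("wind_speed")
        if speed is None or speed < 0:
            return False
    return True

# Hypothetical batches arriving at a pipeline boundary
good_batch = [
    {"event_id": 1, "wind_speed": 42.0},
    {"event_id": 2, "wind_speed": 0.0},
]
bad_batch = [{"event_id": 3, "wind_speed": -5.0}]

proceed_good = meets_contract(good_batch)
proceed_bad = meets_contract(bad_batch)
```

Each call stands alone: the answer for one batch does not depend on any batch that came before it.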

Great Expectations is primarily a data testing tool. It gives you the ability to assert data contracts at pipeline boundaries — at ingestion, at transformation output, at delivery to a downstream consumer. It's not primarily a monitoring dashboard (though it can generate useful reports, which I'll cover later).

Where Data Testing Fits in a Pipeline

Think of your data pipeline as having clear handoff points: raw data arrives → transformations are applied → clean data is delivered to a downstream consumer. Each of those handoffs is a place where a data contract can be validated:

import great_expectations as ge

# Note: ge.read_csv, ge.from_pandas and dataset.validate() belong to the older
# dataset-style API; newer releases of the library expose a different interface.

def ingest_storm_data(filepath):
    df = ge.read_csv(filepath)

    # Gate 1: raw data contract. Does what arrived match what we expected?
    raw_result = df.validate(expectation_suite='suites/storm_raw.json')
    if not raw_result['success']:
        raise ValueError(f"Raw data failed validation: {filepath}")

    # ... apply transformations ...
    # transform_storm_data is assumed to be defined elsewhere in the project
    clean_df = transform_storm_data(df)

    # Gate 2: transformed data contract. Did our transforms produce valid output?
    clean_ge = ge.from_pandas(clean_df)
    clean_result = clean_ge.validate(expectation_suite='suites/storm_clean.json')
    if not clean_result['success']:
        raise ValueError(f"Transformed data failed output validation: {filepath}")

    return clean_df

Two suites, two validation gates, two distinct contracts. The raw suite documents what the source is supposed to deliver. The clean suite documents what your transformations are supposed to produce. If either fails, you know which stage failed and which expectations were violated.
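For reference, a suite file such as `suites/storm_raw.json` might look roughly like the following. The expectation types are standard Great Expectations expectations, but the suite name, column names, and bounds are hypothetical. The exact top-level keys can differ between library versions, and some versions may expect additional keys such as `meta`. The sketch builds the suite as a plain Python dictionary and serializes it, without importing Great Expectations.

```python
import json

# Hypothetical raw-data contract for the storm dataset
raw_suite = {
    "expectation_suite_name": "storm_raw",  # assumed name
    "expectations": [
        {
            "expectation_type": "expect_column_to_exist",
            "kwargs": {"column": "event_id"},
        },
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {"column": "event_id"},
        },
        {
            "expectation_type": "expect_column_values_to_be_between",
            "kwargs": {"column": "wind_speed", "min_value": 0, "max_value": 250},
        },
    ],
}

# The serialized form is what would be stored under suites/
suite_json = json.dumps(raw_suite, indent=2)
```

Because the suite is just data, it can live in version control next to the pipeline code and be reviewed like any other contract change.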

What Great Expectations Doesn't Do

It's worth being direct about the gaps. Great Expectations doesn't:

  • Track data quality metrics over time automatically (you'd build that on top of validation results)
  • Alert your on-call team when thresholds drift
  • Diagnose why data is bad — it tells you that it's bad and what the violation looks like
  • Fix bad data (obviously)
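The first gap in the list (no automatic history) is typically closed by persisting a small summary of each validation run. A minimal sketch, assuming the result exposes a `success` flag and a `statistics` mapping with `evaluated_expectations` and `unsuccessful_expectations` counts; the `summarize` helper, the in-memory `history` list, and the sample result are hypothetical stand-ins for a real result and whatever store you use.

```python
from datetime import datetime, timezone

def summarize(result, suite_name, run_time=None):
    """Reduce one validation result to a row suitable for a metrics table."""
    stats = result["statistics"]
    evaluated = stats["evaluated_expectations"]
    failed = stats["unsuccessful_expectations"]
    run_time = run_time or datetime.now(timezone.utc)
    return {
        "suite": suite_name,
        "run_time": run_time.isoformat(),
        "success": result["success"],
        "failure_rate": failed / evaluated if evaluated else 0.0,
    }

# Hand-written stand-in for a validation result (values are made up)
fake_result = {
    "success": False,
    "statistics": {
        "evaluated_expectations": 4,
        "successful_expectations": 3,
        "unsuccessful_expectations": 1,
        "success_percent": 75.0,
    },
}

history = []  # in practice: a database table, a file, or a metrics system
history.append(
    summarize(fake_result, "storm_raw", datetime(2024, 1, 1, tzinfo=timezone.utc))
)
```

Trending and alerting would then operate on `history`, not on Great Expectations itself.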

What it does exceptionally well: define a data contract in code, run that contract against any compatible dataset, and produce a structured result that tells you exactly what passed, what failed, and what the failing values looked like. That's the core capability. Everything else is built on top of it.
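As a sketch of what reading that structured result can look like: the dictionary below is hand-written to mimic the general shape of a validation result (a `results` list whose entries carry `success`, an `expectation_config`, and a `result` with `unexpected_count` and `partial_unexpected_list`). The sample values are made up, and field names may vary by version, so treat the layout as an assumption to check against your own output.

```python
def failed_expectations(validation_result):
    """Collect a compact description of every expectation that did not pass."""
    failures = []
    for item in validation_result["results"]:
        if item["success"]:
            continue
        config = item["expectation_config"]
        detail = item.get("result", {})
        failures.append({
            "expectation": config["expectation_type"],
            "column": config["kwargs"].get("column"),
            "unexpected_count": detail.get("unexpected_count"),
            "examples": detail.get("partial_unexpected_list", []),
        })
    return failures

# Hand-written stand-in for a validation result
sample_result = {
    "success": False,
    "results": [
        {
            "success": True,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_not_be_null",
                "kwargs": {"column": "event_id"},
            },
            "result": {"unexpected_count": 0, "partial_unexpected_list": []},
        },
        {
            "success": False,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_be_between",
                "kwargs": {"column": "wind_speed", "min_value": 0, "max_value": 250},
            },
            "result": {"unexpected_count": 2, "partial_unexpected_list": [-5.0, 999.0]},
        },
    ],
}

failures = failed_expectations(sample_result)
```

A summary like `failures` is what you would put in the exception message or log line at a validation gate, so the person investigating sees the violated rule and sample values immediately.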

The analogy to software testing holds: a unit test suite doesn't monitor your production application for anomalies. It verifies that a specific version of your code produces a specific expected output. Data testing is the same idea applied to data.
