I've been writing about tSQLt since 2012 and Great Expectations since 2017. I've used both across different clients and projects, and I get asked regularly: which one should I use? The answer is almost always: both, depending on where in your stack you are.
This post is the explicit comparison I've been building toward for three years.
The Same Principle
Both frameworks exist to answer the same question: does my data behave the way I expect it to? Both use an assertion model — you define a claim about your data, the framework checks it, and you get a structured pass/fail result. Both integrate with CI pipelines. Both produce output that tells you exactly which assertion failed and why.
The principle is identical. The implementation is different because the runtime is different.
What's Actually Different
| Dimension | tSQLt | Great Expectations |
|---|---|---|
| Language | T-SQL | Python |
| Runtime | SQL Server (CLR) | Python process; pushes to DB via SQLAlchemy |
| What it tests | Stored procedure / trigger / view logic | Data in any datasource (files, DataFrames, SQL tables) |
| Setup cost | CLR enabled, TRUSTWORTHY ON, install assembly | pip install, configure datasource |
| Test isolation | Transaction rollback (automatic) | No built-in isolation — you manage test data |
| Suite format | SQL stored procedures in schemas | JSON files |
| CI integration | sqlcmd + JUnit XML | Python script + exit code |
| Failure output | Test runner results table | ValidationResult JSON + Data Docs HTML |
Where Each One Belongs
Use tSQLt when the logic lives in SQL Server: stored procedures with branching logic, triggers, functions. If you're asking "does my GetPendingOrders stored procedure correctly filter by status?" — that's tSQLt territory. The logic is in the database; the test should be too.
Use Great Expectations when you're asserting properties of data at rest or in motion, regardless of where it lives: a CSV file arriving from a vendor, the output of a Python transformation, the contents of a SQL table after a load. If you're asking "does this data look like what I expected?" — that's GE territory. The question is about the data, not about the logic that produced it.
The practical overlap in a SQL Server data engineering shop:
# Great Expectations: assert the incoming CSV matches the contract
validate_raw_file('storm_events_2020_q3.csv', 'suites/storm_raw.json')
# Transform the data (Python)
clean_df = transform_storm_data(raw_df)
# Great Expectations: assert the transformation output is valid
validate_output(clean_df, 'suites/storm_clean.json')
# Load to SQL Server (via SSIS or sqlalchemy)
load_to_sql_server(clean_df, 'dbo.StormEvents')
# tSQLt: assert the stored procedures that process the loaded data work correctly
# (runs in CI, validates dbo.GetStormSummary, dbo.CalculateRisk, etc.)
The Shared Lesson
The reason I kept writing about tSQLt after 2012, and started writing about Great Expectations in 2017, isn't that these specific tools are irreplaceable. It's that the practice of encoding data contracts as executable assertions — and running those assertions automatically — is genuinely different from hoping data is correct. The tool is secondary. The habit is primary.
Whether you're asserting in T-SQL inside a transaction rollback, or in Python against a JSON expectation suite, you're doing the same thing: making your expectations about your data explicit and verifiable rather than implicit and discovered when production breaks.
Pick the right tool for the layer you're working in. Use both if your stack spans both. As always, I'm here to help.