DESCRIBE DETAIL: Reading Delta Table Properties and What They Mean
DESCRIBE TABLE in SQL Server gives you columns and data types. In Delta Lake, there's a related but more detailed command called DESCRIBE DETAIL that tells you things about the physical storage of your table — information that directly affects query performance. It's one of those commands you learn about late and then use constantly.
DESCRIBE DETAIL
DESCRIBE DETAIL ordersOr from Python:
from delta.tables import DeltaTable
detail = DeltaTable.forName(spark, "orders").detail()
display(detail)The output is a single row with columns that are easy to ignore until you know what they mean.
What the Columns Tell You
numFiles — the current number of Parquet files that make up the table. If this is 50,000 and your table has 10 million rows, you have a small-files problem. Ideal is roughly 100MB–1GB per file. Run OPTIMIZE if this number is very high relative to table size.
sizeInBytes — total size of all active data files, in bytes. Divide by numFiles to get average file size. Divide by 1,073,741,824 for GB.
detail_row = detail.collect()[0]
total_gb = detail_row.sizeInBytes / 1_073_741_824
avg_file_mb = (detail_row.sizeInBytes / detail_row.numFiles) / 1_048_576
print(f"Table size: {total_gb:.2f} GB across {detail_row.numFiles} files, avg {avg_file_mb:.1f} MB/file")partitionColumns — which columns the table is partitioned by. If this is empty and your table is large, consider whether partitioning would help your most common query patterns.
location — the actual storage path. Useful for confirming the table is pointing at the right storage location, especially after migrations or accidental table drops and recreates.
minReaderVersion / minWriterVersion — the Delta protocol versions required to read or write this table. If you're mixing Databricks Runtime versions across environments, this tells you the minimum version capable of accessing the table. Certain features (deletion vectors, column mapping) raise the protocol version and make the table inaccessible to older runtimes.
tableProperties — a map of all table-level configurations. This is where you'd see custom settings like delta.logRetentionDuration, delta.autoOptimize.optimizeWrite, or any properties you've set with ALTER TABLE.
The File Health Check Workflow
I run DESCRIBE DETAIL on a table when:
- Queries on it are slower than expected — high numFiles is usually why
- It's been running in production for a month and I haven't checked its physical state
- A pipeline that writes to it has been running without OPTIMIZE on a schedule
-- The full diagnostic pass
DESCRIBE DETAIL orders; -- physical state
DESCRIBE HISTORY orders LIMIT 5; -- recent operations
VACUUM orders DRY RUN; -- what VACUUM would clean upThose three commands together give you everything you need to diagnose performance issues and decide whether you need OPTIMIZE, VACUUM, or a repartitioning strategy. As always, I'm here to help.