The Snowflake Credit Card: When Convenience Becomes a Dependency
I want to talk about Snowflake — not to dismiss it, because it genuinely solves real problems, but to be precise about the nature of the relationship you're entering when you build on it. "Convenience" and "dependency" are not the same word, but in practice they often point to the same situation.
Snowflake launched publicly in late 2014. The architecture is clever: storage lives in S3 (or Azure Blob Storage or Google Cloud Storage, depending on your cloud), and compute runs in isolated virtual warehouses that scale independently of each other and of the storage layer. You can spin up a large warehouse to run an expensive query and shut it down when it's done. You pay for what you use. Multiple teams can run queries simultaneously without competing for the same pool of compute. For an analytics team that has been fighting over a shared Redshift cluster, this sounds like paradise.
It is. Until you look at the bill. Then it sounds like something else.
How the Credits Work
Snowflake billing is denominated in credits. One credit buys one hour of compute on the smallest warehouse size (XS). The price per credit varies by edition, cloud provider, and region, but runs roughly $2–$4 per credit on a pay-as-you-go basis. Each step up in warehouse size doubles the rate: an XS warehouse uses 1 credit per hour, an XL uses 16, and a 4XL uses 128.
The model is transparent on paper. In practice, the cost of a given query is hard to predict. A query that runs on an XL warehouse for 6 minutes uses 1.6 credits (16 credits per hour for a tenth of an hour). That same query, hitting an unclustered table that forces a full scan, might take 40 minutes on an XS warehouse and use 0.67 credits: slower, but cheaper. Or it might spill to disk, take 90 minutes, and use 1.5 credits. The only way to know is to run it, watch the credit burn, and tune, which itself burns credits.
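To make that arithmetic concrete, here is a minimal Python sketch. The size table follows the doubling pattern behind the XS, XL, and 4XL rates quoted above; the function names are illustrative, the $2 to $4 band is the rough pay-as-you-go range rather than a quoted price, and the sketch assumes simple pro-rata billing with no minimum charge per warehouse start.

```python
# Credits per hour by warehouse size: each step up doubles the rate.
SIZES = ["XS", "S", "M", "L", "XL", "2XL", "3XL", "4XL"]
CREDITS_PER_HOUR = {size: 2 ** i for i, size in enumerate(SIZES)}


def credits_used(size: str, minutes: float) -> float:
    """Credits consumed by a warehouse of `size` running for `minutes`."""
    return CREDITS_PER_HOUR[size] * minutes / 60


def dollar_range(credits: float, low: float = 2.0, high: float = 4.0) -> tuple:
    """Cost band for `credits`, assuming a rough $2-$4 per-credit price."""
    return credits * low, credits * high


if __name__ == "__main__":
    # The three scenarios from the text: fast on XL, slow on XS, spilling on XS.
    for size, minutes in [("XL", 6), ("XS", 40), ("XS", 90)]:
        credits = credits_used(size, minutes)
        cheap, pricey = dollar_range(credits)
        print(f"{size:>3} for {minutes:>2} min: {credits:.2f} credits "
              f"(${cheap:.2f} to ${pricey:.2f})")
```

The point of writing it down is that the fastest run is not the cheapest one, and the spread between the scenarios is wider than intuition suggests.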
The optimization loop is: run query, see credit cost, adjust warehouse size or clustering keys or query structure, run again. Repeat. Every iteration costs money.
The Format Is Not Open
Here is the lock-in question the sales conversation glosses over: your data in Snowflake is stored in Snowflake's proprietary micro-partition format. You cannot read this format with Spark. You cannot read it with Presto. You cannot read it with anything except Snowflake.
If you want to leave Snowflake (price increase, company acquisition, feature gap, compliance requirement), you unload your data to S3 with COPY INTO, then reload it into whatever comes next. For a large data warehouse, that export is a project: it takes time, it burns warehouse credits, and the more data you have, the larger the bill for the export itself.
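As a rough sketch of what that export looks like in practice, the Python below builds one COPY INTO unload statement per table. The bucket path, the storage integration name, and the table names are hypothetical placeholders, and the helper only generates SQL text; the statements would still have to be run through a Snowflake client, on a warehouse that bills credits while they run.

```python
def unload_statement(table: str, s3_prefix: str, integration: str) -> str:
    """Build a COPY INTO <location> statement that unloads `table` as Parquet."""
    # One folder per table keeps the exported files separable on reload.
    target = f"{s3_prefix.rstrip('/')}/{table.lower()}/"
    return (
        f"COPY INTO '{target}'\n"
        f"  FROM {table}\n"
        f"  STORAGE_INTEGRATION = {integration}\n"
        f"  FILE_FORMAT = (TYPE = PARQUET);"
    )


if __name__ == "__main__":
    # Hypothetical table names and locations, for illustration only.
    for name in ["ORDERS", "CUSTOMERS", "EVENTS"]:
        print(unload_statement(name, "s3://example-bucket/export", "my_s3_integration"))
        print()
```

Multiply this by every table, view definition, and permission grant in the warehouse and the word "project" starts to look accurate.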
Your data is not hostage in the sense that Snowflake will refuse to give it back. It's hostage in the sense that getting it back requires a migration project rather than just pointing a different processing engine at the same storage layer.
SQL Only, By Design
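Snowflake is a SQL-first platform. Your data scientists who want to run Python, your engineers who want to write Spark transformations, your ML team who needs to call Python libraries: for the most part, they don't get to do that inside Snowflake. Snowpark has narrowed the gap by running Python, Java, and Scala code inside Snowflake, but it is Snowflake's own runtime rather than the Spark or Python environment those teams already have. So they export data out, process it in their own environment, and (maybe) load results back in.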
That's a seam in your architecture. Every seam is a place where data gets copied, latency is introduced, and lineage gets broken. If your analytics layer and your ML infrastructure are perpetually extracting data from Snowflake rather than processing it in place, you're paying for the extract on every cycle.
The Right Use Case
None of this means Snowflake is the wrong tool. It is an excellent tool for analytics teams that live entirely in SQL, have workloads with variable query demand, and sit inside an organization that can absorb the credit cost model. The virtual warehouse architecture is genuinely good for the workloads it was designed for.
What it's not is a neutral, open foundation you can build the rest of your data infrastructure on without accepting a specific set of constraints. Understand those constraints before you're three years in with a warehouse that costs $40K a month and a team that has never needed to think about what happens if that price doubles.
If you're evaluating Snowflake right now and want to pressure-test the lock-in surface, I'm happy to walk through it with your specific use case. As always, I'm here to help.