The 5 Most Expensive Snowflake Mistakes
After analyzing many Snowflake environments, I consistently find the same mistakes. Each can represent 10-30% of the monthly bill.
In 90% of environments I audit, at least 3 of these 5 mistakes are present. It's not a question of team competence. It's just that Snowflake is configured by default for performance, not for cost.
Here are the 5 most common waste patterns, and how to fix them.
Mistake #1: Auto-suspend disabled or too long
The problem
By default, Snowflake creates warehouses with a 5-minute auto-suspend. Many admins disable it "to avoid startup times" or increase it to 10-15 minutes.
Result: The warehouse runs idle for hours.
I saw at one client an XLARGE warehouse with auto-suspend disabled "because users complained about the 2-second cold start". Cost of that decision: €18,000/month of pure waste. To avoid 2 seconds of waiting.
The calculation
A MEDIUM warehouse (4 credits/hour) with auto-suspend disabled:
Actual usage: 2h/day of queries
Time billed: 24h/day (never suspended)
Waste: 22h × 4 credits × $3 = $264/day = $7,920/month
The solution
-- Check all warehouses
SELECT
warehouse_name,
warehouse_size,
auto_suspend,
CASE
WHEN auto_suspend IS NULL THEN 'NEVER'
WHEN auto_suspend > 300 THEN 'TOO LONG'
ELSE 'OK'
END as status
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSES
WHERE deleted IS NULL;
-- Fix
ALTER WAREHOUSE my_warehouse SET AUTO_SUSPEND = 60;
Rule: 60 seconds for interactive workloads, 120 seconds maximum for batch.
My strong opinion: I NEVER recommend disabling auto-suspend, even on "critical" warehouses. Snowflake's cold start is 1-3 seconds. If your users can't wait 2 seconds, the problem isn't technical. It's a communication problem.
Typical savings: 20-40% of compute costs
Mistake #2: Oversized warehouses
The problem
"We took a LARGE to be sure we had enough power."
Except most queries don't need a LARGE. An X-SMALL or SMALL is enough for 80% of use cases.
The worst situation I've seen: a BI team with 15 analysts, each with their own dedicated MEDIUM warehouse. Average usage: 45 minutes per day. Monthly cost: €12,000. After consolidation to a single SMALL warehouse with auto-scaling: €800/month. Same perceived performance for users.
How to detect it
-- Analyze actual load per warehouse
SELECT
warehouse_name,
warehouse_size,
AVG(avg_running) as avg_concurrent_queries,
AVG(avg_queued_load) as avg_queue,
MAX(avg_queued_load) as max_queue
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_LOAD_HISTORY
WHERE start_time > DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY avg_concurrent_queries DESC;
Signs of oversizing:
avg_concurrent_queries< 1 → The warehouse is often idleavg_queue= 0 → No contention, no need for more power- Simple queries (< 10 sec) on a LARGE warehouse
The solution
-- Reduce size
ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'SMALL';
-- Or use auto-scaling
ALTER WAREHOUSE analytics_wh SET
WAREHOUSE_SIZE = 'SMALL'
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 3
SCALING_POLICY = 'STANDARD';
Rule: Start small, scale up if queue > 0 consistently.
Typical savings: 50-75% on affected warehouses
Mistake #3: Excessive Time Travel
The problem
Snowflake keeps by default:
- Standard: 1 day of Time Travel
- Enterprise: up to 90 days of Time Travel
Many companies leave the Enterprise default (90 days) on ALL tables, including:
- Temporary staging tables
- Log tables
- Tables recreated daily
The calculation
A 100 GB table modified daily with 90 days of Time Travel:
Time Travel storage: 100 GB × 90 versions = 9 TB
Cost: 9 TB × $23/TB/month = $207/month for ONE table
I saw at an e-commerce client their click tracking table with 90 days of Time Travel. This table was truncated and reloaded every day. Result: 2.3 TB of Time Travel storage for a table with 25 GB of active data. Auditing their entire environment, I found 47 tables with the same problem. Total cost: €8,400/month of useless Time Travel.
How to detect it
-- Tables with excessive Time Travel storage
SELECT
table_catalog,
table_schema,
table_name,
ROUND(active_bytes / POW(1024, 3), 2) as active_gb,
ROUND(time_travel_bytes / POW(1024, 3), 2) as time_travel_gb,
ROUND(time_travel_bytes / NULLIF(active_bytes, 0), 1) as ratio
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLE_STORAGE_METRICS
WHERE time_travel_bytes > active_bytes
ORDER BY time_travel_bytes DESC
LIMIT 20;
Red flag: ratio > 2 = Time Travel costs more than active data.
The solution
-- Reduce Time Travel on non-critical tables
ALTER TABLE staging.temp_data SET DATA_RETENTION_TIME_IN_DAYS = 1;
-- Temporary tables: 0 days
ALTER TABLE staging.daily_load SET DATA_RETENTION_TIME_IN_DAYS = 0;
-- Use TRANSIENT tables for staging
CREATE TRANSIENT TABLE staging.temp_data (...);
Typical savings: 15-30% of storage costs
Mistake #4: Queries without clustering on large tables
The problem
A 500 GB table without a clustering key. Every filtered query potentially scans the entire table.
Snowflake doesn't charge per TB scanned like BigQuery, but:
- More data scanned = longer execution time
- Longer time = more credits consumed
What teams often misunderstand: They think clustering is a "nice to have" optimization. In reality, on a table over 50 GB that's regularly filtered, NOT having a clustering key is a design error. Period.
How to detect it
-- Large tables without clustering
SELECT
table_catalog,
table_schema,
table_name,
ROUND(bytes / POW(1024, 3), 2) as size_gb,
clustering_key
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES
WHERE bytes > 10 * POW(1024, 3) -- > 10 GB
AND clustering_key IS NULL
AND deleted IS NULL
ORDER BY bytes DESC;
-- Long queries on these tables
SELECT
query_id,
query_text,
total_elapsed_time / 1000 as seconds,
bytes_scanned / POW(1024, 3) as gb_scanned
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time > DATEADD(day, -7, CURRENT_TIMESTAMP())
AND total_elapsed_time > 60000 -- > 1 minute
ORDER BY total_elapsed_time DESC
LIMIT 20;
The solution
-- Add a clustering key based on frequent filters
ALTER TABLE events CLUSTER BY (event_date, country);
-- Check effectiveness
SELECT SYSTEM$CLUSTERING_INFORMATION('events');
Rule: Cluster by the most filtered columns, in order of selectivity.
Typical savings: 30-60% of execution time on affected queries
Mistake #5: No Resource Monitors
The problem
Without resource monitors, nothing prevents:
- A runaway query from consuming 1000 credits
- A forgotten warehouse from running all weekend
- A user from launching massive queries by mistake
I've seen bills double over a weekend because of a misconfigured job. Literally: €45,000 gone between Friday evening and Monday morning because an infinite loop was relaunching the same ETL pipeline every 30 seconds.
The solution
-- Account-level monitor (safety net)
CREATE RESOURCE MONITOR account_safety
WITH CREDIT_QUOTA = 5000
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS
ON 75 PERCENT DO NOTIFY
ON 90 PERCENT DO NOTIFY
ON 100 PERCENT DO SUSPEND;
ALTER ACCOUNT SET RESOURCE_MONITOR = account_safety;
-- Monitor per critical warehouse
CREATE RESOURCE MONITOR prod_etl_monitor
WITH CREDIT_QUOTA = 500
FREQUENCY = MONTHLY
TRIGGERS
ON 80 PERCENT DO NOTIFY
ON 100 PERCENT DO SUSPEND;
ALTER WAREHOUSE prod_etl_wh SET RESOURCE_MONITOR = prod_etl_monitor;
Recommended levels
| Level | Quota | Action |
|---|---|---|
| 75% | - | Notification to FinOps team |
| 90% | - | Notification + investigation |
| 100% | - | Suspend (except critical prod) |
Typical savings: Protection against incidents (potentially thousands of €)
Quick diagnostic checklist
-- Complete diagnostic script
-- Run to quickly identify problems
-- 1. Misconfigured warehouses
SELECT warehouse_name, warehouse_size, auto_suspend,
CASE WHEN auto_suspend IS NULL OR auto_suspend > 300 THEN 'KO' ELSE 'OK' END as status
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSES WHERE deleted IS NULL;
-- 2. Top consumers (last 7 days)
SELECT warehouse_name, SUM(credits_used) as credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time > DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY 1 ORDER BY 2 DESC LIMIT 10;
-- 3. Excessive Time Travel
SELECT table_name,
ROUND(time_travel_bytes / POW(1024, 3), 1) as tt_gb,
ROUND(active_bytes / POW(1024, 3), 1) as active_gb
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLE_STORAGE_METRICS
WHERE time_travel_bytes > active_bytes
ORDER BY time_travel_bytes DESC LIMIT 10;
-- 4. Existing resource monitors
SELECT name, credit_quota, frequency, suspend_at, suspend_immediate_at
FROM SNOWFLAKE.ACCOUNT_USAGE.RESOURCE_MONITORS;
Honorable mentions
These mistakes didn't make the top 5, but I see them regularly:
-
Single warehouse shared between dev/prod: Dev queries block prod pipelines. A dedicated warehouse per environment costs less than a production incident.
-
Multi-cluster disabled on BI warehouse: 10 analysts waiting in queue on a single-cluster warehouse. Auto-scaling would have cost less than the lost time.
-
COPY INTO without purge: Files remain in the stage after loading. I've seen 12 TB of "temporary" CSV files accumulated over 18 months.
-
Production clones without thinking: A clone costs nothing at first, but as soon as data diverges, storage explodes. I've seen test environments cost more than prod.
-
Query acceleration enabled everywhere: The feature is billed per use. On already fast queries, it's pure waste.
Summary
| Mistake | Typical impact | Fix | Difficulty |
|---|---|---|---|
| Auto-suspend | 20-40% compute | 1 SQL command | Easy |
| Oversizing | 50-75% on WH | Analysis + resize | Medium |
| Time Travel | 15-30% storage | Per-table policy | Medium |
| No clustering | 30-60% on queries | ALTER CLUSTER BY | Medium |
| No monitors | Unlimited risk | CREATE MONITOR | Easy |
Total potential: 30-60% savings on the overall bill.
Conclusion
These 5 mistakes are present in almost every Snowflake environment I audit. The good news: they're all fixable in a few hours.
Auto-suspend at 60 seconds and resource monitors are the most impactful quick wins. Start there. You'll see results on your next bill.
If this type of content helps you, you can follow me on LinkedIn or subscribe to the newsletter for future articles.
Questions about your Snowflake environment? Feel free to reach out directly. I respond to all messages.