Retour au blog
BigQuery

Anatomy of a €50k/month BigQuery Bill

Detailed breakdown of a typical scale-up BigQuery bill. Where the money goes, how to detect it, and quick wins to reduce by 40-60%.

15 février 202615 min de lecture

Anatomy of a €50k/month BigQuery Bill

I regularly see BigQuery bills from growing scale-ups. The pattern is often the same: a bill that exploded without anyone really understanding why.

In 90% of the cases I audit, the bill tripled in less than a year, and the data team discovers the problem when the CFO shows up with questions.

In this article, I break down a typical €50k/month bill to show where the money goes and where the optimizations are hiding.


The typical profile

Let's imagine a classic e-commerce scale-up:

  • Size: 200 employees, 50 active BigQuery users (including 8 in the core data team)
  • Data team: 2 Data Engineers, 4 Analysts, 2 Data Scientists. The rest: Product, Finance, Marketing with Looker access
  • Volume: 2 TB of new data per day
  • Stack: BigQuery + dbt + Looker
  • History: 3 years of data, ~500 TB total

The monthly bill just went from €15k to €50k in 8 months. Nobody knows exactly why.

I saw this exact situation at a client: the Data Lead was convinced it was the data volume that had exploded. In reality, 60% of the increase came from a single poorly configured Looker dashboard, refreshing every 15 minutes for 30 users.


Bill breakdown

Overview

MONTHLY TOTAL: €52,340

├── BigQuery Analysis (Queries)    €35,200   67%
│   ├── On-demand queries          €28,000
│   └── BI Engine                   €7,200
│
├── BigQuery Storage               €8,500    16%
│   ├── Active storage             €6,200
│   └── Long-term storage          €2,300
│
├── Dataflow                       €4,800     9%
│
├── Cloud Storage (GCS)            €2,400     5%
│
└── Other (Pub/Sub, etc.)          €1,440     3%

First observation: 67% of costs come from queries. That's where you need to dig.

Note on BI Engine: That €7,200 corresponds to a 100 GB BI Engine memory reservation (~$0.0416/GB/hour × 730h × 100 GB). BI Engine accelerates Looker queries by keeping data in memory. The problem: many teams enable BI Engine "for performance" without checking if their dashboards actually benefit. In this case, only 3 out of 25 dashboards were actually using the BI Engine cache.


Compute analysis (€35k)

Who's consuming?

-- Top 10 consumers (last 30 days)
-- EUR price: $6.25/TiB × 0.92 EUR/USD ≈ €5.75/TiB
SELECT
  user_email,
  COUNT(*) as query_count,
  SUM(total_bytes_billed) / POW(1024, 4) AS tb_scanned,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 5.75, 2) AS cost_eur
FROM `region-eu`.INFORMATION_SCHEMA.JOBS
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
GROUP BY user_email
ORDER BY cost_eur DESC
LIMIT 10;

Typical result:

UserQueriesTB scannedCost
scheduler@project.iam45,0002,100€12,075
looker@project.iam128,0001,450€8,337
analyst-marie@3,200890€5,117
analyst-paul@2,800620€3,565
dbt-prod@1,200480€2,760

Insights:

  • The scheduler (Airflow/dbt) represents 35% of costs → pipelines to audit
  • Looker runs 128k queries → lots of cache misses, poorly optimized dashboards
  • 2 analysts consume as much as the rest of the team → unoptimized exploratory queries

Which queries cost the most?

-- Top 20 most expensive queries
SELECT
  job_id,
  user_email,
  ROUND(total_bytes_billed / POW(1024, 4) * 5.75, 2) AS cost_eur,
  ROUND(total_bytes_billed / POW(1024, 3), 1) AS gb_scanned,
  SUBSTR(query, 1, 200) AS query_preview
FROM `region-eu`.INFORMATION_SCHEMA.JOBS
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND job_type = 'QUERY'
ORDER BY total_bytes_billed DESC
LIMIT 20;

Recurring patterns:

  1. SELECT * on unpartitioned tables: One query, 2 TB scanned, €12
  2. Dashboards without date filters: Each refresh scans the entire history
  3. Poorly ordered JOINs: Combinatorial explosion
  4. Identical recurring queries: Same query 50x/day without cache

Strong opinion: I strongly advise against on-demand mode beyond €10k/month in consumption. At that point, switch to BigQuery Editions (Standard or Enterprise). On-demand is comfortable but it's a blank check. I've seen too many teams get surprised by a 3x bill after a pipeline incident or an overly curious intern.


Storage analysis (€8.5k)

Active vs long-term breakdown

Typical problems:

DatasetActiveLong-termProblem
raw_events180 TB20 TBNo lifecycle
staging45 TB0 TBTemp tables never deleted
snapshots80 TB10 TBSnapshots kept indefinitely

The Active/Long-term ratio should be ~30/70, here it's the opposite.

The last time I saw this pattern, the client was paying €3,000/month for staging tables created by a former employee who left 18 months ago. Nobody dared delete them "just in case".


The 6 quick wins

1. Enable require_partition_filter (savings: ~€8k/month)

ALTER TABLE `project.dataset.events`
SET OPTIONS (require_partition_filter = TRUE);

What people often misunderstand: Many think partitioning is enough. It's not. Without require_partition_filter, your users can still scan the entire table. They will, out of laziness or by mistake. Force the constraint, or it doesn't exist.

2. Fix Looker dashboards (savings: ~€4k/month)

  • Mandatory date filters
  • Materialized Views for common aggregations
  • BI Engine on frequent tables

3. Optimize dbt pipelines (savings: ~€3k/month)

# dbt_project.yml
models:
  +partition_by:
    field: event_date
    data_type: date
  +cluster_by: ["country", "event_type"]

4. Clean up staging (savings: ~€1.5k/month)

ALTER TABLE `project.staging.temp_table`
SET OPTIONS (expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY));

5. Lifecycle policy on raw_events (savings: ~€2k/month)

ALTER TABLE `project.raw.events`
SET OPTIONS (partition_expiration_days = 730);

6. Train the teams (savings: ~€3k/month)

  • No SELECT *
  • Always filter on the partition
  • Dry run before execution

Action plan

WeekActionEstimated savings
1Partition filters€8,000
2Top 5 dashboards€4,000
3dbt pipelines€3,000
4Cleanup + lifecycle€3,500
5Team training€3,000

Total: ~€21,500/month = 41% reduction


Conclusion

A €50k BigQuery bill is rarely inevitable. 30-50% can be saved with:

  1. Enforcing best practices (partition filters)
  2. Optimizing heavy consumers (dashboards, pipelines)
  3. Cleaning up storage (lifecycle, zombie tables)
  4. Training teams (cost-aware culture)

The hardest part isn't technical. It's maintaining these practices over time. I've seen teams clean up once, save 40%, then fall back to the same level 6 months later because nobody was monitoring.

If this type of content helps you, you can follow me on LinkedIn or subscribe to the newsletter for future articles.

Jonathan Kini

Jonathan Kini

J'aide les équipes data à réduire et maîtriser leurs coûts BigQuery et Snowflake, sans sacrifier la performance. De la startup aux environnements data à grande échelle.