databricks-cost-leak-hunter sample output source-cited design review · #790

A $100K/mo workspace is likely burning ~$27,000/month

That's ~$324K/year. The $100K/mo spend is the only assumed input; the 27% waste rate is published [1]. Every line below is one config change.
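The headline arithmetic can be reproduced in a few lines. This is a minimal sketch, assuming the $100K/month spend stated above and the published 27% waste rate; the function name `estimate_waste` is hypothetical and not part of the skill.

```python
def estimate_waste(monthly_spend: float, waste_rate: float = 0.27) -> dict:
    """Estimate wasted spend from a monthly bill and an assumed waste rate.

    waste_rate defaults to the 27% figure cited in the report; it is an
    industry-wide survey estimate, not a measurement of any one workspace.
    """
    monthly_waste = monthly_spend * waste_rate
    return {"monthly": monthly_waste, "annual": monthly_waste * 12}


# The only assumed input in the report: a $100K/month workspace.
estimate = estimate_waste(100_000)
print(f"${estimate['monthly']:,.0f}/month, ${estimate['annual']:,.0f}/year")
# prints "$27,000/month, $324,000/year"
```

Replacing `100_000` with a real monthly bill rescales the estimate; the waste rate itself stays a survey average.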

| # | Where it's leaking | $/month | The fix |
|---|---|---|---|
| 1 | Clusters that never auto-terminate. Idle compute is one of the largest cloud-waste categories; utilization is chronically low [3][9] | $12,000 | Set auto-termination to 30 min |
| 2 | Scheduled jobs on All-Purpose Compute. Billed at $0.55/DBU vs $0.15/DBU for Jobs Compute; 2–3× more for the same work [4][5][6] | $7,000 | Switch job clusters to Jobs Compute |
| 3 | Clusters sized for peak, idling below threshold. Production is typically overprovisioned 30–50% [3][10] | $5,000 | Turn on autoscaling, drop the floor |
| 4 | Photon billed at ~2× DBU on jobs it doesn't accelerate. The premium only pays off at a ≥2× speedup [7][6] | $3,000 | Disable Photon where it adds no runtime gain |
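Rows 2 and 4 rest on simple rate arithmetic, sketched below under stated assumptions: the two DBU rates are the cited AWS Premium list prices, Photon's DBU multiplier is taken as the cited ~2×, and DBU consumption is assumed to scale linearly with runtime. The function names are hypothetical. The list-rate ratio alone works out to about 3.7×; the 2–3× figure is the sources' number for whole workloads, and this sketch does not try to reconcile the two.

```python
ALL_PURPOSE_RATE = 0.55  # $/DBU, cited list price (AWS Premium)
JOBS_RATE = 0.15         # $/DBU, cited list price (AWS Premium)


def switch_savings(dbus: float) -> float:
    """DBU dollars saved by running the same DBUs on Jobs Compute
    instead of All-Purpose Compute (VM charges are not modeled)."""
    return dbus * (ALL_PURPOSE_RATE - JOBS_RATE)


def photon_cost_ratio(speedup: float, dbu_multiplier: float = 2.0) -> float:
    """DBU cost of a Photon run relative to the same job without Photon.

    Assumes DBUs scale linearly with runtime, so a job that runs
    `speedup` times faster burns 1/speedup of the hours at
    `dbu_multiplier` times the rate. Below 1.0 means Photon is cheaper.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return dbu_multiplier / speedup


# Break-even sits where the speedup equals the DBU multiplier.
for speedup in (1.0, 2.0, 4.0):
    print(speedup, photon_cost_ratio(speedup))
```

A job with no Photon speedup pays twice the DBUs for the same result, which is the case row 4 targets.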

The #1 line alone — auto-termination — is ~$144K/year, fixed in one setting.
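As an illustration of what "one setting" means, the sketch below patches cluster specs held as plain dictionaries. It assumes the specs use the `autotermination_minutes` field in the way the Databricks Clusters API does (unset or 0 meaning the cluster never auto-terminates); the cluster names and helper functions are hypothetical, and no API call is made.

```python
from copy import deepcopy


def never_auto_terminates(spec: dict) -> bool:
    """True when the spec has no idle timeout (field missing or set to 0)."""
    return spec.get("autotermination_minutes", 0) == 0


def with_auto_termination(spec: dict, minutes: int = 30) -> dict:
    """Return a copy of the spec with an idle timeout; the input is not mutated."""
    patched = deepcopy(spec)
    patched["autotermination_minutes"] = minutes
    return patched


# Hypothetical cluster specs, for illustration only.
clusters = [
    {"cluster_name": "adhoc-analytics", "autotermination_minutes": 0},
    {"cluster_name": "etl-shared"},
    {"cluster_name": "notebooks", "autotermination_minutes": 20},
]

# Patch only the clusters that would otherwise idle indefinitely.
fixed = [with_auto_termination(c) if never_auto_terminates(c) else c for c in clusters]
for c in fixed:
    print(c["cluster_name"], c["autotermination_minutes"])
```

Clusters that already have a timeout are left alone, so an existing shorter value such as 20 minutes is not lengthened to 30.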

For scale: Nucleus Research independently measured a 375% ROI / 6-month payback for one Databricks customer [8], so getting the platform's cost posture right has real, audited upside.

What's assumed, what's cited. The $100K/month workspace spend is the only assumed input — your number goes here. The 27% waste rate, the $0.55-vs-$0.15 rate gap and 2–3× multiplier, the ~2× Photon premium, and the idle-and-overprovisioned-dominate ranking are all from published sources (numbered below). The per-row dollar split is an illustrative allocation of the $27K, ranked by documented waste-category size. When the skill runs, every dollar figure is computed from the customer's own system.billing.usage table — never estimated.
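To make "computed from the customer's own system.billing.usage table" concrete, here is a rough sketch of the row 2 calculation over in-memory rows. It is not the skill's implementation. It assumes rows shaped like the billing system table, with `sku_name`, `usage_quantity` (in DBUs) and a `usage_metadata` struct carrying `job_id`; it also assumes All-Purpose SKUs can be recognized by `ALL_PURPOSE` in the SKU name and prices everything at the cited list rates rather than the account's actual prices.

```python
ALL_PURPOSE_RATE = 0.55  # $/DBU, cited list price; real accounts may differ
JOBS_RATE = 0.15         # $/DBU, cited list price; real accounts may differ


def jobs_on_all_purpose_overspend(rows: list[dict]) -> float:
    """Dollars overspent by scheduled jobs that ran on All-Purpose Compute.

    A row counts when its SKU name contains 'ALL_PURPOSE' (an assumed
    naming convention) and its usage_metadata carries a job_id.
    """
    misplaced_dbus = sum(
        row["usage_quantity"]
        for row in rows
        if "ALL_PURPOSE" in row["sku_name"]
        and (row.get("usage_metadata") or {}).get("job_id") is not None
    )
    return misplaced_dbus * (ALL_PURPOSE_RATE - JOBS_RATE)


# Hypothetical rows, for illustration only.
sample_rows = [
    {"sku_name": "PREMIUM_ALL_PURPOSE_COMPUTE", "usage_quantity": 100.0,
     "usage_metadata": {"job_id": "1234"}},   # scheduled job on All-Purpose
    {"sku_name": "PREMIUM_ALL_PURPOSE_COMPUTE", "usage_quantity": 50.0,
     "usage_metadata": {"job_id": None}},     # interactive use, not a leak
    {"sku_name": "PREMIUM_JOBS_COMPUTE", "usage_quantity": 200.0,
     "usage_metadata": {"job_id": "5678"}},   # already on Jobs Compute
]

print(f"${jobs_on_all_purpose_overspend(sample_rows):,.2f}")
```

Only the first sample row is counted: the second has no job attached and the third is already billed at the Jobs Compute rate.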

Sources

  1. Flexera, 2025 State of the Cloud Report — respondents estimate 27% of cloud spend is wasted, and 84% say managing cloud spend is the top cloud challenge. Press release · Report PDF
  2. CloudZero — Reduce Cloud Waste — cloud waste runs ~32% of spend (up to one-third), over $200B globally, corroborating Flexera. cloudzero.com/blog/cloud-waste
  3. CloudZero — Cloud Rightsizing — overprovisioned and idle resources are the largest waste categories: production is typically overprovisioned 30–50% (non-production 70%+), and on Kubernetes average CPU utilization is ~10% — "90 cents of every dollar spent on Kubernetes compute buys idle capacity." cloudzero.com/blog/cloud-rightsizing
  4. Flexera, Databricks pricing guide (2026): All-Purpose Compute $0.55/DBU, Jobs Compute $0.15/DBU (AWS Premium); "Using All-Purpose Compute for jobs that belong on Jobs Compute can cost 2 to 3 times more for the same workload." flexera.com/blog/finops/databricks-pricing-guide
  5. CloudZero, Databricks pricing guide — "All-Purpose Compute clusters … can cost 2–3X more per DBU than Jobs Compute clusters used for automated pipelines." cloudzero.com/blog/databricks-pricing
  6. Databricks, Best Practices for Cost Management (2022) — customers "saved tens of thousands of dollars by simply moving just ten percent of their workloads from all-purpose clusters to jobs clusters"; Photon delivers a 3–8× performance gain; spot instances give up to 90% off VM compute. databricks.com/blog/best-practices-cost-management-databricks
  7. Photon ~2× DBU premium on classic compute — "Databricks charges approximately 2× DBUs for Photon … the breakeven point is roughly a 2× speed improvement," so it only saves money when it makes the job at least 2× faster. B EYE — Photon guide · Databricks Community
  8. Nucleus Research, "Databricks ROI Case Study: Texas Rangers" (Mar 2024): 375% ROI, 6-month payback, 4× cost-effectiveness vs the prior cloud data warehouse, 61% data-team productivity gain. nucleusresearch.com
  9. Q. Liu & Z. Yu, ACM Symposium on Cloud Computing, 2018 — "The Elasticity and Plasticity in Semi-Containerized Co-locating Cloud Workload": datacenter resource utilization is "very low, which wastes a huge amount of infrastructure investment and energy." doi.org/10.1145/3267809.3267830
  10. I. Matthew, IEEE ACDSA, 2026 — "Enhancing Cloud Sustainability by Optimizing Cloud Computing Through Right-Sizing and Autoscaling": right-sizing plus threshold-based autoscaling measurably reduce idle and overprovisioned resource use. doi.org/10.1109/ACDSA67686.2026.11467824
Intent Solutions · databricks-pack Design review #790