94% of organizations say AI workloads have meaningfully increased their cloud spend in the past 12 months, and for most, AI now ranks among their largest cost drivers. Only 6% have real-time visibility into those costs, and nearly half can't attribute AI spend to the teams or projects driving it.
For most organizations, AI now sits among the top categories of cloud spend. The infrastructure powering it spans clouds, on-prem environments, and a growing list of model providers, which makes the spend especially difficult to track.
98% of respondents are running AI/LLM workloads in production or staging today, with 82% in production right now. Combine that with the 94% of organizations reporting that AI has meaningfully increased their cloud spend, and AI now functions as a permanent line item.
That spend rarely sits in a single cloud account. Only 19% of organizations run on a single public cloud, while more than 80% operate across hybrid (46%) or multi-cloud (34%) footprints. Most teams also pull from multiple AI model providers, averaging 2.5 vendors per respondent.
When respondents described in their own words whether governance and cost controls have kept pace with AI adoption, the answer was nearly unanimous. 92% described some kind of gap, with 69% framing it as significant (using language like "lag," "outpaced," or "haven't kept pace") and another 23% calling it more partial ("catching up," "still maturing").
"Shadow AI usage by teams without oversight."
Insurance · Single-cloud
"AI tools are introduced by individual teams faster than policies can be updated."
Insurance · Director, FinOps
"We're playing catch-up; no AI-specific cost policies."
Tech · CIO/CTO
North America is most exposed to AI's cost growth: 72% of NA orgs call AI a top cost driver, compared to 56% in Europe and 42% in APAC. Europe runs the heaviest hybrid footprint, reflecting on-prem and data sovereignty pressures.
When 57% of organizations name AI as one of their largest cost drivers, every cloud account, Kubernetes cluster, and model API becomes a separate billing surface to track. Combining the more than 80% running outside a single cloud with 2-3 model providers per team produces a cost picture that no single dashboard can capture. In open-ended responses, 92% of respondents described some kind of governance gap, with 69% using strong language like "lag," "outpaced," or "haven't kept pace" and another 23% framing it as a partial gap of "catching up" or "still maturing."
The cost surface area looks very different depending on how an organization's cloud footprint is structured. Cutting the data by infrastructure type reveals three distinct cost stories, each pointing to a different first move. Each segment has a clear weakest link, and that weakest link doesn't always match the apparent complexity of the setup.
Multi-cloud teams feel AI's cost impact most acutely. 65% call AI a top cost driver, compared to 52% of hybrid teams. Cost pressure rises with the number of providers a team has to orchestrate.
Despite the simpler footprint, single-cloud teams have the worst visibility lag. 76% wait 4+ days to see costs, compared to only 36% of hybrid teams. The pattern holds because hybrid teams have invested in cross-environment cost tooling out of necessity. Multi-cloud teams have the worst Kubernetes allocation gap, while hybrid teams have the worst attribution blind spots.
Hybrid teams have the best visibility (8% real-time, 56% within 1-3 days) but split governance: 36% finance-owned, the highest of any group. Their priority is consolidating ownership and closing attribution gaps. Multi-cloud teams have the sharpest cost pressure and the worst attribution (51% with blind spots), so their priority is unifying cost data across providers. Single-cloud teams have the slowest visibility cycle and the highest finance-led ownership rate; their priority is replacing native billing tools with engineering-grade cost observability.
When a runaway GPU job or a buggy LLM integration starts burning money, the cost shows up days later, long after the damage is done. That lag flows downstream into shaky forecasts, partial Kubernetes allocation, and ROI conversations that lack supporting data.
Only 6% of organizations get cost data in real time. The majority wait days, and roughly 1 in 15 wait at least a week. When the source data lags by days, monthly forecasting becomes guesswork. Nearly half of teams admit their AI cost predictions are routinely off by 10–30% or more.
The bill is where this all lands. 91% of respondents described at least one unexpected bill surprise or budget overrun tied to AI or cloud workloads in the past year. The causes cluster around a few recurring patterns: experimentation that scaled faster than budget, GPU instances left running, autoscaling overshoots, and underestimated forecasts.
"A subtle bug caused the system to re-prompt the AI model repeatedly... within two weeks of launch, the monthly cloud bill spiked by over $40,000."
Tech · CIO/CTO · Multi-cloud
"A GPU instance left running over a weekend caused a 40% monthly overrun."
Tech · Director, SRE/Platform/FinOps
"It wasn't until the end of the month that we realized the cost of AI computing power had exceeded the budget."
Tech · DevOps Engineer
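The re-prompt loop in the first quote above is the kind of failure a hard per-task budget can contain before it reaches the bill. The following is a minimal sketch of that idea, not a method described by any respondent: the `CallBudget` class, its limits, and the flat per-call cost estimate are all hypothetical, and a real integration would derive cost from token counts and the provider's pricing.

```python
class CallBudget:
    """Cap the number of model calls and the estimated spend for one task.

    Hypothetical helper for illustration; the limits are arbitrary.
    """

    def __init__(self, max_calls, max_cost_usd):
        self.max_calls = max_calls
        self.max_cost_usd = max_cost_usd
        self.calls = 0
        self.cost_usd = 0.0

    def charge(self, estimated_cost_usd):
        """Record one call, raising before either limit would be exceeded."""
        if self.calls + 1 > self.max_calls:
            raise RuntimeError("call limit reached")
        if self.cost_usd + estimated_cost_usd > self.max_cost_usd:
            raise RuntimeError("cost limit reached")
        self.calls += 1
        self.cost_usd += estimated_cost_usd


budget = CallBudget(max_calls=3, max_cost_usd=0.10)
completed = 0
reason = None
try:
    while True:  # stands in for a buggy loop that keeps re-prompting
        budget.charge(0.02)  # assumed flat cost per call, in USD
        completed += 1       # a real model call would happen here
except RuntimeError as stop:
    reason = str(stop)

print(completed, reason)
```

The guard fails closed: the loop stops at the third call instead of running until someone notices the invoice.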
The bill is only one part of the impact. Lagging visibility also cascades into partial Kubernetes allocation (38% of K8s users have gaps globally, sharply higher in Europe; see Regional Lens) and into weakened ROI conversations with leadership. 53% of teams can confidently justify AI's business value with clear metrics, while the other 47% lack precise data, leaving most of them making "a reasonable case" rather than a data-backed one when leadership reviews budgets.
Real-time cost visibility is almost exclusively a North American capability. 15% of NA teams have it, compared to 3% in Europe and 0% in APAC. Europe also has the largest Kubernetes allocation gap of any region.
Only 6% of organizations get cost data in real time. The other 94% see it days later, and for roughly 1 in 15, a week or more. That window is when overruns compound: runaway weekend GPU jobs, idle resources left forgotten, and buggy LLM integrations looping all night, most often discovered only when finance flags the bill. In open-ended responses, 91% of respondents described at least one unexpected bill surprise or budget overrun tied to AI or cloud workloads. The downstream effects show up as shaky forecasts (48% off by 10–30% or more), patchy K8s allocation (38% globally, much higher in Europe), and ROI conversations where 47% lack precise data to back up AI's business value.
Even when teams can see the bill, they often can't tell which team, project, or workload drove it. They also can't always agree on whose job it is to act on it. Cost data is spread across multiple platforms, ownership is split between engineering and finance, and automation is uneven.
45% of organizations explicitly acknowledge significant blind spots in their ability to attribute AI costs to specific teams or projects. (The remaining 55% rate themselves "very effective," though the survey didn't measure smaller blind spots; see chart note.) Responsibility for fixing it is divided across functions. 47% put cost ownership on engineering leadership, 30% on finance, and the rest spread across FinOps teams or distributed roles. When two functions share ownership of a problem, accountability tends to fall through the cracks.
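One way to make an attribution blind spot concrete is to measure how much spend carries no owner tag at all. The sketch below assumes a simplified, hypothetical billing export in which each line item has a `cost_usd` field and an optional `tags` dictionary with a `team` key; real exports differ by provider, and the sample figures are invented for illustration.

```python
from collections import defaultdict


def attribute_spend(line_items, tag_key="team"):
    """Sum cost per tag value; spend without the tag is 'unattributed'."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get(tag_key) or "unattributed"
        totals[owner] += item["cost_usd"]
    return dict(totals)


def unattributed_share(totals):
    """Fraction of total spend that could not be tied to an owner."""
    total = sum(totals.values())
    return totals.get("unattributed", 0.0) / total if total else 0.0


# Hypothetical line items; two of the four carry no team tag.
line_items = [
    {"service": "gpu-training", "cost_usd": 1200.0, "tags": {"team": "ml-platform"}},
    {"service": "llm-api", "cost_usd": 800.0, "tags": {}},
    {"service": "k8s-inference", "cost_usd": 500.0, "tags": {"team": "search"}},
    {"service": "llm-api", "cost_usd": 500.0},
]

totals = attribute_spend(line_items)
share = unattributed_share(totals)
print(totals)
print(f"unattributed share: {share:.0%}")
```

Tracking that share over time gives engineering and finance a single number to own jointly, whichever function ends up responsible for reducing it.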
Divided ownership is mirrored in divided tooling. Most teams stitch together 2 or more cost-tracking tools across native cloud billing, third-party FinOps, observability platforms, internal scripts, and spreadsheets. 12% have no consistent way to track AI-specific costs at all. Where automation does exist, it leans toward optimization (rightsizing 66%, auto-scaling 64%) and away from cleanup and reactive controls. Only 23% automate idle resource cleanup (the largest gap), and 47% lack automated alerting on spend anomalies, the safety net that would catch the next overrun before it grows.
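Automated alerting on spend anomalies does not have to start out sophisticated. As a rough sketch under stated assumptions, the check below flags an hour whose spend exceeds a multiple of the trailing median. The function name, the 3x multiplier, the minimum history length, and the sample figures are all hypothetical choices, and production systems typically account for seasonality and planned launches as well.

```python
from statistics import median


def is_spend_anomaly(history, latest, multiplier=3.0, min_points=6):
    """Return True if `latest` exceeds `multiplier` times the trailing median.

    `history` is a list of recent spend samples (for example USD per hour).
    With fewer than `min_points` samples the check declines to alert.
    """
    if len(history) < min_points:
        return False
    baseline = median(history)
    return baseline > 0 and latest > multiplier * baseline


# Hypothetical hourly spend in USD for one workload.
hourly_spend = [41.0, 39.5, 40.2, 42.1, 38.9, 40.7]

print(is_spend_anomaly(hourly_spend, 43.0))   # False: within the normal range
print(is_spend_anomaly(hourly_spend, 180.0))  # True: more than 3x the median
```

The median is used here instead of the mean so that a single earlier spike does not inflate the baseline and mask the next one.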
North America is overwhelmingly engineering-led: 54% engineering leadership plus 28% FinOps under engineering totals 82% engineering-aligned. APAC shows the opposite pattern, with finance owning cost management at 46% of organizations, and 12% reporting no clear owner at all. Europe sits in between, with the most balanced split.
When 64% of teams use a third-party FinOps platform, 54% use native cloud billing tools, 41% use observability cost features, and 19% are still in spreadsheets, cost data lives in too many places to act on. Layer split ownership between engineering (47%) and finance (30%) on top of that, and accountability becomes a coordination problem as well as a data problem. The fragmentation explains why bill shock keeps happening, and why basic safety nets like automated idle resource cleanup (only 23% have it) and automated spend anomaly alerting (47% lack it) are missing for so many teams.
We asked respondents an open-ended question: "If you could design the ideal platform for managing cloud and AI spend, what would be non-negotiable? What's missing today?" Three themes came up repeatedly.
Datadog Cloud Cost Management brings cloud, Kubernetes, and AI spend into the same platform your engineers already use to monitor performance. Instead of a separate, lagging dashboard, cost becomes a real-time signal alongside the workloads driving it.
See cloud, Kubernetes, and AI spend in near real time, instead of days later when the bill arrives.
Tie spend to specific teams, services, models, and workloads with the same tags you already use for observability.
Catch runaway GPU jobs, prompt loops, and forgotten resources the moment they appear, well before bill shock hits.
Findings are based on a survey of engineering, platform, and FinOps leaders at organizations actively running or planning AI/LLM workloads.