94% of organizations say AI workloads have meaningfully increased their cloud spend in the past 12 months, and for the majority, AI is now one of their largest cost drivers. Yet only 6% have real-time visibility into those costs, and nearly half can't attribute AI spend to the teams or projects driving it.
For most organizations, AI is no longer a budget rounding error. It's a top-line cost, and it's running on infrastructure that already spans clouds, on-prem environments, and a growing list of model providers. That sprawl is what makes it so hard to see clearly.
AI isn't theoretical for this audience: 98% of respondents are running AI/LLM workloads in production or staging today, and 82% have AI in production right now. Combined with the share of teams reporting that AI has materially increased their cloud spend, AI has effectively become a permanent line item.
That spend doesn't sit in a single cloud account. Only 19% of organizations run on a single public cloud; more than 80% operate across hybrid (46%) or multi-cloud (34%) footprints. On top of that, most teams pull from multiple AI model providers, with an average of 2.5 vendors per respondent.
Coming back to the broader pattern: when respondents were asked in their own words whether governance and cost controls have kept pace with AI adoption, the answer was nearly unanimous: they haven't. 99% described some kind of gap, most often partial ("catching up," "still maturing"), with 10% explicitly calling it significant.
"Shadow AI usage by teams without oversight."
Insurance · Single-cloud
"AI tools are introduced by individual teams faster than policies can be updated."
Insurance · Director, FinOps
"We're playing catch-up; no AI-specific cost policies."
Tech · CIO/CTO
North America is most exposed to AI's cost growth: 72% of NA orgs call AI a top cost driver, vs 56% in Europe and 42% in APAC. Europe runs the heaviest hybrid footprint, reflecting on-prem and data sovereignty pressures.
When 57% of organizations name AI as one of their largest cost drivers, every cloud account, every Kubernetes cluster, and every model API becomes a separate billing surface to track. Combine the more than 80% running outside a single cloud with 2-3 model providers per team, and the result is a cost picture that's genuinely impossible to see from any single dashboard. In open-ended responses, 99% of respondents described some kind of governance gap, most often partial ("catching up," "still maturing"), with 10% using strong language like "significant gap," "shadow AI," or "playing catch-up."
That cost surface area looks very different depending on how an organization's cloud footprint is structured. Cutting the data by infrastructure type reveals three distinct cost stories, and three different first moves. Each segment has a clear weakest link, and it's often not the one its complexity would suggest.
Start with cost pressure. Multi-cloud teams feel AI's cost impact most acutely: 65% call AI a top cost driver, vs 52% of hybrid teams. The cost surface area scales with the number of providers being orchestrated.
And yet single-cloud teams, despite the simpler footprint, have the worst visibility lag: 76% wait 4+ days to see costs, vs only 36% of hybrid teams. Counterintuitive, but consistent: hybrid teams have invested in cross-environment cost tooling because they had to. Multi-cloud teams have the worst Kubernetes allocation gap, while hybrid teams have the worst attribution blind spots.
Hybrid teams have the best visibility (8% real-time, 56% within 1–3 days) but split governance: 36% Finance-owned, the highest of any group. Their priority is consolidating ownership and closing attribution gaps. Multi-cloud teams have the sharpest cost pressure and the worst attribution (51% with blind spots); their priority is unifying cost data across providers. Single-cloud teams have the slowest visibility cycle and the highest finance-led ownership rate; their priority is replacing native billing tools with engineering-grade cost observability.
When a runaway GPU job or a buggy LLM integration starts burning money, the cost shows up days later, long after the damage is done. That lag flows downstream into shaky forecasts, partial Kubernetes allocation, and ROI conversations that lack the data to land.
Start at the source: only 6% of organizations get cost data in real time. The majority wait days, and roughly 1 in 15 wait at least a week. When the source data lags by days, monthly forecasting becomes guesswork: nearly half of teams admit their AI cost predictions are routinely off by 10–30% or more.
The cumulative effect lands at the bill. 94% of respondents described at least one unexpected bill surprise or budget overrun tied to AI or cloud workloads in the past year, clustering around a small set of recurring causes: experimentation that scaled faster than budget, GPU instances left running, autoscaling overshoots, and underestimated forecasts.
"A subtle bug caused the system to re-prompt the AI model repeatedly... within two weeks of launch, the monthly cloud bill spiked by over $40,000."
Tech · CIO/CTO · Multi-cloud
"A GPU instance left running over a weekend caused a 40% monthly overrun."
Tech · Director, SRE/Platform/FinOps
"It wasn't until the end of the month that we realized the cost of AI computing power had exceeded the budget."
Tech · DevOps Engineer
The damage doesn't stop at the bill. Lagging visibility cascades into partial Kubernetes allocation (38% of K8s users have gaps globally, sharply higher in Europe; see Regional Lens) and into weakened ROI conversations with leadership: while 53% of teams can confidently justify AI's business value with clear metrics, the other 47% lack precise data, most stuck making "a reasonable case" rather than a data-backed one when leadership reviews budgets.
Real-time cost visibility is almost exclusively a North American capability: 15% of NA teams have it, vs 3% in Europe and 0% in APAC. And counterintuitively, Europe has the largest Kubernetes allocation gap of any region.
Only 6% of organizations get cost data in real time. The other 94% see it days later, and for roughly 1 in 15, a week or more. That's the window when overruns compound: runaway weekend GPU jobs, idle resources left forgotten, buggy LLM integrations looping all night, most often discovered only when finance flags the bill. In open-ended responses, 94% of respondents described at least one unexpected bill surprise or budget overrun tied to AI or cloud workloads. The cascade flows downstream: shaky forecasts (48% off by 10–30% or more), patchy K8s allocation (38% globally, much higher in Europe), and ROI conversations where 47% lack precise data to back up AI's business value.
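The anomaly-alerting safety net that would catch these overruns can be sketched in a few lines: keep a rolling baseline over recent daily spend and flag any day that jumps well outside it. This is a minimal illustration, not any vendor's algorithm; the function name, window size, and threshold below are assumptions chosen for the example.

```python
from statistics import mean, stdev

def spend_anomalies(daily_costs, window=7, threshold=3.0):
    """Flag days whose cost exceeds a rolling baseline.

    daily_costs: chronological list of (day_label, cost) pairs.
    A day is flagged when its cost exceeds the trailing window's
    mean by `threshold` standard deviations (with a small floor
    so a perfectly flat baseline doesn't over-trigger).
    """
    flagged = []
    for i in range(window, len(daily_costs)):
        baseline = [c for _, c in daily_costs[i - window:i]]
        mu, sigma = mean(baseline), stdev(baseline)
        day, cost = daily_costs[i]
        if cost > mu + threshold * max(sigma, 0.05 * mu):
            flagged.append(day)
    return flagged

# A runaway GPU job starting on day 10 triples daily spend.
costs = [(f"day{i}", 100.0 + (i % 3)) for i in range(10)]
costs += [(f"day{i}", 300.0) for i in range(10, 13)]
# Only the first anomalous day is flagged here: once the spike
# enters the baseline, the rolling stats absorb it.
print(spend_anomalies(costs))  # ['day10']
```

The point of the sketch is latency: with daily (let alone real-time) cost data, the alert fires on day one of the overrun rather than when the monthly bill arrives.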
Even when teams can see the bill, they often can't tell which team, project, or workload drove it โ and they can't always agree on whose job it is to act on it. Cost data is spread across multiple platforms, ownership is split between engineering and finance, and automation is uneven.
Start with the core problem: 45% of organizations explicitly acknowledge significant blind spots in their ability to attribute AI costs to specific teams or projects. (The remaining 55% rate themselves "very effective," though the survey didn't measure smaller blind spots; see chart note.) And responsibility for fixing it is divided: 47% put cost ownership on engineering leadership, 30% on finance, and the rest is spread across FinOps teams or distributed roles. When two functions own a problem, in practice neither one fully does.
That divided ownership is mirrored in divided tooling. Most teams stitch together two or more cost-tracking tools (native cloud billing, third-party FinOps, observability platforms, internal scripts, spreadsheets), and 12% have no consistent way to track AI-specific costs at all. Where automation does exist, it's strongest on optimization (rightsizing 66%, auto-scaling 64%) and weakest on cleanup and reactive controls: only 23% automate idle resource cleanup (the largest gap), and 47% lack automated alerting on spend anomalies, the safety net that would catch the next overrun in real time.
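The idle-cleanup gap is worth making concrete, because the mechanism is simple: a last-activity timestamp per resource and a TTL. The sketch below only reports idle GPU instances rather than calling any real cloud API; the names, fields, and two-hour TTL are all hypothetical, and a production sweeper would stop or terminate the hits via its provider's SDK.

```python
from datetime import datetime, timedelta

IDLE_TTL = timedelta(hours=2)  # assumption: tune per workload

def find_idle(resources, now):
    """Return IDs of resources idle longer than IDLE_TTL.

    resources: list of dicts with 'id' and 'last_activity' (datetime).
    This sketch just reports candidates; a real sweeper would call
    the cloud provider's stop/terminate API on each one.
    """
    return [r["id"] for r in resources
            if now - r["last_activity"] > IDLE_TTL]

now = datetime(2025, 6, 2, 9, 0)  # Monday morning
fleet = [
    # Left running since Friday evening: the classic weekend overrun.
    {"id": "gpu-a", "last_activity": datetime(2025, 5, 30, 18, 0)},
    # Active half an hour ago: left alone.
    {"id": "gpu-b", "last_activity": datetime(2025, 6, 2, 8, 30)},
]
print(find_idle(fleet, now))  # ['gpu-a']
```

Run on a schedule, even this crude check would have caught the "GPU instance left running over a weekend" overrun quoted above within hours instead of a billing cycle.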
North America is overwhelmingly engineering-led: 54% Eng leadership + 28% FinOps under engineering = 82% engineering-aligned. APAC inverts the pattern: finance owns cost management at 46% of organizations, and 12% have no clear owner at all. Europe sits in between, with the most balanced split.
When 64% of teams use a third-party FinOps platform, 54% use native cloud billing tools, 41% use observability cost features, and 19% are still in spreadsheets, the data is in too many places to act on. Add split ownership between engineering (47%) and finance (30%), and accountability becomes a coordination problem on top of a data problem. That fragmentation is why bill shock keeps happening, and why even basic safety nets like automated idle resource cleanup (only 23% have it) and automated spend anomaly alerting (47% lack it) are missing for huge swaths of teams.
We asked respondents an open-ended question: "If you could design the ideal platform for managing cloud and AI spend, what would be non-negotiable? What's missing today?" Three themes came up over and over.
Datadog Cloud Cost Management brings cloud, Kubernetes, and AI spend into the same platform your engineers already use to monitor performance, so cost is no longer a separate, lagging dashboard but a real-time signal alongside the workloads driving it.
See cloud, Kubernetes, and AI spend in near real time, not days later when the bill arrives.
Tie spend to specific teams, services, models, and workloads with the same tags you already use for observability.
Catch runaway GPU jobs, prompt loops, and forgotten resources the moment they appear, before the bill shock hits.
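The prompt-loop failure mode, the one behind the $40,000 re-prompt bug quoted earlier, can also be guarded client-side with a simple circuit breaker: cap how often a caller may hit the model within a time window. This is an illustrative sketch, not Datadog's implementation or any provider's API; the class name and limits are hypothetical.

```python
import time

class PromptBudget:
    """Circuit breaker that halts LLM calls when a caller
    re-prompts too often within a sliding time window."""

    def __init__(self, max_calls=50, window_s=60.0):
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls = []  # timestamps of recent allowed calls

    def allow(self, now=None):
        """Return True if another model call is permitted now.

        When the budget is exhausted the caller should surface an
        alert and back off, not retry in a loop.
        """
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        self.calls = [t for t in self.calls if now - t < self.window_s]
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True

# Tight limits for the demo: 3 calls per 60 seconds.
budget = PromptBudget(max_calls=3, window_s=60.0)
results = [budget.allow(now=t) for t in (0.0, 1.0, 2.0, 3.0)]
print(results)  # [True, True, True, False]
```

A guard like this turns a two-week, $40,000 discovery into a tripped breaker within a minute of the bug shipping.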
Findings are based on a survey of engineering, platform, and FinOps leaders at organizations actively running or planning AI/LLM workloads.