Datadog Research Report · 2026

The AI Cloud Cost Visibility Gap

94% of organizations say AI is driving cloud costs higher, but only 6% have the real-time visibility to see it happen. New research on where the gap is widest, and what it's costing engineering teams.
📊 108 Engineering & FinOps Leaders
🌎 North America · Europe · APAC
📅 April 2026

94% of organizations say AI workloads have meaningfully increased their cloud spend in the past 12 months, and for the majority, AI is now one of their largest cost drivers. Yet only 6% have real-time visibility into those costs, and nearly half can't attribute AI spend to the teams or projects driving it.

The Headline Numbers
AI is reshaping cloud economics faster than visibility can keep up.
94%
say AI workloads have meaningfully increased their cloud spend over the past 12 months
6%
have real-time or near real-time visibility into cloud costs as they're incurred
45%
explicitly acknowledge significant blind spots in attributing AI-related cloud costs to specific teams, projects, or individuals
1
The New Cost Center

AI has become a primary cloud cost driver across complex, multi-vendor footprints.

For most organizations, AI is no longer a budget rounding error. It's a top-line budget item, and it's running on infrastructure that already spans clouds, on-prem environments, and a growing list of model providers. That sprawl is what makes it so hard to see clearly.

AI isn't theoretical for this audience. 98% of respondents are running AI/LLM workloads in production or staging today, and 82% have AI in production right now. Combined with the share of teams reporting that AI has materially increased their cloud spend, AI has effectively become a permanent line item.

"Is your organization currently running or experimenting with AI/LLM workloads in production or staging environments?"
AI is in production for 8 in 10 organizations
"How have AI and LLM workloads impacted your organization's overall cloud spending over the past 12 months?"
AI is now one of the largest cost drivers for most teams

That spend doesn't sit in a single cloud account. Only 19% of organizations run on a single public cloud; more than 80% operate across hybrid (46%) or multi-cloud (34%) footprints. And on top of that, most teams pull from multiple AI model providers, with an average of 2.5 vendors per respondent.

"Which best describes your organization's cloud infrastructure?"
Most teams operate across hybrid or multi-cloud environments
Multi-select
"Which AI/LLM providers and infrastructure does your organization use? (select all that apply)"
Multi-provider sprawl is the norm, not the exception
Respondents could select multiple options; totals exceed 100% (avg. 2.5 providers per organization)

Coming back to the broader pattern: when respondents were asked in their own words whether governance and cost controls have kept pace with AI adoption, the answer was nearly unanimous: they haven't. 99% described some kind of gap, most often partial ("catching up," "still maturing"), with 10% explicitly calling it significant.

"Have your governance and cost controls kept pace with AI adoption, or is there a gap?"
99% of organizations describe a governance gap
Open-ended question; responses classified by theme. 1 respondent stated no gap; 96 described a partial/maturing gap; 11 used strong language ("significant," "shadow AI," "playing catch-up").
Sample responses describing the gap
In their own words

"Shadow AI usage by teams without oversight."

Insurance · Single-cloud

"AI tools are introduced by individual teams faster than policies can be updated."

Insurance · Director, FinOps

"We're playing catch-up; no AI-specific cost policies."

Tech · CIO/CTO

Regional Lens

AI cost pressure isn't evenly distributed across regions

North America is most exposed to AI's cost growth: 72% of NA orgs call AI a top cost driver, vs 56% in Europe and 42% in APAC. Europe runs the heaviest hybrid footprint, reflecting on-prem and data sovereignty pressures.

"How have AI workloads impacted your cloud spending?" · broken down by region
North America is the most exposed to AI cost growth
"Which best describes your cloud infrastructure?" · broken down by region
Europe runs the heaviest hybrid footprint
Why It Matters

The AI cost surface area has multiplied, but cost monitoring hasn't.

When 57% of organizations name AI as one of their largest cost drivers, every cloud account, every Kubernetes cluster, and every model API becomes a separate billing surface to track. Combine the more than 80% running outside a single cloud with 2-3 model providers per team, and the result is a cost picture that's genuinely impossible to see from any single dashboard. In open-ended responses, 99% of respondents described some kind of governance gap, most often partial ("catching up," "still maturing"), with 10% using strong language like "significant gap," "shadow AI," or "playing catch-up."

"
A stolen Gemini API key caused $82,000 in unauthorized charges in 48 hours, and we were held fully liable. This forced us to freeze all non-critical AI projects.
CIO / CTO / VP Engineering · Technology & SaaS
"
There is a significant gap between the speed of AI adoption and the maturity of governance and cost controls. In practice, this gap shows up as shadow AI spending, where teams subscribe to tools or use API credits without central oversight, leading to surprise costs and compliance risks.
CIO / CTO / VP Engineering · Technology & SaaS · Multi-cloud
2
The Infrastructure Divide

Hybrid, multi-cloud, and single-cloud teams each face a different version of the cost problem.

That cost surface area looks very different depending on how an organization's cloud footprint is structured. Cutting the data by infrastructure type reveals three distinct cost stories, and three different first moves. Each segment has a clear weakest link, and it's often not the one its complexity would suggest.

Start with cost pressure. Multi-cloud teams feel AI's cost impact most acutely: 65% call AI a top cost driver, vs 52% of hybrid teams. The cost surface area scales with the number of providers being orchestrated.

"How have AI workloads impacted your cloud spending?" · by infrastructure type
Multi-cloud teams feel the sharpest AI cost pressure
Base: Hybrid n=50, Multi-cloud n=37, Single cloud n=21
"How long does it take to see your cloud costs?" · by infrastructure type
Hybrid teams have a clear visibility advantage
Base: Hybrid n=50, Multi-cloud n=37, Single cloud n=21

And yet single-cloud teams, despite the simpler footprint, have the worst visibility lag: 76% wait 4+ days to see costs, vs only 36% of hybrid teams. Counterintuitive, but consistent: hybrid teams have invested in cross-environment cost tooling because they had to. Multi-cloud teams, for their part, have the largest attribution blind spots, while single-cloud teams are the most likely to leave cost management finance-led.

"How effectively can you attribute AI costs?" · by infrastructure type
Multi-cloud teams have the largest attribution blind spots
Base: Hybrid n=50, Multi-cloud n=37, Single cloud n=21
"Who primarily owns cloud cost management?" · by infrastructure type
Single-cloud teams are most likely to be finance-led
Base: Hybrid n=50, Multi-cloud n=37, Single cloud n=21
The Pattern

Each infrastructure type has a different weakest link โ€” and a different first move.

Hybrid teams have the best visibility (8% real-time, 56% within 1–3 days) but split governance, with 36% putting cost ownership in Finance. Their priority is consolidating ownership and closing attribution gaps. Multi-cloud teams have the sharpest cost pressure and the worst attribution (51% with blind spots); their priority is unifying cost data across providers. Single-cloud teams have the slowest visibility cycle and the highest finance-led ownership rate; their priority is replacing native billing tools with engineering-grade cost observability.

"
There is a moderate gap. Governance and cost controls are still catching up to the speed of AI adoption โ€” especially with dynamic GPU instance costs and self-hosted models. This often results in unexpected monthly overruns and difficulty attributing spend to specific teams in real time.
CIO / CTO / VP Engineering · Technology & SaaS · Multi-cloud
"
Cost control is clearly lagging behind. Last month, we tried out three AI platforms, and the department paid for all of them on its own. Since there wasn't a unified procurement process, we ended up going over budget by quite a bit.
CIO / CTO / VP Engineering · Healthcare · Single-cloud
3
The Visibility Lag

Cost visibility, prediction, and ROI all run behind the workloads driving them.

When a runaway GPU job or a buggy LLM integration starts burning money, the cost shows up days later, long after the damage is done. That lag flows downstream into shaky forecasts, partial Kubernetes allocation, and ROI conversations that lack the data to land.

Start at the source: only 6% of organizations get cost data in real time. The majority wait days, and roughly 1 in 15 wait at least a week. When the source data lags by days, monthly forecasting becomes guesswork: nearly half of teams admit their AI cost predictions are routinely off by 10–30% or more.

"How long does it typically take from when cloud costs are incurred to when your team has visibility into them?"
More than half of teams wait 4+ days to see their costs
"How confident is your team in predicting AI-related cloud costs month over month?"
Nearly half can't forecast AI costs within 10% accuracy

The cumulative effect lands at the bill. 94% of respondents described at least one unexpected bill surprise or budget overrun tied to AI or cloud workloads in the past year, clustering around a small set of recurring causes: experimentation that scaled faster than budget, GPU instances left running, autoscaling overshoots, and underestimated forecasts.

"Has your organization experienced unexpected bill surprises or budget overruns tied to AI or cloud workloads?"
94% of organizations have experienced an AI or cloud bill surprise
Open-ended question; responses classified by theme. 101 described an incident, 6 explicitly said no, 1 unclear.
Sample incidents described
In their own words

"A subtle bug caused the system to re-prompt the AI model repeatedly... within two weeks of launch, the monthly cloud bill spiked by over $40,000."

Tech · CIO/CTO · Multi-cloud

"A GPU instance left running over a weekend caused a 40% monthly overrun."

Tech · Director, SRE/Platform/FinOps

"It wasn't until the end of the month that we realized the cost of AI computing power had exceeded the budget."

Tech · DevOps Engineer

The damage doesn't stop at the bill. Lagging visibility cascades into partial Kubernetes allocation (38% of K8s users have gaps globally, sharply higher in Europe; see Regional Lens) and into weakened ROI conversations with leadership: while 53% of teams can confidently justify AI's business value with clear metrics, the other 47% lack precise data, most stuck making "a reasonable case" rather than a data-backed one when leadership reviews budgets.

"How well can your organization allocate Kubernetes costs at a granular level (e.g., by namespace, service, or team)?"
38% of Kubernetes users have cost allocation gaps
Base: 101 Kubernetes-using respondents (excludes 7 non-Kubernetes respondents)
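Granular allocation of the kind this question probes boils down to grouping spend by workload labels. A minimal sketch, assuming cost records already carry namespace and team labels (the field names and figures are hypothetical); the "unallocated" bucket is exactly the allocation gap the survey describes:

```python
from collections import defaultdict

def allocate_costs(records, key="namespace"):
    """Sum per-workload cost records by a label (namespace, service, team).

    Each record is a dict like {"namespace": "checkout", "cost_usd": 12.4}.
    Records missing the label land in "unallocated" -- the bucket that
    shows up as a cost allocation gap.
    """
    totals = defaultdict(float)
    for rec in records:
        totals[rec.get(key) or "unallocated"] += rec["cost_usd"]
    return dict(totals)

# Illustrative records, not survey data.
records = [
    {"namespace": "checkout", "team": "payments", "cost_usd": 120.0},
    {"namespace": "checkout", "team": "payments", "cost_usd": 30.0},
    {"namespace": "ml-inference", "team": "ml", "cost_usd": 410.0},
    {"namespace": None, "team": None, "cost_usd": 55.0},  # untagged spend
]

print(allocate_costs(records))
# {'checkout': 150.0, 'ml-inference': 410.0, 'unallocated': 55.0}
```

The same function re-keyed by `team` gives the chargeback view; either way, untagged spend surfaces explicitly instead of disappearing into an aggregate.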
"How would you describe your organization's ability to justify the business value of AI spend to leadership?"
Nearly half can't fully justify AI's business value with precise data
Regional Lens

The visibility gap is far wider in Europe and APAC than in North America

Real-time cost visibility is almost exclusively a North American capability: 15% of NA teams have it vs 3% in Europe and 0% in APAC. And counterintuitively, Europe has the largest Kubernetes allocation gap of any region.

"How long does it take to see your cloud costs?" · broken down by region
Real-time visibility is almost exclusively a North American capability
"How well can you allocate Kubernetes costs at a granular level?" · broken down by region
Europe's K8s allocation gap is nearly 3× North America's
Base: 101 K8s users only · NA n=36, EU n=32, APAC n=33
The Real Cost of Latency

Every day of cost lag is a day you can't course-correct.

Only 6% of organizations get cost data in real time. The other 94% see it days later, and for roughly 1 in 15, a week or more. That's the window when overruns compound: runaway weekend GPU jobs, idle resources left forgotten, buggy LLM integrations looping all night, most often discovered only when finance flags the bill. In open-ended responses, 94% of respondents described at least one unexpected bill surprise or budget overrun tied to AI or cloud workloads. The cascade flows downstream: shaky forecasts (48% off by 10–30% or more), patchy K8s allocation (38% globally, much higher in Europe), and ROI conversations where 47% lack precise data to back up AI's business value.
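Back-of-the-envelope arithmetic makes the lag window concrete. The hourly rate below is a hypothetical round number for a multi-GPU training instance, not any provider's quoted price:

```python
# Hypothetical rate: a multi-GPU training instance is assumed at $32/hour
# on-demand (illustrative figure only).
HOURLY_RATE = 32.0

def lag_cost(hours_until_noticed, rate=HOURLY_RATE, instances=1):
    """Cost accrued by forgotten instances during the visibility lag window."""
    return hours_until_noticed * rate * instances

weekend = lag_cost(60)                     # Friday 6pm -> Monday 6am
four_day_lag = lag_cost(96, instances=3)   # a typical 4+ day reporting lag

print(f"One idle instance over a weekend: ${weekend:,.0f}")       # $1,920
print(f"Three idle instances over a 4-day lag: ${four_day_lag:,.0f}")  # $9,216
```

At rates in this range, a single forgotten weekend already costs four figures per instance, which is consistent with the 40% monthly overrun respondents describe.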

"
Unanticipated GPU training costs led to a quarterly budget freeze, halting new hires in our data science team.
CIO / CTO / VP Engineering · Fintech · Hybrid
"
When we were conducting batch processing in the cloud, we originally thought the cost would be relatively stable. However, the actual usage volume was higher than estimated, and the bill directly exceeded the budget. As a result, the management has begun to review the resource applications of each project more strictly.
CIO / CTO / VP Engineering · Gaming/Media · Multi-cloud
4
The Attribution Problem

Attribution gaps, split ownership, and tool sprawl are blocking real cost control.

Even when teams can see the bill, they often can't tell which team, project, or workload drove it, and they can't always agree on whose job it is to act on it. Cost data is spread across multiple platforms, ownership is split between engineering and finance, and automation is uneven.

Start with the core problem: 45% of organizations explicitly acknowledge significant blind spots in their ability to attribute AI costs to specific teams or projects. (The remaining 55% rate themselves "very effective," though the survey didn't measure smaller blind spots; see chart note.) And the responsibility for fixing it is divided: 47% put cost ownership on engineering leadership, 30% on finance, and the rest spread across FinOps teams or distributed roles. When two functions own a problem, in practice, neither one fully does.

"How effectively can your organization attribute AI-related cloud costs to specific teams, projects, or individuals?"
45% explicitly acknowledge significant attribution blind spots
Survey offered two response options for this question (no "not effectively" option); 100% of respondents fell into one of the two shown
"Who primarily owns cloud cost management at your organization?"
Cost ownership is split, most often between engineering and finance

That divided ownership is mirrored in divided tooling. Most teams stitch together 2 or more cost-tracking tools (native cloud billing, third-party FinOps, observability platforms, internal scripts, spreadsheets), and 12% have no consistent way to track AI-specific costs at all. Where automation does exist, it's strongest on optimization (rightsizing 66%, auto-scaling 64%) and weakest on cleanup and reactive controls: only 23% automate idle resource cleanup (the largest gap), and 47% lack automated alerting on spend anomalies, the safety net that would catch the next overrun in real time.

Multi-select
"Which tools does your organization use to track and manage cloud costs? (select all that apply)"
Most teams stitch together multiple cost tools
Respondents could select multiple options; totals exceed 100% (most orgs use 2+ tools)
Multi-select
"Which automated cost optimization capabilities are in place at your organization? (select all that apply)"
Most teams have optimization automation, but cleanup and alerting lag
Respondents could select multiple options; totals exceed 100%. Only 23% have automated idle resource cleanup; 47% lack automated spend anomaly alerting.
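The spend anomaly safety net that 47% of teams lack can be sketched with a deliberately simple trailing-baseline rule. The function name, window, and z-score threshold are illustrative choices, not any particular product's algorithm; real detectors account for seasonality, but the shape of the control is the same:

```python
from statistics import mean, stdev

def spend_anomalies(daily_spend, window=7, threshold=3.0):
    """Flag days whose spend deviates sharply from a trailing baseline.

    Compares each day against the mean and sample standard deviation of
    the previous `window` days and flags large positive z-scores.
    """
    alerts = []
    for i in range(window, len(daily_spend)):
        base = daily_spend[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma and (daily_spend[i] - mu) / sigma > threshold:
            alerts.append((i, daily_spend[i]))
    return alerts

# Illustrative series: steady ~$1k/day, then a runaway job on day 10.
spend = [1000, 980, 1020, 1010, 990, 1005, 995, 1015, 985, 1000, 4200]
print(spend_anomalies(spend))  # [(10, 4200)]
```

With same-day cost data this fires on the first anomalous day; with a 4+ day visibility lag, the same rule fires only after the overrun has already compounded.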
Regional Lens

Cost ownership flips dramatically by region

North America is overwhelmingly engineering-led: 54% Eng leadership + 28% FinOps under engineering = 82% engineering-aligned. APAC inverts the pattern: finance owns cost management at 46% of organizations, and 12% have no clear owner at all. Europe sits in between, with the most balanced split.

"Who primarily owns cloud cost management at your organization?" · broken down by region
North America puts cost on engineering. APAC puts it on finance.
The Bottom Line

You can't optimize what you can't attribute, and you can't attribute across siloed tools and split owners.

When 64% of teams use a third-party FinOps platform, 54% use native cloud billing tools, 41% use observability cost features, and 19% are still in spreadsheets, the data is in too many places to act on. Add split ownership between engineering (47%) and finance (30%), and accountability becomes a coordination problem on top of a data problem. That fragmentation is why bill shock keeps happening, and why even basic safety nets like automated idle resource cleanup (only 23% have it) and automated spend anomaly alerting (47% lack it) are missing for huge swaths of teams.

"
Governance and cost controls are still catching up to the speed of AI adoption โ€” especially with dynamic GPU instance costs and self-hosted models. This often results in unexpected monthly overruns and difficulty attributing spend to specific teams in real time.
CIO / CTO / VP Engineering · Technology & SaaS
"
AI tools are introduced by individual teams faster than policies can be updated. This leads to inconsistent usage rules, unclear data privacy practices, and limited visibility into who is using which AI services and how much they're costing.
Director, SRE / Platform Engineering / FinOps · Insurance
In Their Own Words

What teams want from the next generation of cost platforms

We asked respondents an open-ended question: "If you could design the ideal platform for managing cloud and AI spend, what would be non-negotiable? What's missing today?" Three themes came up over and over.

"
The ideal platform must provide real-time cost prediction, automated guardrails that prevent overspend, and unified visibility across both cloud providers and AI vendors. What's missing today is simple: most tools can monitor spend, but they can't reliably predict, control, or govern it at the pace AI workloads grow.
CIO / CTO / VP Engineering · Technology & SaaS
"
Real-time, job-level cost tracking tied to individual AI workloads is non-negotiable, yet most tools only offer delayed, aggregated cloud spend visibility.
CIO / CTO / VP Engineering · Fintech
"
The problem with many tools nowadays is that data is too scattered, and I have to switch back and forth between cloud platforms, AI platforms, and financial systems.
DevOps Engineer · Fintech
The Datadog Difference

Close the visibility gap with unified cloud & AI cost monitoring.

Datadog Cloud Cost Management brings cloud, Kubernetes, and AI spend into the same platform your engineers already use to monitor performance, so cost is no longer a separate, lagging dashboard but a real-time signal alongside the workloads driving it.

⚡

Real-time cost visibility

See cloud, Kubernetes, and AI spend in near real time, not days later when the bill arrives.

🎯

Granular attribution

Tie spend to specific teams, services, models, and workloads with the same tags you already use for observability.

🔔

Anomaly detection & alerts

Catch runaway GPU jobs, prompt loops, and forgotten resources the moment they appear, before the bill shock hits.

Explore Datadog Cloud Cost Management →

Methodology

Findings are based on a survey of engineering, platform, and FinOps leaders at organizations actively running or planning AI/LLM workloads.

108
Total respondents (all completed responses)
98%
Running or experimenting with AI/LLM workloads
94%
Use Kubernetes in production
Apr 2026
Survey fielded April 2026
"Which of the following best describes your primary role?"
Respondent roles
"What industry does your organization operate in?"
Industries represented
"What is your organization's primary geography?"
Geographic distribution
"How many employees does your organization have?"
Organization size by employee count
"Approximately how many software/infrastructure engineers are at your organization?"
Engineering org size
"Does your organization use Kubernetes in production?"
Kubernetes adoption
A note on percentages. All percentages are calculated against the total respondent base of 108 unless otherwise noted. The Kubernetes cost allocation charts (main and regional) use a filtered base of 101 respondents who run Kubernetes in production, since the question only applies to them. Multi-select questions (AI providers used, cost tracking tools, automation capabilities) reflect the share of respondents who selected each option, so totals exceed 100%. Single-select questions sum to 100% (with rounding handled via the Largest Remainder method). Regional cross-tabs shown in the "Regional Lens" callouts are calculated as the share of respondents within each region: North America (n=39), Europe (n=36), APAC (n=33). Infrastructure cross-tabs (Takeaway 2) use Hybrid (n=50), Multi-cloud (n=37), and Single-cloud (n=21) bases. Two open-ended questions in the survey, about governance gaps and bill surprises, were analyzed thematically; reported percentages from those questions reflect the share of respondents whose responses contained the relevant theme. The survey questions on AI cost attribution and Kubernetes cost allocation offered two response options each rather than a wider scale; this is disclosed on those individual charts. All findings reflect self-reported survey data; this report does not include external benchmarks.
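The Largest Remainder method named above floors each percentage, then hands the leftover points to the entries with the largest fractional remainders, so a chart's segments always sum exactly to 100. A minimal sketch:

```python
import math

def largest_remainder(counts, total_points=100):
    """Round shares to integers that sum exactly to `total_points`."""
    total = sum(counts)
    exact = [c * total_points / total for c in counts]
    floored = [math.floor(x) for x in exact]
    leftover = total_points - sum(floored)
    # Entries with the largest fractional remainders get the spare points.
    order = sorted(range(len(counts)),
                   key=lambda i: exact[i] - floored[i], reverse=True)
    for i in order[:leftover]:
        floored[i] += 1
    return floored

# The n=108 base split by the regional bases NA/EU/APAC: naive rounding
# of 36.1 / 33.3 / 30.6 would sum to 100 only by luck.
print(largest_remainder([39, 36, 33]))  # [36, 33, 31]
```

Flooring [36.1, 33.3, 30.6] leaves one spare point, which goes to APAC's 30.56 (the largest remainder), so the published shares total exactly 100.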
© 2026 Datadog Research · The AI Cloud Cost Visibility Gap