Datadog Research // April–May 2026

The State of AI in Observability.

How 121 senior engineering leaders are deploying AI in observability today, where it is delivering value, what is holding back broader autonomy, and where the next year of investment is going.

Sample
121 senior leaders
Threshold
500+ employees
Geography
US, UK, Australia
Format
Conversational survey
At a glance

AI has moved from experiment to production in observability. 79% of senior engineering leaders say AI is now a significant or mature part of their stack, 87% plan to increase investment over the next 12 to 18 months, and root cause analysis is where teams say AI is delivering the biggest impact today. This report looks at where AI runs in observability environments, what value it is delivering, what is holding back broader autonomy, and where leaders are putting their next dollar.

79%
Say AI is a significant or mature part of their observability strategy today
87%
Plan to increase AI observability investment over the next 12 to 18 months
58%
Are putting their next dollar into data quality, pipeline, or telemetry work
About the respondents

Senior leaders with budget — and an active buying habit.

Every respondent is a senior engineering leader at a 500+ employee company who owns a budget or directly influences purchasing for engineering tooling. This is also an active buying audience: 93% are running three or more vendor evaluation processes per year, and 28% are running five or more.

121
Senior engineering leaders surveyed across US, UK, and Australia
9
Distinct senior roles, from CTO and VP of Engineering to AI Platform Lead
93%
Are running three or more vendor evaluations for engineering tooling per year
Screener · Multi-select
Budget areas these leaders own or influence
Multi-select · n=121 · respondents own or influence multiple budget areas, so values exceed 100%
Q7 · Survey question
Is AI adoption happening as part of a coordinated org-wide strategy, or is it emerging team by team?
AI adoption is coordinated at the top but mixed in execution. Only 16% of teams report pure bottom-up adoption.
Single-select · n=116 valid responses · values sum to 100%
01 · Finding

AI is already in production across observability stacks.

AI in observability has crossed from experiment into production. 79% of senior engineering leaders say AI is now a significant or mature part of their observability strategy, and adoption is no longer concentrated in a single workflow. Teams report applying AI across log summarization, anomaly detection, capacity planning, and root cause analysis. The value they describe is concrete and consistent: faster resolution and earlier detection. Adoption maturity is not evenly distributed. UK and Australian teams report higher rates of mature deployment than their US peers.

Q5 · Survey question
How far along is your organization in adopting AI for observability?
AI has crossed from experiment to production for 79% of leaders.
Single-select · n=114 valid responses · values sum to 100%
Q5 · Cross-tab by region
How far along is your organization in adopting AI for observability?
UK and Australian teams report higher AI maturity than US peers.
Each row sums to 100% · US n=59 · UK n=36 · Australia n=19 · directional
Q1 · Survey question
Which observability or engineering workflows are you currently using AI for?
Adoption is broad: log summarization (74%) and anomaly detection (67%) lead the list.
Multi-select · n=118 · respondents could choose multiple workflows, so values exceed 100%
Q2 · Most impactful workflow
Of the workflows you mentioned, which one do you think is most impacted by your use of AI today — and why?
Most impactful workflow named by respondents.
Open-ended responses classified into workflow categories · n=109 classified · values sum to 100%
Q2 · "And why" portion of the same response
Of the workflows you mentioned, which one do you think is most impacted by your use of AI today — and why?
Why teams say that workflow has the biggest impact.
Open-ended responses classified by reason · n=106 substantive responses · respondents could surface multiple reasons, so totals exceed 100%
Key Insight

Across 8 distinct observability workflows, leaders report using AI for log summarization (74%), anomaly detection (67%), and capacity planning (63%) at the highest rates. When asked which workflow AI most impacts today, 30% point to root cause analysis. The reasons cluster tightly: 40% credit AI with cutting time to resolution, and 36% credit it with catching issues earlier or more accurately. AI in observability has stopped being a question of whether. Buyers are now asking how far to take it.

"Log summarization/pattern recognition. Why? Because we usually used to manually sift through thousands or millions of log lines. AI can now condense into a clear, human readable summary in seconds."

— Senior engineering leader, 500-999 employees

"Root cause analysis, because AI helps quickly connect signals from logs, metrics, and traces, which reduces the time it takes to pinpoint what actually caused an issue."

— Director of Engineering, 2,500-4,999 employees
02 · Finding

But teams still can't trust it to act autonomously.

The ceiling on AI value is autonomy. Only 18% of leaders are comfortable letting AI operate autonomously across production workflows. The other 82% want a human in the loop in some form. The regional split is sharp: just 9% of US leaders are comfortable with full autonomy, compared with 29% in the UK and 25% in Australia.

Q11 · Survey question
What level of autonomy are you comfortable giving AI in your production environment today?
Only 18% of leaders are comfortable with full AI autonomy in production.
Single-select · n=105 valid responses · values sum to 100%
Q11 · Cross-tab by region
What level of autonomy are you comfortable giving AI in your production environment today?
Comfort with AI autonomy varies sharply by region.
Each row sums to 100% · US n=55 · UK n=34 · Australia n=16 · directional
Q13 · Pick top 2
What are the most important prerequisites for trusting AI more broadly in your engineering workflows?
What would unlock trust: proof, explainability, consistency.
Multi-select (pick top 2) · n=104 valid respondents · totals exceed 100% by design
Key Insight

Teams have proven AI works on individual workflows. The next dollar of ROI sits behind a trust threshold that 82% of leaders haven't crossed. The unlock is specific: a proven track record of accurate outputs (62%), explainable and auditable decision-making (51%), and consistent performance over time (44%). Only 21% picked visibility into model internals. Buyers want evidence that AI works, not a window into the model. UK leaders weight regulatory oversight at 44%, nearly double the US (26%).

Data quality alone won't do it: even among leaders who say their monitoring is strong enough for AI to perform well, only 48% are comfortable with high autonomy. Better data is necessary, but the trust prerequisites above also need to be met. The next section shows why the data foundation is the harder of the two requirements to satisfy today.

"The biggest gap is trust at scale. Today AI is helpful for insights and simple actions, but it is not consistent enough to fully rely on for more complex decisions across all systems. Over the next year or two we need it to be more accurate, context aware and dependable, so we can safely expand from suggestion to action."

— CTO, 500-999 employees

"AI tooling mostly reacts to problems but it needs to be proactive and self operating, going from a monitoring assistant to an autonomous operator."

— CTO, 500-999 employees
03 · Finding

The constraint sits in the data layer, not just the model.

When asked directly whether their monitoring quality limits AI performance, only 42% of leaders say their monitoring is strong enough, and 23% explicitly say monitoring gaps significantly limit AI today. The same answer shows up across the open-ended questions. When leaders describe what is hardest, what is missing, and what is currently preventing their ideal AI capability from working perfectly, they return to the same handful of themes. Trust and accuracy are the symptoms named most often. When leaders describe the underlying cause, telemetry quality and fragmented data come up over and over.

Q10 · Survey question
What impact does your monitoring quality have on the performance of your AI tools — and when monitoring falls short, what breaks?
Only 42% say their monitoring is strong enough for AI to perform well.
Single-select · n=108 valid responses · values sum to 100%
Q10 · Cross-tab by region
What impact does your monitoring quality have on the performance of your AI tools — and when monitoring falls short, what breaks?
UK leaders are the most candid: only 28% say their monitoring is strong enough.
Each row sums to 100% · US n=54 · UK n=35 · Australia n=19 · directional
Q8 · Survey question
What has been the hardest part of getting AI to work reliably in your observability workflows?
Data quality (23%) is the most-named single hardest part of getting AI to work.
Single-select (pick the biggest) · n=110 valid responses · values sum to 100%
Q8 · Cross-tab by region
What has been the hardest part of getting AI to work reliably in your observability workflows?
The hardest part of getting AI to work reliably differs by region.
Single-select · top 5 of 8 categories shown · lower-incidence categories omitted, so regional values do not sum to 100% · US n=58 · UK n=35 · Australia n=17 · directional
Q21 + Q22 · Themes coded from two open-ended questions
Themes in what leaders said is missing from their AI tooling today
Q21: What's the biggest gap between what your AI tooling does for you today and what you need it to do over the next year or two?
Q22: If you could wave a wand and have one AI capability work perfectly tomorrow, what would it be — and what's currently preventing that from happening?
Trust (66%) is the dominant theme; data quality (31%) is the recurring foundational concern.
Open-ended responses on biggest gap and "wand capability" · n=101 substantive responses
Q22 · "What's preventing" portion
If you could wave a wand and have one AI capability work perfectly tomorrow, what would it be — and what's currently preventing that from happening?
What's preventing the AI capability leaders most want to work perfectly.
Open-ended responses to Q22 · n=100 substantive (out of 121: 20 left blank, 1 gave a one-word answer) · 79 of 100 named at least one blocker theme · respondents could name multiple blockers, so totals exceed 100%
Key Insight

The most direct evidence is the simplest: only 42% of leaders say their monitoring is strong enough for AI to perform well, and 23% explicitly say monitoring gaps significantly limit AI today. UK leaders are the most candid. Just 28% say their monitoring is strong enough, the lowest of any region. Open-ended responses converge on the same answer. When asked what is preventing their ideal AI from working perfectly, 53% named trust, accuracy, or reliability concerns, and 36% named data quality or fragmented telemetry. Across all four open-ended questions, 60% of respondents (73 of 121 unique people across Q8, Q19, Q21, and Q22) flagged data quality, telemetry, or monitoring as a constraint somewhere in their answers. The diagnosis varies by region: US leaders point to data quality as the single hardest part (28%), UK leaders split between trust and cost (20% each), and Australian respondents call out integration with legacy systems (29%). The model itself is rarely where leaders say AI breaks. The data feeding it is the constraint that decides how far AI can extend.

What the constraint costs in practice. Asked to walk through the most recent time AI fell short or monitoring gaps mattered, leaders described tangible, executive-level consequences: 70% named wasted engineer time, 52% named slower incident response and stretched MTTR, and 48% named direct customer-facing impact. The cost of the data-layer constraint is paid in engineer-hours and customer experience, not in abstract AI safety concerns.

"I'd want consistently accurate, explainable root cause analysis that works across all services and dependencies in real time, and what's preventing it is fragmented/low-quality telemetry and incomplete system context that makes it hard for AI to reliably correlate signals across the full stack."

— Director of Cloud / Reliability, 5,000+ employees

"I would choose full automated root caused remediation. The goal is for the system to not just identify the issues, but safely resolve it. Currently, fragmented legacy data and a lack of high confidence telemetry prevent this, as the risk of an automated false fix causing a larger outage is still too high."

— Head of Infrastructure, 5,000+ employees

"Perfect automated root-cause analysis. Data fragmentation and poor telemetry currently prevent the AI from connecting the dots without manual help."

— Director of Engineering, 1,000-2,499 employees
04 · Finding

Investment is flowing toward the data foundation.

Investment patterns confirm the diagnosis. 87% of leaders plan to increase observability AI investment over the next 12 to 18 months, and the destination of those dollars is unusually consistent. Data quality, pipeline, and telemetry work is the single largest category. The platform shape teams want signals the same intent: hybrid AI that combines turnkey defaults with the ability to customize. Leaders are spending against the constraint they named.

Q17 · Survey question
How do you expect your organization's investment in AI for observability to change over the next 12–18 months?
87% of leaders are increasing observability AI investment over the next 12 to 18 months.
Single-select · n=103 valid responses · values sum to 100%
Q18 · Themes coded from open-ended responses
What are the key drivers behind that decision — what's making the case internally?
Efficiency pressure (60%) and executive interest (54%) are the dominant investment drivers.
Open-ended responses coded into themes by keyword matching · n=81 substantive responses · respondents could name multiple drivers, so totals exceed 100%
Q19 · Pick top 2
Where specifically are those investment dollars going?
Data quality, pipeline, and telemetry is the single largest investment category.
Multi-select (pick top 2) · n=99 valid respondents · totals exceed 100% by design
Q3 · Survey question
How is AI delivered in your observability environment today?
Two-thirds of teams already deliver AI natively in the platform (35%) or via a combination of native and integrated approaches (31%).
Single-select · n=116 valid responses · values sum to 100%
Q15 · Survey question
When it comes to AI in your observability stack, which approach do you prefer?
85% want hybrid (50%) or fully customizable (35%); only 9% want fully out-of-the-box.
Single-select · n=104 valid responses · values sum to 100%
Q15 · Cross-tab by region
When it comes to AI in your observability stack, which approach do you prefer?
Preferred AI approach inverts between the US and Australia.
Each row sums to 100% · US n=54 · UK n=34 · Australia n=16 · directional
Key Insight

Two signals point in the same direction. Where the dollars go: 58% of leaders are putting investment into data quality, pipeline, or telemetry work, the single largest category, ahead of buying new AI platforms (36%), building internal tooling (34%), or hiring AI engineers (15%). What shape they want: 85% prefer either a hybrid platform (50%) or full customization (35%). Only 9% want fully out-of-the-box AI, and two-thirds of teams already deliver AI either natively in their observability platform (35%) or in a combination of native and integrated approaches (31%). The platform-shape preference inverts between regions: 59% of US leaders want a hybrid approach, but 56% of Australian leaders want full customization. UK leaders split nearly evenly between the two. Leaders are spending on the foundation first, and they want platforms that adapt to their environment rather than dictate it.

"If I could wave a wand, the one AI capability would be perfect, real time root cause analysis with safe auto remediation. I feel currently the data quality is not good enough, current integration challenges, and trust and risk of automation errors are currently preventing it from happening."

— Director of Engineering, 2,500-4,999 employees

"The biggest 'ideal' capability is fully autonomous incident resolution, where AI detects, diagnoses, fixes, and verifies issues without human help for routine cases. What prevents it today: incomplete system visibility (logs/metrics are noisy or inconsistent), complex root causes…"

— Director of SRE, 1,000-2,499 employees
The implication

The foundation, the AI, and the platform.

The pattern across all 121 leaders points the same direction. AI is delivering measurable value where it runs, but autonomy is gated by trust, and the data layer is the constraint that decides how far AI can extend. Investment is flowing toward fixing the foundation, and teams want platforms that adapt to their environment rather than dictate it.

Datadog is built for the trust threshold leaders described. The platform runs on a unified telemetry foundation, the prerequisite for AI to perform consistently rather than unpredictably, addressing the 44% of leaders who name consistent and predictable performance as a top trust prerequisite. AI built natively on that foundation produces auditable outputs grounded in the same data engineers already see, addressing the 51% who name explainable, auditable decision-making. By running AI on top of an observability platform already deployed across thousands of production environments, Datadog inherits the operating track record that 62% of leaders say is the top unlock for trusting AI more broadly.

When the foundation is solid, AI becomes safer to trust, easier to scale, and cheaper to run. That is the bet 87% of leaders are making with their next 12 to 18 months of investment.

See how Datadog AI works →
Methodology

How this research was conducted.

This research was conducted through a conversational AI-led survey designed to capture both structured responses and substantive open-ended commentary from senior engineering leaders.

121
Senior engineering leaders
500+
Employee company minimum
3
Countries (US, UK, Australia)
7 days
Field period (Apr 30 – May 6, 2026)

Respondent Roles

Region

Notes on calculation. Percentages are calculated against valid responses for each question. A response is valid if it matches an option by text, number ("3"), or letter ("Option C"). Multi-select chart totals exceed 100% by design. Pie and donut values use largest-remainder rounding to sum to exactly 100%. Open-ended responses were coded into themes by keyword matching, each respondent counted once per theme. The Q21 + Q22 themes chart uses n = 101 substantive responses (20 left both fields blank); 93 of the 101 matched at least one theme. The Q22 blockers chart uses n = 100 substantive Q22 responses. Quotes are lightly edited for capitalization and punctuation; substantive content is preserved as written. Attribution by role and company size only.
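The largest-remainder rounding mentioned above can be sketched in a few lines. This is an illustrative Python sketch under a stated assumption (the raw percentages already sum to the target total), not the report's actual tooling; the function name is hypothetical.

```python
import math

def largest_remainder_round(values, total=100):
    """Round percentages to integers that sum exactly to `total`.

    Assumes the raw values already sum to `total` (as pie/donut
    shares do). Hypothetical helper, not Datadog's implementation.
    """
    floors = [math.floor(v) for v in values]
    shortfall = total - sum(floors)
    # Hand the leftover points to the entries with the largest
    # fractional remainders, one point each.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] - floors[i],
                   reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return floors

# Naive per-value rounding of these shares would give 29 + 29 + 43 = 101.
print(largest_remainder_round([28.6, 28.6, 42.8]))  # -> [29, 28, 43]
```

The payoff is the guarantee the methodology states: displayed slices always sum to exactly 100, at the cost of occasionally rounding one value down where naive rounding would round it up.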