A new study of 100 senior security and infrastructure leaders finds that AI has crossed firmly into production, but the data underneath isn't where it needs to be. Investigation, not detection, is now the bottleneck. Internal AI risk has caught up to external attackers as a top concern. And nearly half of teams are still running security and engineering on completely separate stacks.
Across 100 senior security and infrastructure leaders at companies of 2,500+ employees, AI has crossed firmly into production. Forty-eight percent are actively deploying it in their security operations today, and 83% are using or planning to use agentic AI in some form. But the same teams describe a structural problem holding back what comes next. Forty-two percent say their governance is lagging behind how fast their teams want to move. Sixty-three percent say their incident response breaks down at getting to the right data, not at making the call. Forty-seven percent still run security and engineering on completely separate technology stacks. And internal AI risk now keeps these leaders up at night exactly as much as external attackers (38% vs 38%, with another 21% saying both equally). The pattern across every question is consistent: AI is in place, but the underlying data isn't where it needs to be.
The fundamentals have shifted. When asked where the most time goes during incident response, only 3% of teams pointed at detection. Thirty-four percent named investigation alone, another 21% named investigation plus action, and 5% named detection plus investigation. Add the 15% who said all stages, and 75 of 100 respondents name investigation as a critical chokepoint.
The reason is a data problem, not a lack of AI tooling. When asked where their incident response breaks down most often, 43% named getting to the right data as the single biggest friction point. Another 20% said both data and people. Only 4% said making the actual decision was the hard part. By a wide margin, the bottleneck is access, not judgment.
That data friction has a measurable cost. Of the 70 respondents who described some level of data fragmentation in their environment, 70% said it slows their team by hours during incidents. Another 11% said hours to days. Only 6% kept the slowdown at minutes.
AI has proven itself at the alert stage, where pattern recognition is the job. The next unlock is giving AI the same data context a human investigator would build by hand: logs, traces, metrics, and security signals in one place. A smarter model alone won't get teams there.
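To make that idea concrete, here is a minimal sketch of what "the same data context a human investigator would build by hand" could look like in code: one chronological timeline merged from logs, traces, metrics, and security signals. Every type, field, and fetcher name below is an illustrative assumption, not a description of any particular product's API.

```python
# A minimal sketch of the "unified context" idea, not any vendor's implementation.
# All data sources, field names, and the fetcher callables are hypothetical.
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class Event:
    timestamp: datetime
    source: str      # "logs" | "traces" | "metrics" | "security_signals"
    entity: str      # host, service, or user the event is attached to
    summary: str     # one-line human- or agent-readable description


def build_investigation_context(
    incident_entities: set[str],
    window_start: datetime,
    window_end: datetime,
    fetchers: dict[str, Callable[[set[str], datetime, datetime], list[Event]]],
) -> str:
    """Pull events from every telemetry source for the same entities and time
    window, then merge them into a single chronological narrative -- the context
    a human investigator would otherwise assemble by hand across tools."""
    events: list[Event] = []
    for source, fetch in fetchers.items():
        events.extend(fetch(incident_entities, window_start, window_end))

    events.sort(key=lambda e: e.timestamp)

    # The merged timeline is what gets handed to an AI agent (or a human)
    # instead of four separate consoles.
    return "\n".join(
        f"[{e.timestamp.isoformat()}] ({e.source}) {e.entity}: {e.summary}"
        for e in events
    )
```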
Forty-two percent of respondents describe their AI governance as lagging behind the pace their teams want to deploy. Eighteen percent say their governance is solidly keeping pace. Eleven percent describe it as moderate or mixed. Another 28% gave answers too qualified to bucket cleanly, but the dominant signal across the open responses is that policy is moving slower than practice.
That gap shows up in what teams worry about. When asked which side of the AI risk equation keeps them up at night more, 38% picked internal risk (shadow AI, data leakage, AI outputs they can't fully trust) and 38% picked external (attackers using AI to move faster). Another 21% said both equally. The two have functionally converged.
The substance of the internal worries is consistent across the open responses. Unauthorized models being used by employees. Sensitive data flowing into public models. AI-generated outputs that look authoritative and aren't. These are the risks created by an organization's own AI adoption, and they're now arriving faster than the controls being built to manage them.
External attackers have always evolved faster than defenses. The new wrinkle is that an organization's own AI deployments now generate risk at the same speed. Without unified visibility into what AI agents are doing on both sides of the perimeter, security leaders are betting that their governance gap stays small.
Among the 87 respondents who gave a numeric tool count, the median team works across 9 security and observability tools. Thirty-three percent run between 10 and 14. Eleven percent run 15 or more. The "one big platform" world isn't where most enterprises live.
The split between security and engineering tooling makes the fragmentation worse. Forty-seven percent of teams say their security tools and their DevOps or engineering tools are on completely separate stacks. Another 5% describe partial integration. Only 29% are on the same platform.
That fragmentation has a quantifiable bite. Among the 85 respondents who gave a numeric estimate, the median answer to "what share of security incidents get missed or delayed because of gaps between security tools and observability data" was 15%. The mean was 16%. About 18% of these respondents put the number at one in four incidents or worse.
Tool sprawl is more than a cost or efficiency problem. At its core, it's a measurement problem. Once 15% of incidents are slipping through the cracks between systems, the question stops being how to add AI on top of this stack and becomes what data the AI never got to see in the first place.
Twenty-five percent of respondents are already using agentic AI in their security work. Fifty-five percent are actively exploring it. Three percent aren't using it yet but plan to evaluate. Only 3% have ruled it out for now. Agentic security is a near-universal item on the roadmap.
What teams want before granting agents real autonomy is consistent. Asked for the single most important thing that has to be in place, 28% named human-in-the-loop approval as the top guardrail, 16% named audit trails and visibility into what the agent is doing, 12% named governance and policy, and 8% named data quality and context.
When asked specifically whether unified visibility (across both security and observability data, with full audit trails and telemetry) would change their comfort level with AI autonomy, 39% said yes outright. Another 30% said it plays a meaningful role alongside other safeguards. Only 18% said comfort comes purely from limiting what the agent is allowed to do.
Teams ready to grant agentic AI real autonomy still want guardrails. What's specific about this moment is that they want guardrails they can verify in real time. Audit trails, telemetry, and unified visibility across both security and observability data are the prerequisites that make "human in the loop" workable at machine speed.
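As a sketch of what those verifiable guardrails look like in practice, the snippet below gates any irreversible agent action behind a human approval callback and appends every decision, approved or blocked, to an audit trail. It illustrates the pattern respondents describe; the action shape, the reversibility flag, and the approval callback are hypothetical names, not any product's API.

```python
# A minimal sketch of human-in-the-loop approval plus an audit trail for
# agent-proposed actions. All names here are illustrative assumptions.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Callable


@dataclass
class AgentAction:
    name: str         # e.g. "isolate_host"
    target: str       # e.g. "web-prod-14"
    rationale: str    # the agent's stated reason, kept for later review
    reversible: bool  # irreversible actions always require approval


def run_with_guardrails(
    action: AgentAction,
    execute: Callable[[AgentAction], str],
    request_human_approval: Callable[[AgentAction], bool],
    audit_log: list[dict],
) -> str:
    """Execute an agent-proposed action, pausing for a human on anything
    irreversible, and record every decision in an append-only audit trail."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": asdict(action),
    }
    if not action.reversible:
        approved = request_human_approval(action)
        entry["human_approved"] = approved
        if not approved:
            entry["outcome"] = "blocked_by_reviewer"
            audit_log.append(entry)
            return "blocked"
    entry["outcome"] = execute(action)
    audit_log.append(entry)
    return entry["outcome"]


# Example: an auto-approving stub stands in for a real review workflow.
if __name__ == "__main__":
    log: list[dict] = []
    action = AgentAction("isolate_host", "web-prod-14",
                         "Beaconing to a known C2 domain", reversible=False)
    run_with_guardrails(action, lambda a: "done", lambda a: True, log)
    print(json.dumps(log, indent=2))
```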
Asked to describe an organization that's nailing AI security at scale, respondents converged on a small number of attributes. Unified data across security and observability. AI taking on routine investigation and triage. Hard guardrails on irreversible actions. Strong governance built in early rather than added later. Fast, coordinated response. The picture they painted is consistent across roles, industries, and regions.
Datadog unifies security and observability on the same data, the same telemetry, and the same audit trail. With logs, traces, metrics, and security signals in one place, AI agents reason over complete context, human reviewers can see exactly what those agents did and why, and the gap between security and engineering shrinks from a stack-level problem to a routine workflow.
For the leaders surveyed here, that's not a future-state ambition. It's the prerequisite for the next 12 months.
See Datadog's unified security & observability platform →

Source. A panel-recruited online conversational survey of 100 senior security and infrastructure leaders at companies of 2,500+ employees.
Screening. The starting sample was 744 raw sessions. After fraud screening (gibberish responses, templated cross-session duplicates, off-topic answers, low-effort response patterns), 84 sessions were removed as fraud and 560 were removed as legitimate non-qualifiers (wrong role or sub-2,500 employee organization). Five sessions slightly below the 2,500-employee threshold were retained because they otherwise met qualification criteria and showed substantive engagement; sensitivity-checking the headline findings excluding these five does not change any conclusion.
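For readers who want to see the shape of the sensitivity check, the sketch below recomputes each headline share with and without the borderline sessions so the two can be compared. The respondent records, session IDs, and metric definitions are placeholders, not the actual analysis code.

```python
# A rough sketch of the sensitivity check described above; all identifiers,
# record fields, and metric definitions are hypothetical placeholders.
from typing import Callable


def pct(respondents: list[dict], predicate: Callable[[dict], bool]) -> float:
    """Share of respondents matching a condition, as a percentage of the base."""
    if not respondents:
        return 0.0
    return 100.0 * sum(1 for r in respondents if predicate(r)) / len(respondents)


def sensitivity_check(
    respondents: list[dict],
    borderline_ids: set[str],
    metrics: dict[str, Callable[[dict], bool]],
) -> dict[str, dict[str, float]]:
    """Recompute each headline metric on the full base and on the base that
    excludes the borderline sessions, so the two can be compared side by side."""
    reduced = [r for r in respondents if r["id"] not in borderline_ids]
    return {
        name: {
            "full_base": round(pct(respondents, pred), 1),
            "excluding_borderline": round(pct(reduced, pred), 1),
        }
        for name, pred in metrics.items()
    }


# Hypothetical usage: check whether a headline figure moves when the five
# borderline sessions are excluded.
# sensitivity_check(respondents, {"s017", "s102", "s388", "s501", "s623"},
#                   {"actively_deploying_ai": lambda r: r["ai_status"] == "deploying"})
```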
Bases & rounding. Percentages are calculated against all 100 respondents unless otherwise noted. Three sub-questions were branched based on a prior answer and have smaller bases: the fragmentation slowdown question (n=70), the tool count question (n=87), and the missed-incidents estimate (n=85). Bases are stated explicitly in chart subtitles and prose. Where rounding produces totals slightly off 100%, largest-remainder rounding has been applied within each chart.
Coding. The conversational format permitted free-text responses to every question. Open-ended answers were coded into mutually exclusive categories using keyword-based classification with manual review of borderline cases. Responses too vague or off-topic to fit a category are reported as "Other / unclear" rather than being forced into the nearest bucket.
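The coding step follows a familiar pattern: match each free-text answer against per-category keyword lists, auto-assign the unambiguous ones, and route multi-match or no-match answers to a human. The category names and keyword lists below are illustrative stand-ins for the actual codebook.

```python
# A minimal sketch of keyword-based coding with manual review of borderline
# cases. Category names and keyword lists are illustrative, not the real codebook.
CATEGORIES: dict[str, list[str]] = {
    "human_in_the_loop": ["human in the loop", "approval", "sign-off", "review before"],
    "audit_visibility": ["audit", "visibility", "trail"],
    "governance_policy": ["governance", "policy", "compliance"],
    "data_quality_context": ["data quality", "context", "clean data"],
}


def code_response(text: str) -> str:
    """Assign a free-text answer to the single category whose keywords match;
    ambiguous answers go to a human coder, and non-matches stay 'Other / unclear'."""
    lowered = text.lower()
    matched = [cat for cat, keywords in CATEGORIES.items()
               if any(k in lowered for k in keywords)]
    if len(matched) == 1:
        return matched[0]
    if len(matched) > 1:
        return "needs_manual_review"   # borderline case: route to manual review
    return "Other / unclear"


print(code_response("We'd need approval from an analyst before anything runs."))
# -> human_in_the_loop
```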