From chatbot interactions to operational agents - what enterprise deployments reveal about AI readiness today
- Summary:
- Databricks’ latest research shows rapid growth in agentic AI deployments, but enterprise progress remains shaped by data visibility, governance, and operational oversight. EMEA CTO Dael Williamson explains what organizations are learning as AI moves into production systems.
As enterprise AI programs move beyond experimentation, organizations are confronting a more complex operational reality. According to Databricks’ latest State of AI Agents research, enterprises are shifting from chatbot-style interactions toward agentic systems – AI systems that can plan and execute tasks with limited human input – capable of orchestrating workflows, evaluating outcomes, and acting inside production environments.
But adoption data and practitioner experience suggest that foundational challenges, particularly around data visibility, governance, and operational oversight, continue to shape what organizations can achieve. Drawing on platform telemetry from more than 20,000 organizations and his experience working with customers across industries, Dael Williamson, EMEA CTO at Databricks, describes how the emerging agent ecosystem is exposing structural gaps in data management, monitoring practices, and organizational capability. The findings suggest that enterprise AI adoption is becoming less a question of model capability and more a question of organizational capability – particularly the ability to monitor, govern, and interpret autonomous systems in operation.
AI adoption remains constrained by data visibility
Despite years of investment in data infrastructure, many organizations still lack a clear inventory of their data assets. In discussion, Williamson argues that this remains the primary barrier to scaling AI initiatives.
The bulk of enterprises are still figuring out their data estate. Large multinationals, large enterprises, still don’t know what data they have. They have a list of financial assets, a list of physical assets. They even know what humans, for the most part, work for them. But they haven’t done that stock take.
This lack of visibility limits how organizations deploy AI systems and what information they can reliably use for automation or decision-making. The issue is compounded by narrow definitions of data, which often exclude operational processes, system interactions, code, and other forms of organizational knowledge.
Williamson describes a broader view of enterprise data that includes everything from system telemetry and tracing – operational data that records how systems behave over time – to process interactions. These operational signals can reveal how work actually happens inside organizations, which rarely follows formal process definitions. A “six-stage process,” he notes, typically involves “about a thousand interactions,” reflecting the complexity of real-world workflows.
This expanding definition of data is increasingly relevant as organizations deploy AI agents inside business processes, where incomplete context can quickly affect performance and trust.
From chatbots to agents – and new operational demands
The Databricks report highlights growing adoption of agentic systems capable of performing specific tasks autonomously, coordinating with other agents, and operating within enterprise workflows. Unlike chat-based interfaces, these systems require persistent memory, sequencing capabilities, and ongoing evaluation.
Williamson describes the shift using an orchestration metaphor rather than the commonly used concept of “swarming” (an industry term for multiple autonomous agents coordinating tasks together).
I tend to think of it more like an orchestra and having a conductor. When an orchestra is not in symphony, you can hear it. When an orchestra is playing well, you can see how that creates meaning.
Breaking work into smaller tasks allows organizations to monitor and optimize agent behavior more effectively. Databricks’ research also points to the emergence of supervisory and evaluation agents that assess the performance of other systems, tracking behavior over time and identifying opportunities for improvement.
Such systems introduce new infrastructure requirements. Williamson describes the need for environments designed specifically for agent workloads, where systems create temporary databases, run simulations, and evaluate multiple possible outcomes before taking action. In these environments, a large proportion of system activity may be generated by machines rather than humans. He elaborates:
80% of the users are non-human. They create lots of databases and then they kill them… They can run a bunch of simulations and decide which one’s the best path.
This operational model places new emphasis on monitoring and oversight capabilities, requiring organizations to track system behavior and intervene when outcomes deviate from expectations.
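The pattern Williamson describes – ephemeral state, multiple simulated runs, then selection of the best path – can be illustrated with a minimal Python sketch. The candidate plans and the scoring logic here are entirely hypothetical stand-ins; a real agent would execute a plan’s steps and evaluate the outcome against a business metric.

```python
import sqlite3
import random

def simulate(plan: str, seed: int) -> float:
    """Run one candidate plan against a temporary, throwaway database.

    The scoring logic is a stand-in: random values averaged into a score.
    """
    db = sqlite3.connect(":memory:")  # ephemeral store, discarded afterwards
    db.execute("CREATE TABLE outcomes (step INTEGER, value REAL)")
    rng = random.Random(seed)
    for step in range(10):
        db.execute("INSERT INTO outcomes VALUES (?, ?)", (step, rng.random()))
    (score,) = db.execute("SELECT AVG(value) FROM outcomes").fetchone()
    db.close()  # "they create lots of databases and then they kill them"
    return score

def best_plan(plans: list[str]) -> str:
    """Simulate every candidate path and return the highest-scoring one."""
    return max(plans, key=lambda p: simulate(p, seed=hash(p) % 2**32))

print(best_plan(["route-a", "route-b", "route-c"]))
```

The point of the sketch is the shape of the workload, not the logic: most of the activity – database creation, simulation, teardown – is machine-generated, which is why monitoring such environments differs from monitoring human-driven systems.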
Monitoring probabilistic systems
As AI systems become more autonomous, traditional approaches to monitoring software performance are no longer sufficient. Conventional systems operate deterministically, producing predictable outputs from defined inputs. Agentic systems behave probabilistically – producing variable outcomes rather than fixed results – and continuously evolve.
This shift introduces a practical operational challenge. Once AI systems act autonomously inside production workflows, organizations must decide who is responsible for monitoring their behavior, identifying drift, and intervening when outcomes deviate from expectations. Traditional software governance models – built around deterministic systems and periodic oversight – offer limited guidance for this continuous supervisory role. As a result, enterprises are beginning to confront a new capability gap: managing AI behavior in operation rather than simply deploying AI systems.
Williamson compares the required monitoring approach to air traffic control, where human operators track system trajectories and intervene when necessary:
We’re starting to learn that we need to be really good at observability, monitoring, that flight traffic control type of analogy. We don’t really have that skill today.
The shift requires organizations to rethink telemetry and tracing technologies that were originally developed for software systems. While these tools provide forensic insight into system behavior, they must now be adapted to handle probabilistic decision-making processes. The growing importance of telemetry also reflects a broader push toward transparency. Williamson emphasizes the need for “glass box” visibility into AI behavior, observing that opaque monitoring approaches introduce liability and undermine trust:
The black box for telemetry will not work. We need to figure out how these behaviors change in a transparent and glass box way.
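One concrete reading of “glass box” telemetry is that every agent decision is recorded as a structured trace event – inputs as well as outcomes – so that behavior can be audited after the fact. A minimal sketch; the event fields are illustrative, not a Databricks schema:

```python
import json
import time
import uuid

def record_step(trace: list, agent: str, action: str,
                inputs: dict, outcome: str) -> None:
    """Append one structured, human-readable event to an agent trace.

    Capturing inputs alongside the outcome is what makes later review
    possible: an auditor can see why a step happened, not just that it did.
    """
    trace.append({
        "span_id": uuid.uuid4().hex,
        "timestamp": time.time(),
        "agent": agent,
        "action": action,
        "inputs": inputs,      # what the agent saw
        "outcome": outcome,    # what it decided
    })

trace = []
record_step(trace, "planner", "choose_route", {"candidates": 3}, "route-b")
record_step(trace, "evaluator", "score_route", {"route": "route-b"}, "accepted")
print(json.dumps(trace, indent=2))  # the full decision path is inspectable
```

This is essentially the span model used in software observability, extended so that the trace records a decision process rather than just a call stack.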
Governance expands beyond compliance
The research also identifies governance and evaluation frameworks as key factors in moving AI systems into production. According to the company’s findings, organizations that implement governance and evaluation mechanisms are significantly more likely to operationalize AI systems successfully. While the research identifies a strong association between governance practices and production outcomes, that association may also reflect broader organizational maturity – suggesting that governance may function as an indicator of operational readiness as much as a driver of it.
Williamson notes that the term “governance” itself remains ambiguous across organizations.
The meaning of governance is a tricky one. We don’t unilaterally have the same meaning.
In practice, governance extends beyond administrative oversight to include data management, model usage controls, evaluation testing, and system monitoring. It may also encompass both structured and unstructured data, as well as the features and inputs used by machine learning models. This shifts governance from a compliance function toward an operational discipline focused on ensuring systems behave as intended.
This has become what my colleague Jon Reed would call a "pesky question" in my conversations with vendors. The same ambiguity applies to related concepts such as guardrails, observability, and context – terms widely used across the AI industry but interpreted differently by different stakeholders. Williamson notes:
I think a lesson in linguistics will be very good for the market.
Organizational change and challenging assumptions
Beyond technical infrastructure, Williamson emphasizes the importance of organizational capability. Deploying agentic systems requires new skills in monitoring, evaluation, and operational oversight. At the same time, organizations must modernize existing systems and reconsider how humans interact with automated processes. He remarks:
Upgrading how humans work… people are starting to accept that’s way more of the path forward.
This shift requires redefining roles, developing new forms of operational expertise, and integrating AI into existing workflows without compromising accountability or transparency.
Williamson also suggests that organizations may need to rethink long-standing assumptions about data management priorities. Rather than focusing primarily on data quality, he argues that semantic understanding – shared meaning and context across data and systems – may prove more critical.
Data quality is not as important a problem to solve as the semantic understanding. If we start from the point of clear understanding, quality improves.
Techniques that convert unstructured information into structured insights may help organizations address long-standing data challenges by improving contextual understanding and aligning organizational language. This shift could enable more effective use of enterprise data while reducing fragmentation across systems and teams.
My take
A consistent theme in Williamson’s perspective is the growing importance of operational discipline in enterprise AI adoption. Agentic systems introduce a new class of production workload – one that requires continuous monitoring, evaluation, and oversight rather than one-time deployment.
The practical implication is organizational. AI governance is evolving from a compliance function into an operational capability responsible for observing system behavior, managing risk, and maintaining trust over time. Many enterprises have not yet developed this capability.
The conversation points to a shift in priorities for enterprise AI. Monitoring, evaluation, and semantic clarity are becoming foundational capabilities rather than supporting functions. The immediate challenge for many organizations is operational readiness – establishing the capability needed to oversee AI systems once they are running in production. These conditions will shape whether enterprises can make meaningful progress in managing their data estates.