What should enterprises build with agentic AI?
- Summary:
- After a whirlwind few weeks of hearing about agentic AI, I have been wondering what automations enterprises will end up building with these generative AI-powered intelligent agents.
Just a few weeks ago, almost no one had heard of agentic AI. Now we can't go a minute without some new claim from a tech vendor featuring the term. This is the new, new thing in AI, superseding the era of AI co-pilots, the last new, new thing, which is now looked down upon as a simple beast of burden that merely fetches and carries information at the behest of its master. Agentic AI — which to be honest is just a jargonistic way of saying 'intelligent agents' — can be given a task and go off and complete it, in exactly the same way that you might delegate work to a junior colleague.
This does sound like an advance, and tech vendors who believe they've got there first with this new class of AI-fueled agents are very excited to see their customers take them on board. But what will enterprises actually do with agentic AI that moves their organizations forward? What are the unmentioned gotchas they need to be wary of? And how soon will agentic AI be superseded in turn by the next new, new thing? These are the questions I can't get out of my head.
The past few weeks have been a whirlwind of agentic AI. This week sees Salesforce roll out general availability of Agentforce, the agent platform it unveiled last month at Dreamforce and which CEO Marc Benioff likes to contrast with Microsoft's Copilot offering, disparaging it as "Clippy 2.0" in a reference to the ill-fated Office 97 assistant. Last week, I spoke to work management vendor Asana as it launched AI Studio, its no-code platform for building AI agents, which is now on early release to enterprise customers. Meanwhile my colleagues Jon Reed and Alyx MacQueen were at UiPath's annual conference, where the vendor described agentic AI as its second act, following on from its origins in Robotic Process Automation (RPA) — which was the new, new thing quite some time ago. My own travels earlier this month took me to Atlassian's Team '24 Europe, which coincided with general availability of its Rovo AI assistant, and then to IFS Unleashed, where the enterprise applications vendor was leading on Industrial AI. What to make of it all?
My first takeaway is that all this talk of agentic AI really boils down to something quite simple once you clear away the marketing hype and its smoke and mirrors. Think of this new generation of agents as a more flexible user interface that sits on top of the existing systems and data. Earlier generations of chatbots and agents could only execute highly structured instructions in predetermined ways. But generative AI means that these new agents are far better at both extracting meaning from unstructured information such as conversational interactions or collections of documents, and at mapping possible actions to anticipated results. Whereas before you would have to carefully map every step that you wanted an agent to carry out, now you can just say to the agent, 'Follow the policies and processes written down in this set of documents, and choose the actions that will produce the desired result with a given set of data. Then check back with me for approval before going ahead.'
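To make that delegation pattern concrete, here is a minimal sketch of the loop described above: a goal and a set of policy documents go in, the agent proposes a plan of actions, and nothing executes until a human approves. Everything here is illustrative — `propose_actions` is a stub standing in for a real generative AI call, and the function names are my own, not any vendor's API.

```python
# Hypothetical sketch of the agentic delegation pattern: propose, get
# approval, then execute. The model call is stubbed so this runs as-is.

def propose_actions(goal: str, policies: list[str]) -> list[str]:
    """Stand-in for an LLM that maps a goal plus policy text to a plan."""
    # A real implementation would prompt a model with the goal and the
    # policy documents; here we return a canned plan for illustration.
    return [
        f"look up records relevant to: {goal}",
        "draft a response consistent with policy",
        "queue the response for sending",
    ]

def run_agent(goal: str, policies: list[str], approve) -> list[str]:
    """Execute the proposed plan only if the human-in-the-loop approves."""
    plan = propose_actions(goal, policies)
    if not approve(plan):  # 'check back with me for approval before going ahead'
        return []
    executed = []
    for step in plan:
        executed.append(step)  # a real agent would call back-end APIs here
    return executed

if __name__ == "__main__":
    done = run_agent(
        "handle refund request #123",
        ["refunds under $50 are auto-approved"],
        approve=lambda plan: True,
    )
    print(len(done), "steps executed")
```

The approval callback is the key structural difference from earlier chatbots: the agent plans freely, but a human (or a stricter policy check) still sits between the plan and any real-world action.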
What's the catch?
But there's a big catch. The agent's ability to fulfil that request is massively dependent on the accuracy and robustness of the data and automations in the underlying system. This is what Benioff is getting at in his criticisms of Microsoft Copilot. He believes that Salesforce has a much more robust object model than Microsoft for making sense of information and carrying out actions across its applications — we'll hear more of Microsoft's side of the argument at its Ignite conference in a few weeks' time. The likes of Asana and Atlassian make a similar argument based on their creation of proprietary work graphs that map the various entities and relationships that their applications manage. They've already done the hard graft to bring structure to all of the information in their systems, which gives their agents a head start in making sense of it. As Paige Costello, Head of AI at Asana, told me last week:
We've been building this platform for over a decade for human-coordinated work — who is doing what, by when and why... So the insight here is that actually the structure of the information, yes, is critical to a high-quality output, but is also critical to increasing the probability that the activity or the work is correct, and then increasing how detectable it is if it, for some reason, isn't.
Another way of looking at this is that the agents themselves don't need highly structured instructions and predetermined actions because the systems and automations they're working with are inherently deterministic. The agents use the probabilistic reasoning of generative AI to figure out what's intended and how to deliver it, but the underlying automations remain within predetermined guardrails. As Daniel Dines, founder and CEO of UiPath, told Jon Reed last week:
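A rough sketch of this split, assuming a hypothetical action registry of my own invention: the model's choice of action may be probabilistic, but only actions from a fixed, deterministic set can ever run, so anything outside the guardrails is rejected before it touches the underlying system.

```python
# Illustrative guardrail pattern: probabilistic choice, deterministic
# execution. The registry and function names are hypothetical, not any
# vendor's actual API.

ALLOWED_ACTIONS = {
    "update_ticket": lambda ticket_id: f"ticket {ticket_id} updated",
    "send_reply":    lambda ticket_id: f"reply sent for {ticket_id}",
}

def guarded_execute(chosen_action: str, ticket_id: str) -> str:
    """Run an action only if it belongs to the predetermined registry.

    The LLM may nominate any action string it likes; this gate ensures
    the enterprise workflow stays reliable and deterministic.
    """
    if chosen_action not in ALLOWED_ACTIONS:
        raise ValueError(f"action '{chosen_action}' is outside the guardrails")
    return ALLOWED_ACTIONS[chosen_action](ticket_id)
```

The design choice here is a whitelist rather than a blacklist: the generative model can only ever select among behaviors the platform has already implemented and tested, which is one way to read Dines's point about making Gen AI "reliable and deterministic" inside a workflow.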
You cannot predict the answer of Gen AI. It's simply impossible to predict the answer. So while we all understand that it's extremely powerful, its own nature makes it extremely difficult to use in the context of an enterprise workflow, because enterprise workflows need to be reliable, deterministic. And our job right now is actually to make it exactly as I said, reliable and deterministic and capable of being used in an enterprise workflow.
There's an inherent trade-off implicit in this arrangement, which is that in going with a given vendor's platform, an enterprise is accepting its built-in data models and processes. That may not be a big issue considering that we're largely talking about SaaS vendors here, where that trade-off was already made long ago. But it limits choices to configurations within those existing constraints.
When it comes to the underlying Large Language Models (LLMs) that these agents use, therefore, enterprises often won't have complete freedom to choose. In a briefing with analysts yesterday, Adam Evans, SVP of Product Management for Salesforce AI, said that Agentforce uses specific proprietary guardrails that, for example, monitor the internal monolog of the reasoning engine, which wouldn't be available if enterprises plugged in their own LLMs or agent frameworks. That may change in the future, but for now Salesforce wants to plow on with developing its own technology. He went on:
There's just so many cool things that we want to make sure that we're doing and we're going to be focused, I think, on us first, in terms of how we want to do it, more end-to-end... So we're thinking about it step-by-step. We're prioritizing for the things that we can control first to have the best experience we can, to get this technology out at scale. But we don't want it to be a closed system.
Is it really so impressive?
Let's not get starry-eyed about the early successes, either. Salesforce presented some Agentforce customer case studies, with educational publisher Wiley and restaurant booking service OpenTable both freeing up significant resources in their customer service teams through automated agents. But these are examples of high-volume call centers where many incoming issues will be very similar and thus highly susceptible to effective automation. Complex B2B sales are likely to prove more challenging and demand more sophisticated orchestration. Evans told me:
Most of our pilot customers so far have been focused on more high-volume scenarios... There's a reason for that. There's a lot of value there for things that are this high volume, both for us and its unserved potential for customers. So that's where our focus has been. That is, I think, where it will continue to be for some time, just as our customers drive our roadmap here.
The focus on serving customers within existing platforms also means that, even though vendors are already opening up connections out to other platforms, there are many unanswered questions about how agents will co-ordinate work in cross-functional, multi-application scenarios. Integration vendor Boomi has been talking up the need to govern and manage the enterprise agent portfolio through an agent repository it is currently developing. Another reason for centrally tracking agents across the enterprise is to avoid a wasteful proliferation of agents that each do the same thing slightly differently, leading to an accumulation of both technology debt and productivity-sapping process debt.
The other drawback to being tied to the existing automations embedded in today's applications — or even more so, to the robotic automations embedded in yesterday's applications — is that there's no re-evaluation of what we're automating here.
Now I'll grant you, this is where the low-hanging fruit lies. Atlassian provided several examples from its own experience of how generative AI can extract relevant information and present it in a digestible format where it's needed, reducing the time taken to complete internal processes. Asana cited the example of its customer Morningstar, which set up a smart workflow using AI Studio to automatically review work requests and request further information when needed, reducing the time taken to complete project reviews by two weeks on average. Interestingly, in our conversation Costello also noted that customers have found the objectivity of the AI agent useful in taking out some of the drama that people tend to inject into work requests. She told me:
An example we heard from customers was around people making requests and always assuming their requests were urgent. When they added the AI agent into their workflow to make those elections... it was able to say, 'Okay, you want this for next Thursday, and this is the type of work. Is it actually urgent and not going to get turned around in time? Or is that a perfectly fine thing, and this isn't a hair-on-fire request?'
They said that they were impressed that the prioritization was more accurate, more clear and more trustworthy than when people were doing it.
My take
There are potentially thousands of processes in most enterprises where introducing better workflows can reduce the time taken to assemble information, improve prioritization and scheduling, and select the best action to move work forward. This is not some miraculous consequence of deploying AI. It's just that AI makes it easier and more palatable to introduce long-overdue process standardization and automation.
But going back to my earlier point, doing a better job of automating existing processes isn't necessarily the best use of a new technology. As one panelist at Atlassian's recent event remarked:
We need to challenge the way we’re working today. We shouldn’t just use this AI to give an existing process artificial intelligence, without initially rethinking, is this the way we want to work tomorrow?
In the middle of the twentieth century, the way cargo was loaded onto ships was revolutionized by the introduction of internationally standardized cargo containers, massively reducing transportation costs and speeding up the supply chain. In the early years of the twenty-first century, the way that compute resources were deployed to data centers underwent a parallel revolution, replacing manual configuration of individual machines with command-line driven deployment of virtual containers. This current phase of agentic AI is the equivalent of using a better crane to randomly lower cargo into a ship's hold, or finding a faster way of racking up PCs in a cloud data center. The key to more revolutionary advances is standardization, but we are barely at the beginning of the revolution in process standardization that agentic AI — or whatever new, new thing replaces it — will ultimately produce.
In the meantime, enterprises should continue to get their data in order and build automations that deliver rapid wins today, but keep a close eye on the emergence of practical new standards that will herald much bigger advances in the future.