Box's latest agent co-ordinates multiple capabilities focused on enterprise content

Phil Wainewright

April 9, 2026

Summary: The new Box Agent co-ordinates multiple AI capabilities and can be customized to perform complex or long-running enterprise content processes.

Screenshot of Box Agent performing sales analysis

Last week, enterprise content management platform Box unveiled its latest AI agent offering. Unlike earlier agents, which were dedicated to specific tasks, the new Box Agent is a multi-purpose assistant that can orchestrate various capabilities to perform a range of different tasks, including complex, multi-step processes. Yash Bhavnani, Head of AI at Box, tells us:

We've brought all these capabilities together so it becomes a lot more intuitive. You don't have to decide if it's A agent or B agent or C agent. It just does the work for you.

Customers can also create custom instances of the agent using Box's AI Studio to perform more complex tasks such as employee onboarding. These custom agents typically draw on curated source information that's been gathered into a virtual content store called a Box Hub. She explains:

The Box Agent is really good for your day-to-day work, for things like asking a quick question from your data, getting the right files, creating and generating documents, doing some analysis. But sometimes you need something a bit more purpose-built, and this is exactly why we built Box AI Studio, that can use Hubs.

Once an agent has been created for a task such as new employee onboarding, it can be kept up-to-date with current policies as they evolve by simply adding to or changing the content held in the Box Hub. She elaborates:

With Box AI Studio, what I could simply do is make an agent that says, 'You are the onboarding agent for my 10,000-person company. Here is a Box Hub that you're going to use as reference knowledge.' Now, as an HR person, you could continuously keep adding documents to that hub. You don't need to touch the agent. The agent will just be able to use that...

Now, when a new employee comes in, I say 'Hi, select this agent, which is your onboarding agent, and you can ask it questions — like, how do I sign up for 401K, when's the next Box holiday?' all of those things — and it would come from the correct, legitimate source. So I think for very specific purpose-built areas such as onboarding, new hiring, brand marketing, it really does make sense to have a collection of content that the agent refers to.
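
The separation described here, where the agent's instructions stay fixed while the hub's content changes, can be illustrated with a minimal sketch. It is not Box's API: the `Hub` and `Agent` classes and the keyword-matching `lookup` method are hypothetical stand-ins, and a real system would presumably use far more capable retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class Hub:
    """Hypothetical stand-in for a curated content collection (not Box's API)."""
    documents: dict[str, str] = field(default_factory=dict)

    def add(self, title: str, text: str) -> None:
        self.documents[title] = text


@dataclass
class Agent:
    """Fixed instructions plus a reference to a hub; content is never copied in."""
    instructions: str
    hub: Hub

    @staticmethod
    def _tokens(text: str) -> set[str]:
        return {w.strip("?.,!").lower() for w in text.split()}

    def lookup(self, question: str) -> list[str]:
        # Naive keyword overlap, standing in for real retrieval (an assumption).
        words = {w for w in self._tokens(question) if len(w) > 3}
        return sorted(
            title
            for title, text in self.hub.documents.items()
            if words & self._tokens(text)
        )


hub = Hub()
agent = Agent("You are the onboarding agent.", hub)

hub.add("Benefits guide", "How to sign up for the 401K retirement plan.")
print(agent.lookup("How do I sign up for 401K?"))

# HR adds a document later; the agent definition is not touched.
hub.add("Holiday calendar", "Dates of each company holiday this year.")
print(agent.lookup("When is the next holiday?"))
```

Because the agent holds a reference to the hub rather than a copy of its documents, the second question is answered from a document that did not exist when the agent was created.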

Box Agent is able to harness a range of foundation models from different providers including OpenAI, Anthropic, Google Gemini and Grok, selecting the appropriate model for each specific use case. It adds value to those models by drawing on all of Box's know-how about enterprise content and the structure, permissions and context that surround it. She comments:

Bringing together this content and the structure and understanding, I think, is key to making that Box agent all-round capable and useful...

We really add on this structure and context layer that Box understands, which is things like, what's the file, what's the folder system? Are you using the right folder? What's the version? When I say, 'Find the things that are important for me,' does it understand who 'me' is? Does it understand the files that are most recent to my organization?
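
As a rough illustration of what such a structure and context layer might do before any model sees a file, the sketch below filters candidate files by who is asking, keeps only the latest version of each, and orders them by recency. Everything in it is an assumption made for illustration: the `FileVersion` record, the `context_for` function and the sample paths are hypothetical and do not describe how Box implements this.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical file record: path, version number, date and permitted readers."""
    path: str
    version: int
    modified: date
    readers: frozenset[str]


def context_for(user: str, files: list[FileVersion], limit: int = 3) -> list[FileVersion]:
    """Return the latest readable version of each file, newest first."""
    latest: dict[str, FileVersion] = {}
    for f in files:
        if user not in f.readers:
            continue  # permissions are applied before anything reaches a model
        if f.path not in latest or f.version > latest[f.path].version:
            latest[f.path] = f
    return sorted(latest.values(), key=lambda f: f.modified, reverse=True)[:limit]


files = [
    FileVersion("/sales/q1-plan.docx", 1, date(2026, 1, 10), frozenset({"ana", "raj"})),
    FileVersion("/sales/q1-plan.docx", 2, date(2026, 3, 2), frozenset({"ana", "raj"})),
    FileVersion("/hr/salaries.xlsx", 4, date(2026, 4, 1), frozenset({"raj"})),
    FileVersion("/sales/pipeline.xlsx", 1, date(2026, 3, 20), frozenset({"ana"})),
]

for f in context_for("ana", files):
    print(f.path, f.version)
```

For the user "ana", the result lists the pipeline spreadsheet first, then version 2 of the plan, and never includes the salaries file, which she has no permission to read.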

Growing volume of unstructured data

Box makes the point that the majority of enterprise knowledge is stored and captured in unstructured data — content such as contracts, proposals, product specs, knowledge bases, chat histories and recordings of meetings and calls — rather than the structured data of transactional enterprise applications. The advent of generative AI, with its ability to create new content and summaries, will only add to the volume of unstructured data sources. She says:

Now you're using AI to generate even more content. Whether that's a session [where] you've asked AI some great questions, you've got some insights, that's content now. Whether you've generated a PowerPoint now, you're just compounding this problem of more unstructured data. And it's going to happen very fast, because, as you know, AI writes a bit faster than all of us put together.

So I think it's essential now, that you have to have a really good content AI and document structure in place, that is secure at the bottom, to be able to make the best of all the content that you're generating and all the content you [already] had.

This isn't about generating AI 'slop' — this is content that's a useful addition to the enterprise content store. She continues:

The usage pattern changes from 'Hey, generate me a blog post that I can talk about,' which is probably content that's not as significant for your enterprise, because it's purely generated by AI, to some of the work that I do with the Box agent, which is like, 'Okay, write me a two-page summary of all the interviews that we did for our beta clients, and what were the main things that they extracted?' Now that's content that I actually want to add back into the file system, and that is useful for the next person...

You're going to find what I call content that's co-created with the AI that, again, takes into account your file structure, takes into account your thoughts, and helps to bring out a bit more of that in a nice, concise way that you do want to add back to your file system, because it's helpful.

Customers who have had early access to Box Agent have found it particularly useful in two distinct scenarios, she says:

You want to think of two categories. One is, what's the problem that you're wasting time on, the burdensome problem? That's what you should point your AI agents at, because that would free up your experts to do the things that really mean more to your business.

The second one is, what's the hard problem you couldn't figure out? That's where you should point your AI, because then you can actually do those new things that you couldn't have done before.

For example, one customer in financial services has been using the agent to review documents that financial analysts were previously researching manually. Now the agent shortcuts the process of identifying new information that analysts might want to discuss with their clients, enabling more frequent face-to-face conversations, improving client satisfaction, and potentially increasing the number of clients each analyst can handle. Another agent analyzes trends across the client base and identifies opportunities for new products — something that wasn't previously possible because there wasn't the resource available to carry out this type of big-picture research.

Other examples from Box of work the agent can do include drafting responses to RFPs based on retrieval and analysis of product documentation, messaging and compliance guides, reviewing vendor contracts against company policy, or producing a report on top product feature requests based on analysis of customer service interactions and feedback documents.

Some of these more complex tasks require additional capacity that is available to customers with an Enterprise Advanced license. The agent's 'Pro Mode' uses a wider selection of foundation models to enable a higher level of planning, execution and refinement. Work that requires more extensive query capacity, such as tasks that involve high-volume repetition or which include mission-critical, long-running processes, may also require 'Expanded Mode', which increases the agent's context window to four million tokens.

My take

We seem to be moving on from the era of task-specific, role-based AI agents to more general-purpose assistants that don't require users to remember which agent to use when they want something done. Box Agent is an example of this new breed of multi-purpose agents, able to autonomously select the appropriate resources to produce the desired result. But let's not get ahead of ourselves. There are still plenty of tasks where the agent needs additional guidance, and where customers will need to custom-build more specialized agents for those complex processes within each enterprise that are done a particular way, and which draw on specific knowledge. Here, the Box Agent serves as a foundation that can be tailored to those specialized processes.

In both cases, Box's existing know-how about enterprise content and the processes around it forms an important part of the context the agent draws on to ensure its results are pertinent and reliable. This is an example of the Systems of Knowledge that enterprise application vendors have built up over many years of serving the market and is an important defense against emerging AI-native competitors. I would argue that it's also a particular advantage for vendors in the collaboration and teamwork space, like Box, as opposed to those vendors that have historically focused more on transactional systems. This is because their know-how is mostly focused on unstructured data, which generative AI is particularly good at interpreting, when given the right context and guardrails. There's immense value to be realized from increased knowledge velocity, both in speeding up the extraction of recognized knowledge, and also in accessing and analyzing as-yet unrecognized sources of knowledge or processes that yield new value.

It will be interesting therefore to see what customers start doing with the latest iteration of Box's agent technology, and how their experience will shape the continued evolution of its capabilities.