World Foundation Models are improving the energy industry. Applied Computing President Dan Jeavons explains how.
Summary: These types of models are essential for the next wave of physical AI that is explainable, accurate, and useful.
Innovations in World Foundation Models (WFMs) will be critical for developing AI that can act on data about the world. These models combine physics-based world models for simulation with Large Language Models (LLMs) that process written text and drawings. Early work on WFMs has tended to focus on generating impressive videos for improving driverless cars and robots.
Applied Computing has made progress on Orbital, a new WFM for the energy industry that is already showing promise in increasing efficiency, reducing costs, and optimizing maintenance schedules. It combines time-series sensor data, a physics engine, and LLMs with a decentralized data management backbone to build AI that is more accurate and explainable. This helps predict, optimize, and explain outcomes across real-time data from the field, historical data stored in ERP systems, and drawings and maintenance photos.
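To make that combination more concrete, here is a minimal, hypothetical sketch of how those data sources might be represented side by side; the class and field names are my own illustrative assumptions, not Orbital's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical, simplified records for the data sources described above.
# Class and field names are illustrative assumptions, not Orbital's schema.

@dataclass
class SensorReading:
    tag: str            # e.g. "compressor_01.discharge_pressure"
    timestamp: float    # epoch seconds from the real-time field feed
    value: float

@dataclass
class MaintenanceRecord:
    work_order_id: str  # historical record pulled from an ERP system
    equipment_id: str
    description: str

@dataclass
class AssetDocument:
    equipment_id: str
    kind: str           # "drawing" or "maintenance photo"
    uri: str            # location in an engineering data warehouse

@dataclass
class AssetContext:
    """Unified view of one piece of equipment for a WFM to reason over."""
    equipment_id: str
    readings: list[SensorReading] = field(default_factory=list)
    maintenance: list[MaintenanceRecord] = field(default_factory=list)
    documents: list[AssetDocument] = field(default_factory=list)
```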
I recently spoke with Applied Computing President Dan Jeavons about the firm's approach and progress. Jeavons was previously VP of Computational Science at Shell, where he saw firsthand some of the challenges in building better AI models in the energy industry. Jeavons clarifies that this WFM is a bit more focused than general intelligence:
Our focus is on the energy sector, so we're being very specific in terms of what we're going after. So when you say a world model, you have to be careful, because it sounds a lot like Artificial General Intelligence, but for us, it's very much around a zero-shot model that can answer questions accurately regarding energy operations in the first instance. You can think of it as an expert system that can advise on a variety of problems, optimize energy systems, identify anomalies, and explain behaviors in near-real-time or real-time.
Need for accuracy and explainability
For decades, the energy industry has been at the forefront of building advanced models for finding resources and operationalizing them. Yet Jeavons sees an opportunity to combine traditional modeling approaches with LLMs to close the explainability-accuracy gap:
One of the questions that I've always asked myself, which is fundamentally why I ended up at Applied, is, ‘Why do we have to choose between them?’ Between explainability with less accuracy, and high accuracy with less explainability? Surely, we can bring these two modeling approaches together, and that's exactly what we're trying to do with Orbital. So, if you look at what Orbital is trying to do, it's trying to combine the best of both worlds, where we create a foundation model that is both physics-aware and hence explainable, but highly predictive and therefore extremely accurate. And that's what we're excited about, because we're seeing the results that show this works. We think it's a real breakthrough for the sector, because we've never been able to do that before, not to this level of accuracy and explainability.
Physics meets neural networks
The cutting edge of physical AI today lies in using neural networks to develop more efficient machine learning models for making predictions. These can be thousands or even millions of times more efficient than traditional physics models, but they can also make different kinds of errors. Jeavons says a seminal part of their work lies in combining deep domain knowledge from several experts in the energy industry with leading-edge AI approaches for building a more capable WFM.
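As a rough illustration of the general idea (and not a description of Applied Computing's implementation), a physics-aware neural network can be trained with a loss that penalizes both prediction error and violation of a known physical law. The toy cooling-law example, constants, and data below are entirely my own assumptions:

```python
import torch
import torch.nn as nn

# Toy physics-informed training loop (an illustration of the general idea, not
# Applied Computing's implementation). Assumed physics: Newton's law of
# cooling, dT/dt = -k * (T - T_ambient), with invented constants and data.
k, T_ambient = 0.1, 25.0

model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# A handful of (time, temperature) observations, standing in for a sensor feed.
t_obs = torch.tensor([[0.0], [1.0], [2.0], [4.0]])
T_obs = torch.tensor([[90.0], [83.8], [78.2], [68.6]])

for step in range(2000):
    opt.zero_grad()

    # Data term: match the observed readings.
    data_loss = ((model(t_obs) - T_obs) ** 2).mean()

    # Physics term: penalize violations of the cooling law at collocation points.
    t_col = torch.linspace(0.0, 10.0, 50).reshape(-1, 1).requires_grad_(True)
    T_pred = model(t_col)
    dT_dt = torch.autograd.grad(T_pred.sum(), t_col, create_graph=True)[0]
    physics_loss = ((dT_dt + k * (T_pred - T_ambient)) ** 2).mean()

    (data_loss + physics_loss).backward()
    opt.step()
```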
The energy industry has consistently leveraged the latest innovations in data infrastructure, computing, and AI for decades. Yet there was room for improvement on two fronts. The first was innovations in enterprise data management. The second was the new opportunities opened up by LLMs. Jeavons says:
The first big breakthrough is the ability to use an LLM as a human interface that can interrogate and allow the human to interact with a system, as you would speak to a person. That I think is fundamental, and it unlocks a lot. And I think for us that's what's really been behind the creation of the foundation model, the ability to have an LLM that's using a combination of fine-tuning and RAG [Retrieval Augmented Generation] to allow us to then have an intelligent conversation with an expert, as though it's an expert in the energy industry.
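A minimal sketch of the fine-tuning-plus-RAG pattern Jeavons describes might look something like the following; the embed and call_llm functions are hypothetical stand-ins for whatever fine-tuned model and vector store a real system would use, not any specific vendor's API:

```python
import numpy as np

# Minimal retrieval-augmented generation (RAG) sketch. The embed and call_llm
# functions are hypothetical stand-ins for a trained encoder and a fine-tuned
# LLM; they are not any specific vendor's API.

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; a real system would use a trained encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=128)

def call_llm(prompt: str) -> str:
    """Placeholder for a call to a fine-tuned LLM."""
    return f"[model response to a {len(prompt)}-character prompt]"

documents = [
    "Compressor C-101 maintenance procedure ...",
    "Operating envelope for separator V-200 ...",
]
doc_vectors = np.stack([embed(d) for d in documents])

def answer(question: str, top_k: int = 1) -> str:
    q = embed(question)
    # Cosine similarity between the question and each stored document.
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    context = "\n".join(documents[i] for i in np.argsort(sims)[-top_k:])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)
```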
However, the transformer architecture underlying LLMs also presents additional opportunities for interpreting time series data in targeted ways. For example, this might involve finding more effective ways to transform low-level sensor feed data into relevant events at various scales. Jeavons explains:
We've seen real breakthroughs in that space, where we start to see material improvement in forecast accuracy. And for us, that transformer architecture combined with the physics engine, which links into the physics-informed neural network world and into the world of operations. That combination is what's allowing us to unlock some of these new emergent capabilities.
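To illustrate what a transformer over raw sensor feeds might look like in code, here is a small sketch under assumed dimensions (not Orbital's architecture): an encoder that turns fixed-length windows of readings into event scores.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a small transformer encoder that turns windows of
# raw sensor readings into event scores. The dimensions and task framing are
# assumptions, not a description of Orbital's architecture.

class SensorEventModel(nn.Module):
    def __init__(self, n_channels: int = 8, d_model: int = 64, n_events: int = 5):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_model)   # per-timestep embedding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_events)      # event logits per window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window_length, n_channels) of raw sensor values
        h = self.encoder(self.embed(x))
        return self.head(h.mean(dim=1))               # pool over the time axis

# Example: score 60-sample windows drawn from 8 sensor channels.
windows = torch.randn(16, 60, 8)
scores = SensorEventModel()(windows)   # shape (16, 5)
```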
Improving self-learning loops
One essential aspect lies in weaving together data from physics engines, LLMs, and events surfaced in time series data to guide the development of self-learning systems, much as is being done with self-driving cars. Jeavons says:
So, when you put all that together, it's a real step change in what AI can do. And I think that's what's getting us so excited.
The fundamental insight developed by Applied Computing Chief AI Officer Samyakh Tukra was that LLMs describe the world but don’t grasp it. Existing physical world models predict what happens, but don’t let you reason. Tukra began exploring how these approaches might be combined during a brief stint working on Jeavons’ team at Shell. He continued to develop these ideas on his own and showed them to Jeavons, who recalls:
I was both frustrated and also excited. Frustrated, because I think Sam cracked something really fundamental that I hadn't managed to crack yet. And I think the second part was really excited, because it created the opportunity to do what I'd always wanted to do in the energy sector, and that was what sparked the conversation about me coming on board as president, because I think the technology was too exciting to let it pass and wanted to try and take this to scale, so it wasn't in the life plan.
Applied combines an expert LLM, a physics model, and a time series model, which interact to provide an integrated answer grounded in expertise specific to the context, such as an energy unit or a part of a process. The time series data includes variables such as temperature, pressure, and flow rates through equipment. The expert system informs the physics engine with the equations and coefficients that govern that equipment or process. Jeavons explains:
Orbital allows those three to act in tandem and in a mutually reinforcing way, such that it provides expert answers back to the users which are accurate, and if Orbital cannot answer the question, it will tell you, ‘I can't answer it.’ That takes the hallucination out. We then wrap up all of that in a verification layer to make sure that what we put out to the user is accurate. So, in a nutshell, it's an agentic system. We consolidate all of these elements into a single, integrated foundation model, which can serve as a comprehensive framework to address a wide range of questions. It can predict, optimize, and explain. And it turns out that those three things are some of the most foundational concepts in any energy operational process.
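A hypothetical sketch of that orchestration pattern, with the expert, physics, and time series components answering in tandem behind a verification step, might look like the following; the names and the simple confidence check are illustrative assumptions, not Orbital's code:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of the orchestration pattern described above: an expert
# LLM, a physics model, and a time series model answer in tandem, behind a
# verification step that refuses rather than hallucinates. The names and the
# simple confidence check are illustrative assumptions, not Orbital's code.

@dataclass
class Answer:
    text: str
    confidence: float

def orchestrate(
    question: str,
    expert_llm: Callable[[str], Answer],
    physics_model: Callable[[str], Answer],
    timeseries_model: Callable[[str], Answer],
    min_confidence: float = 0.8,
) -> str:
    candidates = [m(question) for m in (expert_llm, physics_model, timeseries_model)]

    # Verification layer: only release an answer if every component is
    # sufficiently confident; otherwise say so explicitly.
    if min(c.confidence for c in candidates) < min_confidence:
        return "I can't answer that reliably with the available data."

    return max(candidates, key=lambda c: c.confidence).text
```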
Increasing the value of data
Another significant insight is that energy companies collect a tremendous amount of data on their operations, but only a small fraction of it (around eight percent, by Jeavons' estimate) tends to be stored in enterprise data lakes for further analysis. Most of it gets filtered out and aggregated in the field rather than being used to train better models in the cloud.
Distributed control systems collect a tremendous volume of data at sub-second increments that is stored locally. Humans rarely look at this raw data unless they are trying to troubleshoot an alarm.
Another level of nuance is that other types of data store information about operations in different forms: maintenance records in ERP systems, integrity history in integrity management systems, and photos of assets and drawings over time, often kept in an engineering data warehouse. Jeavons explains:
Typically, what happens is that each discipline within one of these operating units will use a part of that data to drive a set of KPIs, which are then aggregated to the leadership team so they can look at how the overall plant is operating. However, from a data management perspective, what is effectively happening is that you have aggregation on top of aggregation, ultimately resulting in simple calculations that are represented in the set of KPIs. Now all of that data is available and stored, which can then be interrogated by a foundation model. And that ability to unlock that collective data set is truly transformational, if you can get after it. So that's really what I mean by only eight percent of the data being used.
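A toy example of that aggregation-on-aggregation pipeline, using invented numbers, shows how much raw detail disappears between the sensor feed and the leadership KPI:

```python
import numpy as np
import pandas as pd

# Toy illustration of "aggregation on top of aggregation": synthetic sub-second
# readings rolled up to hourly means, then to a single daily KPI. Only the shape
# of the pipeline matters here; the numbers are invented.

idx = pd.date_range("2025-01-01", periods=24 * 3600 * 2, freq="500ms")
raw = pd.Series(np.random.normal(100.0, 5.0, len(idx)), index=idx, name="flow")

hourly = raw.resample("1h").mean()        # first aggregation, per discipline
daily_kpi = hourly.resample("1D").mean()  # second aggregation, leadership KPI

print(len(raw), "raw readings ->", len(daily_kpi), "KPI value(s)")
```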
Applied Computing has been partnering with Databricks on the data plumbing for this architecture. Jeavons says this provides a foundation for securely and reliably bringing data into the cloud, while the Databricks Delta Lake architecture supports highly performant analysis on data at scale.
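For context, a minimal sketch of landing sensor readings in a Delta Lake table with PySpark might look like the following; the paths, columns, and table layout are placeholders rather than Applied Computing's actual pipeline:

```python
from pyspark.sql import SparkSession

# Minimal sketch of landing sensor readings in a Delta Lake table with PySpark,
# assuming a Delta-enabled Spark environment. Paths, columns, and table layout
# are placeholders, not Applied Computing's actual pipeline.

spark = SparkSession.builder.appName("sensor-ingest").getOrCreate()

readings = spark.createDataFrame(
    [("compressor_01.pressure", "2025-01-01T00:00:00Z", 42.7)],
    ["tag", "timestamp", "value"],
)

# Append the batch to a bronze-layer Delta table.
(readings.write
    .format("delta")
    .mode("append")
    .save("/mnt/datalake/bronze/sensor_readings"))

# Downstream analysis can then query the accumulated history at scale.
history = spark.read.format("delta").load("/mnt/datalake/bronze/sensor_readings")
```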
Another important insight is that this kind of system needs to operate at multiple timescales to provide real-time advice and improve long-term diagnostics. The real-time models need to run and be connected to the real-time distributed control system, and they must be air-gapped for security reasons. However, it's not realistic to train these models on the edge due to compute requirements and the need to correlate them with other data, such as schematics, ERP, and integrity management systems. Jeavons says:
This is why it's really important that you think about edge and cloud. You need a hybrid system. It can run on the edge in real-time and provide real-time recommendations through rapid inference, connected to where decisions need to be made in the short term. Still, it can also inform longer-term decision-making and planning by working in the cloud as well.
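A simplified sketch of that edge/cloud split might look like the loop below; read_sensors, edge_model, apply_recommendation, and upload_batch are hypothetical stand-ins, and a real deployment would of course sit behind the site's security boundary:

```python
import time
from collections import deque

# Hedged sketch of the edge/cloud split described above. The read_sensors,
# edge_model, apply_recommendation, and upload_batch arguments are hypothetical
# stand-ins; a real deployment would sit behind the site's security boundary.

def control_loop(read_sensors, edge_model, apply_recommendation, upload_batch,
                 upload_every: float = 3600.0):
    buffer = deque(maxlen=100_000)   # recent raw readings kept on the edge device
    last_upload = time.monotonic()

    while True:
        reading = read_sensors()                 # real-time feed from the DCS
        advice = edge_model.predict(reading)     # rapid inference at the edge
        apply_recommendation(advice)             # short-term decision support

        buffer.append(reading)
        if time.monotonic() - last_upload > upload_every:
            upload_batch(list(buffer))           # periodic batch to the cloud for
            buffer.clear()                       # longer-term training and planning
            last_upload = time.monotonic()

        time.sleep(1.0)
```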
Lessons for other industries
Jeavons acknowledges that outsiders sometimes view the energy sector as backward, conservative, or inefficient. However, these companies operate in some of the harshest environments in the world and handle highly volatile substances, and most of the time they do a good job of keeping us safe. Fiscal conservatism and a focus on outcomes have also bred a discipline in which AI has been deployed within the industry to generate consistent returns. Jeavons explains:
Most of the majors have come out and talked about material value creation from AI in one way, shape or form. Shell certainly did. There is a discipline around ensuring that this pays, which I think the energy industry has done well in general. And I think the reason is that we're used to getting value out of models, whether that be design, seismic processing, or optimization of facilities. A lot of this isn't new. We've been getting value out of these models for decades. And so that value extraction piece is pretty impressive, and something that I think other industries can learn from, even if we're not necessarily as fast or as aggressive in the deployment sometimes.
While at Shell, Jeavons also worked with a variety of new vendors who would often come in with only half the information and assume that more data alone would improve outcomes. But many of these vendors failed to understand how Shell’s work actually gets done:
This means that the AI might be killer at predicting or optimizing something, but it's not usable because it doesn't fit within the way in which the process works or the regulatory environment that you have to operate in. So that appreciation for the workflow is really key. I think the other thing is recognizing the importance of change management in an environment like this, where people are really incentivized to ensure things don't break and go wrong. The big prize is operating the plant as safely and efficiently as possible, for as long as possible. And that's ultimately the objective. And so if that's the objective, that means innovation is not always front of mind. It's about understanding how the AI we're developing can support those objectives, which is why we focus so much on explainability and accuracy.
Because if we're not explainable and accurate with the AI we're developing, it's very hard to justify why people should take a risk based on the new recommendations we're introducing. We need to start with the workflow and then focus on change management, ensuring that what we build is something that can be appreciated and understood by people on the front lines. Some of the work we're doing at the moment, combining time series data and physics, is breakthrough technology. That is very transferable to other industries. I think it goes beyond what I've seen in other sectors.
My take
It will likely take a while for Artificial General Intelligence systems to emerge. In the meantime, there is a tremendous opportunity to improve industry-specific AIs through a combination of LLMs, enterprise data, and traditional physical models, creating domain-specific world foundation models. Of seminal importance is finding ways to help inform embodied AI agents that improve at specific tasks over time using reinforcement learning or active inference processes.
Another significant takeaway is the need for more effective ways to leverage time series data in all of this. In the 1990s, the term "complex event processing" provided the conceptual scaffolding for making sense of raw data across various systems. However, this also required a significant amount of work to make sense of events at different scales within a single stream or across multiple streams. It sounds like transformer approaches used in LLMs are becoming increasingly adept at automating more of this than was previously feasible, even just a few years ago.
Lastly, it's also important to determine how these WFMs can integrate into existing workflows, rather than just providing better models. This will require striking the right balance between accuracy, explainability, and utility.