Enterprises pour billions into data lakehouses. So why can't the business use the damn things?
- Summary:
- Alteryx is pushing analytics into the data lakehouse rather than pulling data out. Chief Product Officer Ben Canning explains why governance and business user access remain the real barriers to lakehouse ROI.
Ask a business analyst how they get data from their company's lakehouse and the answer is often the same: they file a request, they wait, and somewhere between days and weeks later, they get something back that may or may not answer the question they originally asked. It is a process that would be familiar to anyone who worked in enterprise reporting 20 years ago – which is exactly the problem.
The enterprise data landscape has undergone a significant consolidation over the past several years. Organizations have invested heavily in cloud data platforms – Snowflake, Databricks, Google BigQuery – creating centralized lakehouses designed to bring structure, governance and analytical power to vast volumes of data. On paper, the architecture makes sense. On the ground, many organizations are still struggling to get value from it.
The people who need the data most – the business analysts, the finance teams, the operational leaders making day-to-day decisions – often still cannot get to it without filing a request and waiting. The lakehouse was supposed to fix that. For many organizations, it has not.
That gap between data infrastructure and data usefulness is at the heart of Alteryx's current strategy. I spoke with Ben Canning, Chief Product Officer at Alteryx, about the company's expanded in-warehouse analytics capabilities, including its latest LiveQuery integration with Google BigQuery, which builds on existing partnerships with Databricks and Snowflake.
The lakehouse's user problem
Canning frames the challenge bluntly. Organizations made powerful investments in consolidating their data, but getting the business to actually leverage it remains difficult:
Those data lakes are sort of designed for IT. They're designed for heavily technical folks. They require a lot of SQL and Python knowledge to take advantage of.
SQL – Structured Query Language – is the standard language for querying relational databases, and it remains a barrier for business users who lack engineering backgrounds. Yet the people with the deepest business context – those who understand what the data means and what decisions depend on it – are often the least equipped to access it in its current form. Canning sees Alteryx's role as bridging that divide:
If we could be a bridge to unlock the potential of those data platforms, it could be a huge win, both to improve the ROI of those platforms and also to improve governance within the enterprise.
This is an interesting claim. The typical approach to democratizing data analytics has been to give more people access to the warehouse, either through self-service tools or simplified query interfaces. Alteryx is making a subtly different architectural argument: instead of pulling data out, push analytics in. The distinction matters most when it comes to governance – which, alongside data quality, remains one of the persistent choke points enterprise practitioners report.
In a typical workflow, a business analyst might extract data from a governed platform, combine it with local spreadsheets, run their analysis and produce a result. The data has now left the governed environment. Nobody can audit what happened to it along the way. Multiply that by hundreds of analysts across an organization and governance becomes a policing exercise rather than an architectural feature.
Alteryx's LiveQuery approach is designed to work differently. When a business user needs to merge warehouse data with their own spreadsheets – which, as Canning acknowledges, is the reality of how most business analysis actually works – the system pushes the spreadsheet data into BigQuery as temporary tables. The processing, the merge and the result all happen inside the governed platform. The data never leaves, so the governance is structural rather than aspirational. For IT teams that have spent years trying to prevent data sprawl through policies and permissions, that is a big shift from enforcement to design.
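To make the pattern concrete, here is a minimal sketch of the push-in idea in Python. It is not Alteryx's LiveQuery implementation and it does not talk to BigQuery: an in-memory SQLite database stands in for the governed warehouse, and the table names (`sales`, `local_targets`) and the helper functions are hypothetical, chosen only for illustration. What it shows is the shape of the workflow described above: the analyst's spreadsheet rows go up into a temporary table, the merge runs inside the database, and only the result comes back.

```python
import sqlite3


def build_demo_warehouse():
    """Hypothetical stand-in for a governed warehouse (in-memory SQLite, not BigQuery)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0)],
    )
    return conn


def merge_in_warehouse(conn, local_rows):
    """Push local rows into a temp table, join inside the database, return only the result."""
    cur = conn.cursor()
    # The analyst's spreadsheet rows are uploaded as a temporary table...
    cur.execute("CREATE TEMP TABLE local_targets (region TEXT, target REAL)")
    cur.executemany("INSERT INTO local_targets VALUES (?, ?)", local_rows)
    # ...and the merge runs where the governed data lives; no sales rows are extracted.
    cur.execute(
        """
        SELECT s.region, SUM(s.amount) AS actual, t.target
        FROM sales AS s
        JOIN local_targets AS t ON t.region = s.region
        GROUP BY s.region, t.target
        ORDER BY s.region
        """
    )
    result = cur.fetchall()
    # Remove the temporary table once the result has been read.
    cur.execute("DROP TABLE local_targets")
    return result


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    spreadsheet_rows = [("EMEA", 150.0), ("APAC", 75.0)]  # e.g. targets kept in a local spreadsheet
    print(merge_in_warehouse(warehouse, spreadsheet_rows))
```

The contrast with the extract-first workflow is the direction of travel: here the small, local dataset moves to the large, governed one, so the platform's own access controls and audit trail apply to the whole operation.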
The Golden Gate Bridge problem
When I asked whether Alteryx's multi-platform support effectively positions the company as a neutral analytics layer across cloud environments, Canning's response was pragmatic:
Many enterprises have a desire to get onto one single system, but I've rarely seen that play out in reality. It's kind of like the Golden Gate Bridge here in America. As soon as they finish painting it on one end, they start repainting it on the other.
It is a vivid analogy for a real enterprise condition. The aspiration towards consolidation is perpetual; the reality is that acquisitions, leadership changes and competing platform commitments mean most organizations will continue running multi-cloud environments regardless of their stated strategy. Canning argues this reinforces the need for a processing engine that remains neutral across platforms:
There is still that need to be able to connect and bring all of those things together in one processing engine that is designed to be fairly neutral to their system and work with all of them.
For decision makers evaluating their analytics stack, this raises a practical question: is it more realistic to consolidate your data infrastructure or to accept its fragmentation and invest in a layer that works across it?
Canning signals that the next phase of Alteryx's strategy moves into agentic territory – enabling business users to publish analytical workflows as agents accessible through Slack, Teams or ChatGPT, with full traceability of how answers were derived. The company plans to share more detail later this year, and its Inspire conference in May is likely to be the venue for those announcements. The emphasis on provenance – knowing exactly where an answer came from and how it was arrived at – is consistent with the governance thread running through Alteryx's current positioning.
My take
The data lakehouse was supposed to democratize analytics. For many organizations, it has democratized data storage instead – creating vast, well-governed repositories that remain inaccessible to the people who need them most. That is not a technology failure so much as a design assumption: lakehouses were built for data engineers and assumed the business would follow. In many cases, it has not.
Alteryx's in-warehouse approach addresses a real structural problem. By pushing analytics into the governed environment rather than pulling data out, the company is making a practical argument about how governance should work – part of the architecture rather than layered on through policy. Given how many organizations have invested heavily in their data platforms and are now struggling to show ROI beyond the data engineering team, that argument deserves scrutiny. The approach gets harder at scale, in real enterprise environments where data quality is uneven, spreadsheets remain the default tool for most business users, and the gap between IT and business users is cultural as much as technical. Alteryx's Google partnership and planned marketplace presence suggest it is betting on depth of integration rather than breadth of features.