How ‘vibe analytics’ democratizes serious play
Summary:
Plotly Co-founder and CPO Chris Parmer and MIT business guru Michael Schrage explain how vibe analytics streamlines data exploration and the discovery of actionable business insights. It also introduces novel risks.
Researchers and vendors have been working to democratize analytics for decades. Spreadsheets, Online Analytical Processing (OLAP), and visual analytics each ushered in a new era of citizen analytics in their day.
Plotly is now pushing this envelope a bit further with so-called ‘vibe analytics,’ inspired by the vibe coding meme in software development. The company has pioneered open source tools and is stewarding a community for using LLMs (Large Language Models) to write analytics code in languages like Python. This promises to democratize analytics a bit more – for better and worse.
Vibe analytics can lower the bar for discovering new insights using a variety of analytics and visualization techniques. Business users can explore novel graphing techniques like heatmaps, histograms, and Sankey diagrams without having to learn the nuances of Python coding libraries.
Chris Parmer, Plotly co-founder and Chief Product Officer, explains that users can now bypass the rigid learning curves of traditional tools:
We're able to work at a whole higher level of abstraction that's more around curiosities and insights and questions and is much less rigid than anything we've been able to work with before.
Vibe analytics can also accelerate the creation of low-quality and misleading graphs and charts. It makes it easier for users to misuse low-quality data gathered for another purpose, misinterpret data labels, or rush past problems they would otherwise catch.
Users may also generate inefficient Python processing code that is expensive, and possibly insecure, to run in production. This is comparable to the danger of vibe coding, where novice coders can build great prototypes to test a concept, but the resulting code is not suitable for production. Still, a working prototype can make it easier for expert programmers to code the app or analytics that meet business and compliance requirements.
Because the bar for working analytics code is much lower, users can explore many more possibilities. They can share potential insights with colleagues much more quickly, and colleagues can push back against incorrect assumptions or misinterpretations of data. The open-source community aspect of sharing coding prompts also makes it easier to reuse helpful prompts across scenarios, which is much harder with traditional analytics tools.
Building on open source analytics
Plotly’s leadership came from a scientific and engineering background, where they used open source code written in programming languages like Python and R to improve their workflow. The company’s tools leverage its extensive library of Python and R code as a starting point to guide LLMs in generating analytics code.
This is a different starting point from that of traditional analytics vendors, which add AI on top of a closed, point-and-click architecture. The LLMs are used to write the code that does the analysis rather than to analyze the data directly. This helps sidestep issues with hallucinations and supports accurate calculations on large data sets. Parmer says:
I'm doing data analytics in English, now, instead of doing them in Python, and when I write five lines of English, it will generate three hundred lines of Python code that will do what I expressed. And so that's a whole order of magnitude difference in terms of the level, the speed that I can do data analytics, correct things, follow my curiosities, as well as the amount of analytics that I can do in a day.
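To make the distinction concrete, here is a minimal sketch of the kind of short, readable script an LLM might emit for a request such as "total revenue by region." The column names, the sample rows, and the `revenue_by_region` function are hypothetical illustrations, not Plotly output. The point is only that the arithmetic is done by ordinary Python rather than by the model.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample data standing in for a real sales export.
SAMPLE_CSV = """region,revenue
East,1200.50
West,800.25
East,299.50
"""


def revenue_by_region(csv_text):
    """Sum the revenue column per region using exact decimal arithmetic."""
    totals = defaultdict(Decimal)  # Decimal() starts each region at 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["region"]] += Decimal(row["revenue"])
    return dict(totals)


totals = revenue_by_region(SAMPLE_CSV)
for region, total in sorted(totals.items()):
    print(f"{region}: {total}")
```

Because the totals come from deterministic code, a colleague can rerun the script on the same file and get the same numbers.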
Also, this approach ensures verifiability since the output is human-readable Python code. Users or their colleagues can see exactly what the code is doing to troubleshoot any issues. The tools also take a spec-driven development approach. Users are guided to fill out a natural language specification that describes the desired analysis and visualization. A more detailed spec reduces the amount of guesswork the LLM has to fill in, improving reproducibility.
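One way to picture the spec-driven idea is as a checklist whose blank entries are the things the LLM would have to guess. The sketch below is an assumption for illustration only: the `AnalysisSpec` fields, the `unspecified_fields` helper, and the file name `sales_2024.csv` are hypothetical and do not describe Plotly's actual spec format.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class AnalysisSpec:
    """Hypothetical natural-language spec filled in before code generation."""
    question: Optional[str] = None     # what the user wants to learn
    data_source: Optional[str] = None  # which table or file to read
    metric: Optional[str] = None       # the quantity to compute
    group_by: Optional[str] = None     # how to slice the metric
    chart_type: Optional[str] = None   # e.g. "histogram" or "heatmap"


def unspecified_fields(spec):
    """Return the names of blank fields, i.e. what would be left to guesswork."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]


vague = AnalysisSpec(question="How is revenue trending?")
detailed = AnalysisSpec(
    question="How is monthly revenue trending by region?",
    data_source="sales_2024.csv",  # hypothetical file name
    metric="sum of net revenue",
    group_by="region, month",
    chart_type="line chart",
)

print(unspecified_fields(vague))
print(unspecified_fields(detailed))
```

The vague spec leaves four decisions open, while the detailed one leaves none, which is the sense in which a fuller spec should make repeated runs more consistent.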
This helps overcome a traditional trilemma in data analytics tools where users had to choose between power, accessibility, and verifiability. Plotly’s approach improves accessibility through natural language and leverages the full power of a programming language while ensuring verifiability of human-readable code.
Turbo-charging serious play
Michael Schrage, Research Fellow at the MIT Sloan School's Initiative on the Digital Economy, has championed the importance of “serious play” for discovering business insights for years. He has recently taken an interest in vibe analytics as a way to enable this process.
He observes that vibe coding produces code artifacts, while vibe analytics generates interpretations and testable insights. The output is actionable business insights or new testable hypotheses about the business. It also changes the fundamental questions that analytics can answer. Schrage frames it like this:
The Excel spreadsheet era asked, ‘What happened?’ The dashboard era asked, ‘Why did it happen?’ The vibe era asks, ‘What insights emerge if we explore together?’
This promises to re-frame data analysis from a purely deductive forensic activity to a creative and generative one. Schrage argues the goal shifts from producing an artifact to engaging in a dialogue with the information:
Instead of code as an artifact, I'm interested in the insight. I am interested in engaging with that. I'm interested in improvising with that data and seeing what emerges, what kind of insight, better yet, actionable insight, emerges.
It's not just about finding the correct answer more quickly, but also about generating new hypotheses and discovering questions that users had not considered when they started. Examples of this include:
- Turning KPIs into conversation partners where leaders can engage with them to discover causes and debate assumptions.
- Synthesizing a disconnected data story that would be difficult to connect manually.
- Improvising with data personas like high-churn customers or profitable but low-volume products to better understand their characteristics, their behaviors, and the factors that influence them.
Easier prototypes vs. faster production
Vibe analytics has distinct value propositions for expert versus novice users. Expert data scientists and analysts can write code more efficiently and quickly. Meanwhile, product managers, subject matter experts and business users can experiment with more sophisticated analytics techniques without waiting for a centralized data team. Parmer notes:
You've got designers and product managers and folks that have never really been able to code before, using code as a means to an end, to explore an idea, to create a prototype.
For example, a marketing director might generate a prototype of a customer segmentation model. This might not be production-ready or computationally efficient, but it serves as a specification that can be handed off to a data engineering team to harden and scale. This helps shift the conversation from abstract requirements to concrete working examples.
Exacerbating good and bad
Lowering the bar for sophisticated analytics can also create distinct new risks. Vibe analytics can magnify human cognitive habits. A diligent and curious user can pursue deep investigations much more quickly. However, intellectually lazy users could become more mindlessly productive, churning out charts that look authoritative but lack substance. Schrage warns:
My experience with my students and with my clients has been that it brings out the best in you and it amplifies the worst in you, whatever your cognitive style may be. The strengths become stronger. The weaknesses can, excuse my vulgar language, f*** you over really fast.
This amplification can also apply to data quality issues. For example, generated analytics code might flawlessly execute a command to “graph revenue,” but if the data set contains ambiguous columns, it might plot the wrong one without the user noticing. Parmer notes that the same data quality problems found in Excel spreadsheets apply here, but potentially at a larger scale. He argues that as creation costs drop, it's also important to uplevel the review process.
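A small guard illustrates what an upleveled review step could look like for the “graph revenue” case. This is a sketch under stated assumptions: the column names and the `find_metric_columns` and `resolve_metric_column` helpers are hypothetical, and simple substring matching stands in for whatever a real tool would use to map a request onto columns.

```python
def find_metric_columns(columns, keyword):
    """Return every column whose name contains the keyword (case-insensitive)."""
    return [c for c in columns if keyword.lower() in c.lower()]


def resolve_metric_column(columns, keyword):
    """Return the single matching column, or refuse to guess if it is ambiguous."""
    matches = find_metric_columns(columns, keyword)
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"No column matches {keyword!r}")
    raise ValueError(f"Ambiguous request {keyword!r}: candidates are {matches}")


# Hypothetical column names for illustration.
columns = ["order_id", "gross_revenue", "net_revenue", "revenue_forecast", "region"]

try:
    resolve_metric_column(columns, "revenue")
except ValueError as err:
    print(err)  # the user is asked to choose instead of getting a silent guess
```

Failing loudly here trades a little convenience for a prompt that forces the user to say which revenue figure they mean.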
Vibe analytics has the potential to exacerbate the tension between business goals of exploiting data to get quick answers and exploring data to find new questions. The low-hanging fruit is to make it easier to generate a new chart for a slide deck. But Schrage believes the real value lies in lowering the bar for following a chain of curiosity without hitting technical roadblocks.
My take
Simpler tools for complex data exploration and analysis, combined with the supporting tools around them, might help move the barrier to entry from knowing syntax (e.g., Python, SQL, Excel formulas) to knowing semantics (what to ask and how to interpret answers).
Plotly’s approach shows one way to tackle this problem without trying to hide the code. It demonstrates a path for using LLMs to simplify the user experience while also supporting governance processes that ensure new analytics are robust, verifiable, and compliant.
It also got me thinking about how similar approaches might support better simulation and predictive analytics in the future. For example, what if instead of just looking at dashboards, you could spin up a version of Sim City or Transport Tycoon customized to the vagaries of your business to explore the implications of new business strategies?
That, of course, will be no easy task, since many of these older games used approaches better suited for faster gameplay than for discovery. But what if “vibe simulation” let you play your business or explore different government policies like a game, so we could find better ways to all win together? Maybe that’s the next iteration of Schrage’s vision of serious play.