Flo Health CTO reveals how data lakes boost women's health app performance
- Summary:
- As part of the diginomica network content series, Flo Health's CTO Roman Bugaev explains how the women's health unicorn consolidated two petabytes of user data with Databricks' data lake platform in 2022, enabling 200 simultaneous experiments while maintaining strict privacy compliance for 80 million users worldwide.
Wearable health devices aim to empower their users with deep and immediate insights. Those insights require powerful data analytics. A slim ring or fitness-tracking watch cannot perform the level of analytics needed to help a user become pregnant or remain strong and healthy; that analysis has to take place on an enterprise cloud computing estate with data lakes.
Facing just these problems, female healthtech unicorn Flo Health began implementing and working with Databricks in 2022. Chief Technology Officer Roman Bugaev describes to diginomica the data demands of the App, the move to Databricks, the business benefits he is seeing, data privacy, and what the next generation of data technologies will provide to women.
Recently, a wave of new wearable devices has entered the market. Alongside the almost ubiquitous Fitbit and Apple Watch are Whoop bracelets and Samsung rings. These are not only more desirable to some users but, as Bugaev says, they have improved sensor technology, providing a company like Flo Health with better data on body temperature, whereas previous-era devices were really just step counters. A distinct advantage of these new devices, he says, is that they are wearable at night:
You need to collect data at night when the body is in a stable environment, and the user is not distracted by activities, so you need something that is more comfortable than an Apple Watch or Garmin.
Steady-state data provides Flo Health and its users with greater precision. The data is used to monitor periods and overall health and to make choices about them. Flo Health delivers personalized cycle and ovulation tracking to 80 million women around the world. The App provides symptom monitoring, health information from 120 doctors, and a private community for wellness discussions.
In 2022, as Flo Health was experiencing rapid growth, it realized the existing disparate set of databases and technologies was inhibiting the service it offered, and the future potential of the organization. Bugaev says:
We struggled to find one storage solution that would be useful for every use case, so we had different databases of temperature data and symptoms in a different format and database, so we had no single view of the data.
Flo Health had traditional databases, distributed Structured Query Language (SQL) databases, and PostgreSQL, and was struggling to scale these as data volumes grew. Inevitably, this led to data and team silos. With over two petabytes of data, a new approach was required.
Brick by brick
In 2022, Flo Health selected the Databricks Data Intelligence Platform after comparing two data lake providers. A data lake approach would consolidate the various data repositories and simplify data access across the organization, which in turn would improve data security. The Chief Technology Officer says:
Our previous systems couldn't handle the scale and complexity of our data.
This was important as users demand rapid and accurate insights. The devices are not powerful enough to analyze personal and cohort data, so server-side data analytics is vital for speed and useful insights. In addition, the App is used in geographies where the user is not always on the latest device. Bugaev says:
On the server-side, it is easier, as the data is centralized and optimized so the experience is seamless anywhere in the world. Latency is very important to us. With this platform, we can provide cycle predictions very quickly.
Implementation of the Databricks Data Intelligence Platform was done in-house and took three months.
Flo Health reports that the platform is now enabling them to run between 150 and 200 experiments simultaneously, which form part of the insights and service they offer to users of the App. Part of that increase in performance is down to the democratization of data across Flo Health. Databricks is used by 46% of staff for engineering, data science, and business analysis, most of whom are using the Looker dashboard that is part of the platform.
Flo Health has adopted a wide range of Databricks tools as part of its move to a central data platform and data lake, including the use of Databricks Assistant for improvements to SQL queries, and a Copilot program that enables product managers and analysts to use SQL. New product and feature releases increased as a result. These, the Chief Technology Officer says, have led to good adoption rates of Databricks at Flo Health, with monthly active user rates of 45% and weekly active user rates of 57%.
Privacy
Pregnancy and periods are very personal. Openness and discussion can vary greatly between individuals and cultures, and App users will expect a high degree of respect for privacy. Knowing this, as part of the platform implementation, Flo Health adopted the Unity Catalog from Databricks to manage structured and unstructured data and to strengthen the firm's data governance within workflows. This ensures that no Personally Identifiable Information (PII) is available to staff, and it supports the App's Anonymous Mode.
Flo Health hosts the Databricks platform on Amazon Web Services (AWS) in the US. The London-based Chief Technology Officer explains:
We keep all the data in the United States. With many cloud providers, new advancements come out in the United States first, and then they are distributed around the world. We don't want to wait. With Databricks, we don't have that issue; everything is made available on day one.
In August, Mark Zuckerberg's Meta, home of Facebook, was found liable in a California federal court for using period tracking data from Flo Health to direct targeted advertising. The lawsuit, brought under the California Invasion of Privacy Act and the Confidentiality of Medical Information Act, was filed by five women against Meta, Google, and Flo Health over the use of personal health data between 2016 and 2019. Flo Health settled with the plaintiffs in July 2025 while denying the claims.
Data privacy will increase in importance for Flo Health and all health tech providers as they begin using Artificial Intelligence (AI) and Machine Learning (ML). Flo Health is using the Databricks MLflow observability technology to manage its machine learning lifecycle, so that the organization always knows which versions of its ML models exist and where they are deployed.
AI agents are being assessed and tested. Bugaev is cautiously optimistic about the role of this technology in health:
It will take time, as what we deal with is so much more sensitive. To fine-tune models we rely on synthetic data, and we won't use a user's data. So there is an added layer of complexity for privacy, and you have to make sure that you are medically correct and safe.
Bugaev adds that Flo Health is therefore having to invest in ML benchmarks to measure the quality of what its ML produces. But he believes these technologies will become vital to the Femtech App market, as they will enable greater personalization of information and a proactive response to the needs of the user, as well as a greater diversity of languages. The Chief Technology Officer says:
Technology in the right hands produces an enormous amount of value and you can empower a lot of users.
Bugaev believes this will impact which medical tech firms can realize the benefits of AI and ML:
You need to be of a certain size to afford all of this extra work, and you will need separate teams for security, compliance and privacy.
My take
Technology has empowered women, and it is important that, as we enter the AI age, data represents women, protects women, and further empowers them. Mistakes have been made by Flo Health, but this partnership will be watched to ensure that its users, and not Mark Zuckerberg, benefit.