AI policy – does data sovereignty really exist? Nope.
Summary: Recent news stories have called into question the very concept of data sovereignty. This needs urgent consideration as countries ponder 'sovereign AI'.
Anyone who believes we are living in a golden age of Artificial Intelligence (AI) in the cloud might be in for a shock from recent news stories. Among them, Microsoft – whose SharePoint system was recently breached at client level by Chinese threat actors, and which has laid off 9,000 staff – has admitted that it cannot guarantee the sovereignty of data held outside America under the US CLOUD Act.
The admission was made by Director of Public and Legal Affairs for Microsoft France, Anton Carniaux, and Technical Director of the Public Sector, Pierre Lagarde, during a hearing before the French Senate – at a time when many nations, such as the UK, are pursuing 'sovereign AI' strategies based on national data sets.
In fact, the CLOUD Act affects data held by, or shared with, all US cloud companies and hyperscalers – including Google, Amazon, Meta, Apple, xAI, et al – on behalf of clients in the European Union (EU), UK, and elsewhere, even if that data is held in European data centers.
Separately, OpenAI Chief Executive Officer Sam Altman has admitted that there is no duty of client confidentiality covering any 'conversation' a user might have with ChatGPT – including one-to-one legal, consultancy, healthcare, or therapy sessions. He made the admission this month in an interview with podcaster Theo Von.
Consider the implications of that: if you are a user of any Generative Pre-trained Transformer (GPT)-based model, then assume that any personal, private, or confidential data you share with your chatbot is never treated as such. Indeed, you are sharing secrets with a company whose Chief Executive Officer was, two years ago, fired by his own board for an unspecified breach of trust.
Granted, deleted chats on ChatGPT are permanently erased within 30 days, according to OpenAI's user policy – unless the company is obliged to keep them for legal or security reasons. But in Trump 2.0, who can say what legal obligations might be placed on AI providers?
More pertinently, a lot can happen in 30 days. And it is known that while data can be erased from a server, it cannot be deleted from a large language model's (LLM's) weights. Indeed, the original data might be inferred, probabilistically, from them.
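That point is easy to demonstrate in miniature. The sketch below is purely illustrative – a character-level n-gram counter stands in for real model weights, and the "secret" string, context length, and variable names are all invented for the example – but it shows the principle: once training statistics have absorbed a piece of text, deleting the source copy does not remove the information, and a short prompt can coax it back out.

```python
# Toy illustration (NOT a real LLM): an n-gram model "trained" on a
# secret string. Deleting the source afterwards does not remove the
# secret -- it now lives in the model's learned statistics.
from collections import defaultdict

secret = "patient id 4471: diagnosis confidential"
K = 6  # context length, in characters

# "Training": count which character follows each K-character context.
# These counts play the role of the model's weights.
weights = defaultdict(lambda: defaultdict(int))
for i in range(len(secret) - K):
    ctx, nxt = secret[i:i + K], secret[i + K]
    weights[ctx][nxt] += 1

seed = secret[:K]   # an attacker needs only a short prompt...
del secret          # ...and the "server copy" has been erased

# Greedy decoding from the weights reconstructs the deleted data.
out = seed
while out[-K:] in weights:
    successors = weights[out[-K:]]
    out += max(successors, key=successors.get)

print(out)  # -> patient id 4471: diagnosis confidential
```

Real LLMs are probabilistic rather than deterministic lookup tables, so extraction is noisier in practice – but memorization of verbatim training text, recoverable by prompting, is a well-documented behavior of large models.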
Recently I closed my account with Otter AI after the cloud-based transcription platform became infected with new GPT-based AI functions, putting confidential interviews and conversations at risk of being scraped by ChatGPT and used for training data.
As my report on the platform last year revealed, Otter had also begun flying in data from unknown, external sources and crediting it to speakers, making the tool completely untrustworthy for transcribing interviews, meetings, and chats.
Now apply that principle to Cabinet discussions.
Government data at risk from AI partnerships
OpenAI has just signed a strategic partnership deal with one of the world's biggest IT users, the British Government, which must call into question the confidentiality and privacy of any data that Whitehall departments share with the company – particularly if the US President demands access for national security reasons.
Anecdotal testimony from ChatGPT users reveals that, when used to record meeting minutes, the tool often misunderstands points or misreports them. So, just imagine the damage it could do, unsupervised, to government policymaking.
And that is not all. In a recent conversation with Federal Reserve Vice Chairman for Supervision Michelle Bowman, Altman also admitted that some categories of jobs will be "just like totally, totally gone" as they are replaced by AI agents – chatbots that will also feed data back to OpenAI.
Do governments have an action plan for that scenario, or for the wholesale replacement of junior positions? Or only for seizing the apparent opportunities of AI, as promised by vendors in the hype cycle?
But of course, OpenAI is not the only culprit. Earlier this year, Meta announced it will be using data from EU adults to train its systems, rather than limiting itself to US-based content. Indeed, 'adult content' turns out to have two meanings for the social media giant: last week, adult filmmaker Strike 3 Holdings launched a $100 billion lawsuit against the company over the alleged use of its videos to train Meta's AI.
Sir Demis Hassabis, British Chief Executive Officer of Google DeepMind, said the quiet part out loud in a recent interview. Hassabis explained that he is not worried about running out of human data, even as synthetic content and AI slop prevail. But that is because AI companies have already scraped the Web of all publicly accessible data – including copyrighted works, which in some cases have been lifted at scale from known pirate resources. Little of that data has been obtained with consent, credit, or remuneration for its creators or rightsholders.
In Google's case, copyrighted data is now being walled off in AI search and rented back to us, with fewer and fewer visitors referred to external data sources.
Take all these issues together and one thing is clear: far from living in an AI golden age, we are trapped in a realm of industrialized copyright theft and data laundering, one that amounts to a coup on all the world's digitized content, including our private and confidential discussions.
The revolving door between Big Tech and government
Indeed, where wealthy individuals and nations once hoarded gold as a hedge against economic downswings, in the future people are likely to hoard verified, pre-2023 data as a hedge against untrustworthy AI and its many hallucinations and errors. Arguably, the Web is already broken as an information platform, and verified external sources are likely to begin collapsing in the face of Big Tech's AI onslaught.
So, it is hardly surprising that, beneath the relentless AI hype – much of which is designed to keep investors buying into a belief-based market – there is a groundswell of real anger at some AI vendors' arrogance, entitlement, and cynicism, and their cavalier attitude to everyone's intellectual property except their own.
As one of my reports last week revealed, the public – in Britain, at least – are eight to one in favor of vendors paying for the use of copyrighted content in training data. Three-quarters (74%) of respondents to a Politico survey believe vendors should pay up, with just eight percent saying data should be free to companies that, in several cases, are worth trillions of dollars.
But this raises the question: who is in favor of UK Government proposals to change copyright rules and opt creators into AI training by default? Not the public, it seems, nor the media, the UK's $160 billion creative industries, the House of Lords, nor even the UK's own AI companies. Trade organization UK AI (UKAI) labeled the plan unworkable, dangerous, misguided, and divisive in a coruscating report earlier this year.
So, who is in favor, apart from US Big Techs? Step forward self-styled 'thinktank' the Tony Blair Institute for Global Change (TBI), which has been lobbying hard for the change. Its 2023 financial statement – in the public domain on Companies House – reveals that among its financial sponsors are some major Big Tech names with a lot of skin in the AI game.
Lobbying for influence over policy is a part of political life and debate, of course, all around the world. That's not going to change in an AI age. That said, the door to Number 10 seems welcomingly open to Big Tech.
For example, Max Beverton-Palmer – now head of UK Public Policy at the world's most valuable company, NVIDIA – was in Downing Street last week advising on AI policy.
Beverton-Palmer spent four years as Director of the Internet Policy Unit at the Tony Blair Institute, and before that a year as its Head of Tech and Society. After a short hop and a skip to Meta – where he spent five months under contract running Public Policy and Campaigns – he landed at NVIDIA, and was in Downing Street as the company "co-hosts" the first Sovereign AI Industry Forum.
Fancy that! as satirical magazine Private Eye might say.
This week, the sense that the Tony Blair Institute is the real power behind the throne of UK AI policy, backed by US Big Tech money, deepened with the publication of a new report – Sovereignty, Security, Scale: A UK Strategy for AI Infrastructure – which urges Britain to build new compute infrastructure so that major vendors can run their systems here. No doubt the likes of OpenAI – which has just entered a strategic partnership with the government – would love nations to balance vendors' books on capex, conceivably at public expense, as soon as possible.
My take
So, 'sovereign AI': does it really exist? Answers on a postcard from the Senate in Paris.