PyTorch Foundation adds Helion and Safetensors - and the open AI stack gets a little harder to ignore

Alyx MacQueen - April 8, 2026
Summary:
Mark Collier briefed me on two updates under embargo at KubeCon Europe 2026 last month: Helion, which opens up GPU kernel programming to a far wider pool of developers, and Safetensors, which fixes a security problem in open source AI that was stubbornly overlooked for too long. The embargo's lifted - here we go.

Mark Collier of PyTorch at KubeCon Europe © CNCF Events

In a one-to-one conversation at KubeCon Europe 2026 - where we covered governance, hardware neutrality, and the agent layer - Mark Collier, Executive Director of the PyTorch Foundation and GM of AI at the Linux Foundation, also briefed me on two new foundation projects under embargo. Both arrive at a moment when the industry's appetite for deploying open weight models is running well ahead of its readiness to do so safely. Helion and Safetensors are the clearest possible expression of his guiding principle for the foundation's expansion: new projects should make the stack more secure, more performant, or easier to use - not just bigger.

Helion - opening up the dark art of GPU programming

To understand why Helion matters, it helps to know how GPU programming actually works - because it is considerably more involved than most people realize.

When an AI model runs on a GPU, the actual computation happens through something called a kernel: a tiny, highly optimized piece of code that tells the GPU precisely what mathematical operations to perform. Writing a good kernel requires understanding the specific memory architecture of the chip, how it parallelizes work, and how to squeeze maximum throughput out of the hardware. It is genuinely arcane knowledge, and the number of people who have it can be counted in the hundreds globally.
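To make the idea concrete, here is a purely illustrative CPU sketch in Python/NumPy of the kind of work a kernel author reasons about - a tiled matrix multiply, where each tile of the output would map to a block of GPU threads and be staged through fast on-chip memory. This is not GPU code and not Helion; it only shows the partitioning logic:

```python
import numpy as np

def tiled_matmul(a, b, tile=32):
    """Illustrative sketch of how a matmul kernel partitions work.

    On a GPU, each (i, j) output tile would be assigned to one thread
    block and the input tiles staged through shared memory; here we
    simply loop over the tiles on the CPU to show the structure.
    """
    m, k = a.shape
    k2, n = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((m, n), dtype=a.dtype)
    for i in range(0, m, tile):          # one output tile per "thread block"
        for j in range(0, n, tile):
            acc = np.zeros((min(tile, m - i), min(tile, n - j)), dtype=a.dtype)
            for p in range(0, k, tile):  # accumulate partial products tile by tile
                acc += a[i:i+tile, p:p+tile] @ b[p:p+tile, j:j+tile]
            out[i:i+tile, j:j+tile] = acc
    return out
```

Choosing the tile size, the loop order, and the memory staging strategy for a specific chip is exactly the arcane part - and it is different for every GPU.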

Above the kernel layer sits Triton, a compiler developed by OpenAI that makes it possible to write kernels in a higher-level language rather than raw GPU assembly. Triton is a significant step forward, but as Collier put it when we spoke:

There's only a handful of people in the world that really know how to do that. So Helion is another layer above that. So you can express what you want the GPU to do in more of like a Python type language that more people know, millions of people know. So that's just another way to kind of make it more accessible for people to unlock the power of these crazy GPUs.

Helion is a Python-embedded domain-specific language (DSL) - think of it as a specialized dialect of Python designed specifically for describing GPU computations - that compiles down to Triton and TileIR, with more backends to come. The piece I find particularly interesting is the autotuning: rather than requiring a developer to manually figure out the optimal configuration for a kernel, Helion automatically tests hundreds of candidate implementations and finds the fastest one for the target hardware. That is not a minor convenience. On complex AI workloads, kernel performance can be the difference between a model that runs at scale and one that burns through GPU budget without delivering.
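The core idea behind autotuning is simple to sketch, even if Helion's real search is far more sophisticated. This toy Python example - my own illustration, not Helion's API - benchmarks each candidate configuration and keeps the fastest; Helion does the equivalent over a much larger space of tile sizes, loop orders, and launch parameters, per target chip:

```python
import time

def run_best(kernel, configs, *args, repeats=3):
    """Toy autotuner: time each candidate config, return the fastest.

    A real autotuner like Helion's generates the candidate kernel
    implementations itself and benchmarks them on the actual hardware.
    """
    best_cfg, best_time = None, float("inf")
    for cfg in configs:
        start = time.perf_counter()
        for _ in range(repeats):
            kernel(*args, **cfg)
        elapsed = (time.perf_counter() - start) / repeats
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, kernel(*args, **best_cfg)

# Toy "kernel": sum a list in chunks; the chunk size is the tunable knob.
def chunked_sum(xs, chunk=1024):
    return sum(sum(xs[i:i+chunk]) for i in range(0, len(xs), chunk))

cfg, total = run_best(chunked_sum, [{"chunk": 64}, {"chunk": 4096}],
                      list(range(100_000)))
```

The point is that the developer states *what* to compute; the tooling discovers *how* to compute it fast on the hardware at hand.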

This matters especially right now because the AI accelerator market is fragmenting rapidly. AWS Trainium, Google TPUs, Cerebras, and a string of startups are all producing silicon that needs kernel-level optimization to actually deliver on its spec-sheet promises. The more hardware options exist, the more valuable a portable, accessible kernel language becomes - and the more painful it is that the current tooling requires rare specialist knowledge to use.

Helion comes from Meta, which has deep expertise in exactly this kind of infrastructure. It joins the foundation alongside ExecuTorch - also from Meta - which is moving into PyTorch Core and extends PyTorch model functionality to edge and on-device environments. That both decisions are being made inside a neutral foundation rather than inside Meta's engineering org is, as I argued in the Collier piece, exactly the point.

Jana van Greunen, Director of PyTorch Engineering at Meta, described it well: Helion "brings kernel authoring into PyTorch - making it simpler, portable, and accessible to every developer," and joining the foundation "opens Helion to the broader hardware ecosystem, so developers write one kernel and it runs fast everywhere."

For enterprise teams trying to navigate a hardware landscape that looks nothing like it did two years ago, that is a meaningful promise.

Safetensors - fixing a security problem that was stubbornly overlooked

The Safetensors story is, if anything, more striking - because the problem it solves has been there for years.

When AI researchers and developers share model weights - the numerical parameters that encode everything a model has learned - they need a file format to do it in. Until recently, the dominant choice was Python's pickle format. Pickle is fast, flexible, and widely supported. It is also, from a security standpoint, a fairly alarming choice: a pickle file can contain arbitrary executable code, which runs automatically when the file is loaded. In practice, that means downloading a model from the internet and running it could execute code on your system that you never saw and never agreed to run.
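The mechanism is worth seeing once, because it is so easy to miss how little the attacker needs. This is the classic demonstration: pickle stores a recipe for reconstructing an object, and `pickle.loads` runs that recipe. Here the payload is a harmless `os.getcwd()` standing in for what a real attack would make `os.system(...)` or worse:

```python
import os
import pickle

class Payload:
    """A poisoned object. Pickle serializes whatever __reduce__ returns
    as the reconstruction recipe, and runs it at load time."""
    def __reduce__(self):
        # Arbitrary code, executed on load; harmless stand-in payload.
        return (eval, ("__import__('os').getcwd()",))

data = pickle.dumps(Payload())   # what a malicious model file contains
obj = pickle.loads(data)         # the attacker's code runs right here
print(obj)                       # a string (the cwd), not a Payload at all
```

Nothing in the file announces that this will happen; loading it is enough.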

Collier's reaction when I raised it was exactly right:

It's really become kind of an industry standard as a way to ensure that the weights that you download - the sort of model - is what it says it is. There's kind of provenance of the file and it doesn't have any sort of nefarious code hidden in there.

He also described the pickle problem, accurately, as "the number one thing to make your security people freak out." I'd argue it should have been getting attention for a while. The reason it didn't - or rather, the reason it did but nothing changed - is that there was no neutral standard to replace it with. Individual organizations could adopt alternatives, but without a community-wide standard, switching meant accepting incompatibility with the rest of the ecosystem.

That is where Safetensors comes in. Developed by Hugging Face - which hosts the vast majority of open weight models that enterprises are now pulling into production - Safetensors acts as a structured table of contents for a model's data. It separates metadata from weights, validates what it finds, and crucially prevents any executable code from being embedded in the file at all. It also delivers real performance improvements for multi-GPU and multi-node deployments, because it's designed for efficient parallel loading rather than general-purpose serialization.
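The format's safety comes from its simplicity, and that layout is easy to sketch: a safetensors file is an 8-byte little-endian header length, a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor bytes. Here is a minimal pure-Python reader and writer of that layout - a simplified illustration, not the official `safetensors` library, which handles validation, metadata, and zero-copy loading properly:

```python
import json
import struct

def write_safetensors_like(tensors):
    """Build bytes in the safetensors layout: u64 header size (LE),
    a JSON header {name: {dtype, shape, data_offsets}}, then raw data."""
    header, data, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(raw)]}
        data += raw
        offset += len(raw)
    hjson = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(hjson)) + hjson + data

def read_safetensors_like(blob):
    """Parse it back. The header is plain JSON and the payload is plain
    bytes - loading is just slicing, so nothing in the file can execute."""
    (hsize,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + hsize])
    data = blob[8 + hsize:]
    return {name: (meta["dtype"], meta["shape"],
                   data[meta["data_offsets"][0]:meta["data_offsets"][1]])
            for name, meta in header.items()}
```

Compare that with the pickle example above: there is simply no place in this file for a reconstruction recipe to hide.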

The project has already achieved significant adoption, driven largely by Hugging Face's position at the center of the open model ecosystem. But adoption without governance is fragile. Hugging Face could, in principle, change direction. The format could fork. Enterprise security teams need something more durable than "the leading model hub currently uses it." Moving Safetensors into the PyTorch Foundation gives it the neutral trademark, the transparent governance, and the institutional permanence that makes it a standard rather than just a popular choice.

Luc Georges, Co-Maintainer of Safetensors, and Lysandre Debut, Chief Open Source Officer at Hugging Face, captured the ambition: 

Safetensors joining the PyTorch Foundation is an important step towards using a safe serialization format everywhere by default. The new ecosystem and exposure the library will gain from this move will solidify its security guarantees and usability. We're still convinced we're at the very beginning of its lifecycle: the coming months will see significant growth, and we couldn't think of a better home for that next chapter than the PyTorch Foundation.

Matt White, Global CTO of AI at the Linux Foundation and CTO of the PyTorch Foundation, noted: 

Together with Helion, these contributions to the Foundation solidify the technical future for open source AI.

My take

When Collier briefed me on these two projects in Amsterdam, I found myself more excited about them than about a lot of the shinier announcements happening around KubeCon that week. Infrastructure nerdery is an acquired taste, but these are the kinds of problems that matter - not because they generate headlines, but because they are the difference between an open AI ecosystem that actually works in production and one that only works in the lab.

Helion is exciting because democratizing GPU programming is genuinely hard and genuinely important. The AI hardware market is about to get much more competitive, and the organizations that can take advantage of that competition are currently the ones with kernel engineers on staff. That is a very small club. Helion makes the club bigger.

Safetensors is exciting for a different reason - it is a reminder that the open source AI ecosystem has been moving so fast that it skipped some fairly basic security hygiene, and that fixing that requires exactly the kind of neutral coordination that a foundation exists to provide. The pickle problem was not a secret. It just needed a home.

Collier told me that new projects joining the foundation would be "things that make it more secure or more performant or easier to use." Both of these deliver on that. If you are making serious AI infrastructure decisions, both are worth understanding.

Disclosure - The CNCF covered the author's travel and accommodation expenses to attend KubeCon Europe 2026.
