World Models, Distillation Wars, and the Inevitable Convergence

Last month, Anthropic accused DeepSeek, Moonshot AI, and MiniMax of running "industrial-scale" distillation campaigns against Claude. 24,000 fraudulent accounts. 16 million exchanges. All designed to extract capabilities and feed them into Chinese models. OpenAI filed similar allegations, claiming DeepSeek used proxies to bypass geo-restrictions and harvest ChatGPT outputs.

The response from the AI community? Split right down the middle.

One side called it theft. IP violation. A national security threat. The other side pointed out the irony: Western AI companies trained their models on the entire internet's copyrighted content without asking, and now they're upset someone is training on their outputs.

Both sides have a point. Both are also missing the bigger picture.

Language models are already the wrong battlefield. The actual next step is world models: AI systems that don't just process text, but understand how physical reality works. Gravity, cause and effect, object permanence, spatial relationships. The intuitive sense that lets you know a ball will hit the ground before you drop it.

Yann LeCun, Fei-Fei Li, Google DeepMind, NVIDIA, and labs across China and the UAE are all racing to build them. LeCun has said that within three to five years, world models will be the dominant AI architecture and nobody in their right mind will still be using today's LLMs.

You can't build one from chatbot transcripts or distilled reasoning chains. It needs video, sensor data, 3D spatial information, physics simulations, and real-world interaction data from every environment on the planet. Every terrain, every climate, every physical interaction between objects, humans, and machines. The data requirements make LLM training sets look like pamphlets.

So here we are, fighting over copied chatbot outputs, while the actual destination requires a dataset so massive that no single company, and no single country, could build it alone.

Nobody in this conversation seems to want to say that part.

If the end state of AI is a system that truly understands reality, that can simulate every corner of the physical world, then the idea that one nation's language model outputs are sacred territory starts to look small. Not because IP doesn't matter today. It does. But the scale of what's coming makes today's distillation wars look like neighbours arguing over a fence line while a city is being built around them.

A world model that works needs inputs from every language, every climate, every physical system on earth. No single lab will ever have that on its own. Which means the frame of "stolen outputs" gets smaller the further out you look.

We're nowhere near that. We probably won't get there in our lifetime. But every headline about "stolen" training data is really a preview of a much bigger question nobody is ready to answer:

What happens when the AI that understands reality doesn't belong to anyone?