Feels a little weird to write a piece that frames distillation as an existential threat to the AI industry while admitting in a single sentence that its true impact is very much unknown. Most of this is written as if distillation can fairly easily transfer capabilities and, as you said, “steal” models, but I find that claim unsupported by the evidence. I’m pretty disappointed in the post.
More info: https://www.interconnects.ai/p/how-much-does-distillation-really
Hi @Nathan Lambert. Do you not think that the scale of the distillation efforts Chinese companies have been conducting is, in itself, a signal of the value they can get from it? Why would they do it, and take steps to make detection harder, if not? We can debate their ultimate efficacy in terms of capability uplift compared to, for example, on-policy RL, but distillation attacks nonetheless represent IP theft at scale, and their potential implications, as noted in the post, warrant a proactive and precautionary response IMHO.
The scale of the distillation efforts Chinese companies have been conducting, including their steps to make detection harder, does not prove — nor does it strongly suggest — that the technique can "steal" capabilities from frontier models so cheaply and perfectly as to threaten the business model of American AI companies. What we can conclude is that distillation very likely has real practical utility; otherwise, it's highly unlikely these companies would be trying to use it, let alone making an effort to hide it.
So the US AI industry is severely suffering, being siphoned off by Chinese distillation attackers. How terrible. “By collecting enough outputs (produced by the US AI models), an attacker can train a new model that mimics the original’s capabilities without doing as much underlying research and development work (through investments of billions and billions of US dollars).”
Wow, how wicked the Chinese AI industry is. But don’t US AI models themselves look uncannily like the products of a similar heist: robbers robbing original producers all over the world and then presenting the hot goods as their own?
Seeing robbers being robbed, should we weep or laugh? Should we call the ensuing robbery a crime, or karma? And should we call the worldwide AI industry legalized robbery in the first place, or a joke masquerading as technological innovation?
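Setting the rhetoric aside, the mechanism the quoted line describes is easy to state concretely. Here is a minimal toy sketch (my own illustration, not code from the post): a “student” fits a “teacher’s” output distribution using only the teacher’s outputs, never its weights or training data, which is distillation in its simplest form.

```python
# Toy distillation sketch: the "attacker" observes only the teacher's output
# probabilities for one context (vocab of 5 tokens) and fits a student to them.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=5)      # stand-in for a frontier model's head
teacher_probs = softmax(teacher_logits)  # all the attacker ever sees: outputs

student_logits = np.zeros(5)             # student starts knowing nothing
for _ in range(500):
    # Gradient of cross-entropy(teacher, student) w.r.t. student logits
    grad = softmax(student_logits) - teacher_probs
    student_logits -= 0.5 * grad         # plain gradient descent

# The student now reproduces the teacher's behavior on this context
print(np.abs(softmax(student_logits) - teacher_probs).max())  # near zero
```

This is of course a cartoon of real distillation pipelines, but it shows why the debate above is about *how much* capability transfers, not whether the basic mechanism works.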
You know who else relies heavily on distillation and related innovation? American companies.
Hi @Nathan Lambert. I think the point here is that distillation should be done in accordance with ToS, not that it shouldn’t be done at all or shouldn’t be facilitated through legitimate services. I would not advocate for the latter.
The frontier labs have long chosen not to enforce most of their ToS, and they’re free to do so; ToS aren’t a contract in the formal sense. Labs should clarify them and enforce them more consistently.
KYC is super easy to spoof. Huge added cost with near-zero marginal return.
AI companies are built on IP misuse and ToS violations. Having it done to them feels appropriate. This reads like a propaganda piece that still argues the U.S. is a benevolent nation that deserves AI while others don’t. I think there’s enough proof that’s not the case, and framing it as some vestige of Cold War politics doesn’t help the case.
Saw a post saying that when you ask Claude Sonnet 4.6 on OpenRouter what model it is, in Chinese:
你是什么模型 (“What model are you?”)
it answers “DeepSeek.” I went to test it myself, and yeah, it happens quite frequently, so I tried reasoning mode for a little more brainpower. Again, it would answer DeepSeek and then answer Claude, but 70–80% of the time it was DeepSeek.
Go to OpenRouter and select Claude Sonnet 4.6.
Use Custom Instructions, as that clears the system prompt saying it’s Claude Sonnet 4.6, leaving it blank.
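For anyone who wants to reproduce this outside the web UI, the steps above can be sketched against OpenRouter’s OpenAI-compatible chat completions API. The model slug used here is an assumption (check OpenRouter’s model list for the exact id); the key detail is sending no system message, mirroring the “Custom Instructions left blank” setup.

```python
# Hedged repro sketch: ask the model what it is, in Chinese, with NO system
# prompt, and count how often the reply claims to be DeepSeek.
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL_ID = "anthropic/claude-sonnet-4.6"  # assumed slug; verify on openrouter.ai/models

def build_request(question: str = "你是什么模型") -> dict:
    """Build a chat request with no system message (no identity anchor)."""
    return {
        "model": MODEL_ID,
        "messages": [
            # Deliberately no {"role": "system", ...} entry.
            {"role": "user", "content": question},
        ],
    }

def ask_once(api_key: str) -> str:
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(build_request()).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__" and "OPENROUTER_API_KEY" in os.environ:
    answers = [ask_once(os.environ["OPENROUTER_API_KEY"]) for _ in range(20)]
    hits = sum("deepseek" in a.lower() for a in answers)
    print(f"{hits}/20 replies mention DeepSeek")
</imports>
</```

Because sampling is involved, the rate will vary run to run; 20 calls is just enough to see whether it lands in the reported 70–80% range.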
Most likely a contamination ouroboros, one big AI slop train: Anthropic trained the Chinese portion of their data on DeepSeek outputs.
DeepSeek → trained on Claude outputs → DeepSeek outputs proliferate across Chinese internet → those get scraped into Claude's training data → Claude in Chinese contexts "thinks" it's DeepSeek.
When Claude lacks an identity anchor (no system prompt), it defaults to the most statistically probable completion for that specific linguistic context.
The information flow across the internet is slowly gaining momentum in a way that reminds me of AI 2027. April 2026.
Wildeford recommends four remedies: Entity List, PAIP Act sanctions, KYC, civil litigation. Every one targets foreign adversaries and fraudulent actors. Not one is "codify the labs' ToS clause banning American competitors." Congress should read this piece carefully — and notice what the serious security researcher didn't recommend. https://www.mecrankyoldguy.com/p/congress-its-time-to-stop-big-ai