OpenAI to confidentially file for IPO as soon as Friday
(www.cnbc.com)
Their models may also be based on US models.
It'll be hard for derivative models to innovate if their host organism has died.
Distillation isn't stealing the original model, though. It just uses the model's outputs as synthetic training data to train their own thing. The model itself is never copied.
Plus, a lot of companies do it. Anthropic's Claude was calling itself DeepSeek for a while.
It also doesn't seem like as big a deal as Anthropic and OpenAI make it look, IMO. Treating it like a national security issue, as if the models were being stolen from under their noses, just comes across like a media company claiming that every download is a copy it would otherwise have sold at full price, and that it has therefore accrued trillions of dollars in damages.
I could, in theory, take a bunch of Google Gemini outputs and train a GPT-2 model on them. That doesn't mean that I've recreated Gemini, nor does it mean that I've stolen it from Google.
To top it all off, it's not like their services were abused. The companies were presumably paid appropriately for the usage.
I don't understand how anyone can keep a straight face when they hear an AI company crying about another AI company "stealing" from them, while they go before lawmakers and argue that if they weren't allowed to steal stuff, AI wouldn't exist... I immediately picture the Always Sunny meme: "Oh, did someone get addicted to crack?" (crying motions)