Honestly, an AI firm being salty that someone has potentially taken their work, "distilled" it, and is selling it on feels hilariously hypocritical.

Not like they've taken the writings, pictures, edits and videos of others, "distilled" them and created something new from it.

[-] brucethemoose@lemmy.world 73 points 5 months ago* (last edited 5 months ago)

This is a lie.

Some background:

  • LLMs don't output words, they output lists of word probabilities. Technically they output tokens, but "words" are a good enough analogy.

  • So for instance, if "My favorite color is" is the input to the LLM, the output could be 30% "blue.", 20% "red.", 10% "yellow.", and so on, for many different possible words. The actual word that's used and shown to the user is selected through a process called sampling, but that's not important now.

  • This spread can be quite diverse, something like:

  • A "distillation," as the term is used in LLM land, means running tons of input data through existing LLMs, writing the logit outputs, aka the word probabilities, to disk, and then training the target LLM on that distribution instead of single words. This is extremely efficient because running LLMs is much faster than training them, and you "capture" much more of the LLM's "intelligence" with its logit output than with single words. Just look at the above graph: in one training pass, you get dozens of mostly valid candidate words trained into the model instead of one. It also shrinks the size of the dataset you need, meaning it can be of higher quality.

  • Because OpenAI are jerks, they stopped offering logit outputs a while ago.

  • In other words, this is a blatant lie! OpenAI does not offer logprobs, so creating distillations from their models is literally impossible.

  • OpenAI contributes basically zero research to the open LLM space, so there's very little to copy as well. Some do train on the plain text output of OpenAI models, but this only gets you so far.
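
To make the bullets above concrete, here is a toy sketch in plain Python. It is not how any real lab trains: the "teacher" is just the made-up color distribution from the example (with everything else lumped into a hypothetical `<other>` bucket), the "student" is a single table of logits, and names like `distill_step` are my own. It only shows the difference between sampling one word and training on the whole probability spread.

```python
import math
import random


def softmax(logits):
    """Turn a dict of raw logits into a dict of probabilities."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample(probs, rng):
    """Pick one token according to its probability (what the user sees)."""
    tokens = list(probs)
    weights = [probs[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def distill_step(student_logits, teacher_probs, lr):
    """One gradient step on cross-entropy against the teacher's soft targets.

    For a softmax output, the gradient with respect to each logit is
    simply (student probability - teacher probability).
    """
    student_probs = softmax(student_logits)
    return {
        tok: student_logits[tok] - lr * (student_probs[tok] - teacher_probs[tok])
        for tok in student_logits
    }


# Assumed teacher output for the prompt "My favorite color is".
# "<other>" is a stand-in for the rest of the vocabulary.
teacher_probs = {"blue.": 0.30, "red.": 0.20, "yellow.": 0.10, "<other>": 0.40}

# Sampling collapses the whole spread into a single word.
rng = random.Random(0)
sampled_word = sample(teacher_probs, rng)

# Distillation trains on the full spread instead.
student_logits = {tok: 0.0 for tok in teacher_probs}  # untrained student
for _ in range(500):
    student_logits = distill_step(student_logits, teacher_probs, lr=0.5)
student_probs = softmax(student_logits)

print("sampled word:", sampled_word)
for tok in teacher_probs:
    print(f"{tok:10s} teacher={teacher_probs[tok]:.3f} student={student_probs[tok]:.3f}")
```

The student ends up matching the teacher's whole distribution from a single prompt. If you only had the sampled word, you would need many samples of the same prompt to recover the same information, which is the efficiency argument above.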


There are a lot of implications. But basically, a bunch of open models from different teams are stronger than a single closed one because they can all theoretically be "distilled" into each other. Hence DeepSeek actually built on top of the work of Qwen 2.5 (from Alibaba, not them) to produce the smaller DeepSeek R1 models, and this is far from the first effective distillation. Arcee 14B used distilled logits from Mistral, Meta (Llama) and I think Qwen to produce a state-of-the-art 14B model very recently. It didn't make headlines, but it was almost as eyebrow-raising to me.

[-] Sekoia 6 points 5 months ago

Wait, so OpenAI's whole kerfuffle here is based on nothing directly stated (e.g. in the paper like I thought), and worse, almost certainly completely unfounded?

Wow, just when I thought they couldn't get more ridiculous...

[-] brucethemoose@lemmy.world 14 points 5 months ago* (last edited 5 months ago)

Almost all of OpenAI's statements are unfounded. Just watch how the research community reacts whenever Altman opens his mouth.

TSMC allegedly calling him a "podcast bro" is the most accurate descriptor I've seen: https://www.nytimes.com/2024/09/25/business/openai-plan-electricity.html

this post was submitted on 29 Jan 2025
189 points (100.0% liked)

Leopards Ate My Face
