“AI’s Ostensible Emergent Abilities Are a Mirage” paper won the Outstanding Paper Award at NeurIPS 2023 (nitter.net)

submitted 11 months ago* (last edited 11 months ago) by ylai@lemmy.ml to c/artificial_intel@lemmy.ml


Previous Lemmy.ml post: https://lemmy.ml/post/1015476
Original X post (at Nitter): https://nitter.net/xwang_lk/status/1734356472606130646


[-] keepthepace@slrpnk.net 5 points 11 months ago

That's a weird definition. Is it widely used? To me, emergence means acquiring capabilities that were not specifically trained for. I don't see why it matters whether they appear suddenly or linearly. I guess that's an argument in safety discussions?

[-] jacksilver@lemmy.world 4 points 11 months ago

That definition is based on how the paper approached it, and it seems to be a generally accepted one. I've only read a bit of the paper, but it seems to highlight that the way we've been evaluating LLMs explains more of their apparent emergent capabilities than any actual change in the models does.

[-] huginn@feddit.it 3 points 11 months ago

Not only that, but it's the definition used by every single researcher claiming "emergent behavior".

[-] keepthepace@slrpnk.net 1 points 11 months ago

Ok thanks.

[-] kaffiene@lemmy.world 2 points 11 months ago* (last edited 11 months ago)

That was my feeling reading the paper. I feel that LLMs are overhyped, but linear vs. superlinear growth in metrics is a separate question and can't be a refutation of what has traditionally been thought of as emergent properties. In other words, this is refutation by redefinition.

[-] Mahlzeit@feddit.de 2 points 11 months ago

It's not the definition in the paper. Here is the context:

The idea of emergence was popularized by Nobel Prize-winning physicist P.W. Anderson’s “More Is Different”, which argues that as the complexity of a system increases, new properties may materialize that cannot be predicted even from a precise quantitative understanding of the system’s microscopic details.

What this means is that we cannot, for example, predict chemistry from physics. Physics studies how atoms interact, which yields important insights for chemistry, but physics cannot be used to predict, say, the periodic table of elements. Each level has its own laws, which must be derived empirically.

LLMs obviously show emergence. Knowing the mathematical, technological, and algorithmic foundations tells you little about how to use (prompt, train, ...) an AI model, just as knowing cell biology will not help you interact with people, even if they are only colonies of cells working together.

The paper talks specifically about “emergent abilities of LLMs”:

The term “emergent abilities of LLMs” was recently and crisply defined as “abilities that are not present in smaller-scale models but are present in large-scale models; thus they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models”

The authors further clarify:

In this paper, [...] we specifically mean sharp and unpredictable changes in model outputs as a function of model scale on specific tasks.

Bigger models perform better. An increase in the number of parameters correlates with an increase in performance on tests. It had been alleged that some abilities appear suddenly, for no apparent reason. These “emergent abilities of LLMs” are a very specific kind of emergence.
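To make "sharp and unpredictable changes" concrete, here is a toy sketch of how the choice of metric alone can make a smooth improvement look sudden. Everything in it is an assumption for illustration: the logistic curve in `per_token_accuracy`, its midpoint at 10^9 parameters, the 10-token answer length, and the independence of per-token errors are all made up, and none of the numbers come from the paper.

```python
import math


def per_token_accuracy(params: float) -> float:
    """Hypothetical smooth scaling curve (illustrative, not measured data).

    Per-token accuracy rises gradually with log10(parameter count),
    crossing 0.5 at 1e9 parameters.
    """
    x = math.log10(params)
    return 1.0 / (1.0 + math.exp(-(x - 9.0)))


def exact_match(params: float, length: int) -> float:
    """Chance that all `length` answer tokens are correct.

    Assumes per-token errors are independent, so the all-or-nothing
    score is simply the per-token accuracy raised to the answer length.
    """
    return per_token_accuracy(params) ** length


# Model sizes from 1e7 to 1e12 parameters (hypothetical).
scales = [10 ** k for k in range(7, 13)]

for n in scales:
    p = per_token_accuracy(n)
    em = exact_match(n, 10)
    print(f"{n:.0e}  per-token={p:.3f}  exact-match(10 tokens)={em:.5f}")
```

In this toy setup the per-token column climbs steadily, while the exact-match column sits near zero for the small models and then shoots up, even though both columns describe the same underlying models. Whether real LLM benchmarks behave this way is exactly what the thread is arguing about.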

this post was submitted on 12 Dec 2023