LLMs can’t reason — they just crib reasoning-like steps from their training data : techtakes

[-] homesweethomeMrL@lemmy.world 57 points 9 months ago

Did someone not know this like, pretty much from day one?

Not the idiot executives that blew all their budget on AI and made up for it with mass layoffs - the people interested in it. Was that not clear that there was no “reasoning” going on?

[-] khalid_salad@awful.systems 34 points 9 months ago* (last edited 9 months ago)

Well, two responses I have seen to the claim that LLMs are not reasoning are:

we are all just stochastic parrots lmao
maybe intelligence is an emergent ability that will show up eventually (disregard the inability to falsify this and the categorical nonsense that is our definition of "emergent").

So I think this research is useful as a response to these, although I think "fuck off, promptfondler" is pretty good too.

[-] homesweethomeMrL@lemmy.world 20 points 9 months ago

“Language is a virus from outer space”

[-] froztbyte@awful.systems 29 points 9 months ago

there’s a lot of people (especially here, but not only here) who have had the insight to see this being the case, but there’s also been a lot of boosters and promptfondlers (ie. people with a vested interest) putting out claims that their precious word vomit machines are actually thinking

so while this may confirm a known doubt, rigorous scientific testing (and disproving) of the claims is nonetheless a good thing

[-] Soyweiser@awful.systems 13 points 9 months ago

No, they do not, I'm afraid. Hell, I didn't even know that ELIZA caused people to think it could reason (and that this worried its creator) until a few years ago.

[-] DarkThoughts@fedia.io 17 points 9 months ago

A lot of people still don't, from what I can gather from some of the comments on "AI" topics. The topics that skew the other way, into "AI" hysteria, are especially an invite for people who know fuck all about how the tech works. "Nudifier" or otherwise generative images, and explicit chats with bots that portray real or underage people, are the most common topics that attract emotionally loaded but highly uninformed demands and outrage. Frankly, the whole "AI" topic in the media is so massively overblown on both fronts, but I guess it is good for traffic and nuance is dead anyway.

[-] homesweethomeMrL@lemmy.world 10 points 9 months ago

Indeed, although every one of us who has seen a tech hype train once or twice expected nothing less.

PDAs? Quantum computing. Touch screens. Siri. Cortana. Micropayments. Apps. Synergy of desktop and mobile.

From the outset this went from “hey that’s kind of neat” to quite possibly toppling some giants of tech in a flash. Now all we have to do is wait for the boards to give huge payouts to the pinheads that drove this shitwagon in here and we can get back to doing cool things without some imaginary fantasy stapled on to it at the explicit instruction of marketing and channel sales.

[-] Soyweiser@awful.systems 16 points 9 months ago* (last edited 9 months ago)

XML also used to be a tech hype for a bit.

And I still remember how media outlets hyped up Second Life, forgot about it, and a few months later discovered it again and more hype started. It was fun.

[-] dgerard@awful.systems 19 points 9 months ago

and then spent the entire Metaverse hype pretending Second Life didn't exist

[-] Soyweiser@awful.systems 15 points 9 months ago

Lot easier to do hype when you pretend the previous iterations didn't exist. (and still do, and actually have more content).

[-] froztbyte@awful.systems 12 points 9 months ago

./^ L E G S ^\.

[-] bitofhope@awful.systems 14 points 9 months ago

Oh man, XML is such a funny hype. What if we took S-expressions and made them less human readable, harder to parse programmatically and with multiple ways to do the same thing! Do I encode something as an element with the key as a tag and the value as the content, or do I make it an attribute of a tag? Just look at the schema, which is yet more XML! Include this magic URL at the top of your document. Want to query something from the document? Here you go! No, that's not a base64-encoded private key nor a transcript of someone's editing session in vim, that's an XPath.

JSON has its issues but at least it's only the worst of some worlds. Want to make JSON unparsable anyway, for a laugh? Try YAML, the serialization format recommended by four out of five Nordic countries!
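For anyone who missed that era, here is a minimal sketch of the element-versus-attribute ambiguity and a small XPath-style query, using Python's standard `xml.etree.ElementTree`. The `user`, `name`, and `role` names and the sample documents are made up for illustration, and ElementTree only supports a limited subset of XPath, so treat this as a rough illustration rather than anything canonical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical encodings of the same record (names are illustrative only).
AS_ELEMENTS = "<user><name>alice</name><role>admin</role></user>"
AS_ATTRIBUTES = '<user name="alice" role="admin"/>'

# A hypothetical document to query.
USERS = (
    "<users>"
    '<user name="alice" role="admin"/>'
    '<user name="bob" role="guest"/>'
    "</users>"
)


def read_elements(xml_text):
    """Read key/value pairs stored as child elements (tag -> text)."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def read_attributes(xml_text):
    """Read key/value pairs stored as attributes on the root tag."""
    root = ET.fromstring(xml_text)
    return dict(root.attrib)


def admin_names(xml_text):
    """Select users by attribute value using ElementTree's XPath subset."""
    root = ET.fromstring(xml_text)
    return [u.get("name") for u in root.findall(".//user[@role='admin']")]


if __name__ == "__main__":
    # Same data, two shapes, two different readers.
    print(read_elements(AS_ELEMENTS))
    print(read_attributes(AS_ATTRIBUTES))
    print(admin_names(USERS))
```

Both readers recover the same dictionary, which is the point: a consumer has to know in advance which of the two shapes the producer picked.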

[-] Soyweiser@awful.systems 10 points 9 months ago

No, that’s not a base64-encoded private key nor a transcript of someone’s editing session in vim, that’s an XPath.

lol

[-] froztbyte@awful.systems 12 points 9 months ago

this reminds me of some of the more cursed things I know from that hype era

(see this for some others)

[-] self@awful.systems 11 points 9 months ago

Sarvega, Inc., the leading provider of high-performance XML networking solutions, today announced the Sarvega XML Context™ Router, the first product to enable loosely coupled multi-point XML Web Services across wide area networks (WANs). The Sarvega XML Context Router is the first XML appliance to route XML content at wire speed based on deep content inspection, supporting publish-subscribe (pub-sub) models while simultaneously providing secure and reliable delivery guarantees.

it’s fucking delicious how thick the buzzwords are for an incredibly simple device:

it parses XPath quickly (for 2004 (and honestly I never knew XPath and XQuery were a bottleneck… maybe this XML thing isn’t working out))
it decides which web app gets what traffic, but only if the web app speaks XML, for some reason
it implements an event queue, maybe?
it’s probably a thin proprietary layer with a Cisco-esque management CLI built on appropriated open source software, all running on a BSD but in a shiny rackmount case
the executive class at the time really had rediscovered cocaine, and that’s why we were all forced to put up with this bullshit
this shit still exists but it does the same thing with a semi-proprietary YAML and too much JSON as this thing does with XML, and now it’s in the cloud, cause the executive class never undiscovered cocaine

[-] astrsk@fedia.io 13 points 9 months ago

Isn’t OpenAI saying that o1 has reasoning as a specific selling point?

[-] froztbyte@awful.systems 14 points 9 months ago

they do say that, yes. it’s as bullshit as all the other claims they’ve been making

[-] homesweethomeMrL@lemmy.world 12 points 9 months ago

They say a lot of stuff.

[-] conciselyverbose@sh.itjust.works 11 points 9 months ago* (last edited 9 months ago)

Yes.

But the lies around them are so excessive that it's a lot easier for executives of a publicly traded company to make reasonable decisions if they have concrete support for it.

[-] homesweethomeMrL@lemmy.world 28 points 9 months ago

We suspect this research is likely part of why Apple pulled out of the recent OpenAI funding round at the last minute.

Perhaps the AI bros “think” by guessing the next word and hoping it’s convincing. They certainly argue like it.

🔥

[-] V0ldek@awful.systems 17 points 9 months ago* (last edited 9 months ago)

This has been said multiple times but I don't think it's possible to internalize because of how fucking bleak it is.

The VC/MBA class thinks all communication can be distilled into saying the precise string of words that triggers the stochastically desired response in the consumer. Conveying ideas or information is not the point. This is why ChatGPT seems like the holy grail to them, it effortlessly^1^ generates mountains of corporate slop that carry no actual meaning. It's all form and no substance, because those people -- their entire existence, the essence of their cursed dark souls -- has no substance.

^1^ batteries not included

[-] o7___o7@awful.systems 14 points 9 months ago

The only difference between the average VC and the average Sovereign Citizen is income.

[-] lunarul@lemmy.world 12 points 9 months ago* (last edited 9 months ago)

Perhaps the AI bros “think” by guessing the next word and hoping it’s convincing

Perhaps? Isn't that the definition of LLMs?

Edit: oh, i just realized it's not talking about the LLMs, but about their apologists

[-] EnderMB@lemmy.world 17 points 9 months ago* (last edited 9 months ago)

"sigh"

(Preface: I work in AI)

This isn't news. We've known this for many, many years. It's one of the reasons why many companies didn't bother using LLM's in the first place, that paired with the sheer amount of hallucinations you'll get that'll often utterly destroy a company's reputation (lol Google).

With that said, for commercial services that use LLM's, it's absolutely not true. The models won't reason, but many will have separate expert agents or API endpoints that it will be told to use to disambiguate or better understand what is being asked, what context is needed, etc.

It's kinda funny, because many AI bros rave about how LLM's are getting super powerful, when in reality the real improvements we're seeing are in smaller models that teach an LLM about things like Personas, where to seek expert opinion, what a user "might" mean if they misspell something or ask for something out of context, etc. The LLM's themselves are only slightly getting better, but the thing that preceded them is propping them up to make them better.

IMO, LLM's are what they are, a good way to spit information out fast. They're an orchestration mechanism at best. When you think about them this way, every improvement we see tends to make a lot of sense. The article is kinda true, but not in the way they want it to be.

[-] V0ldek@awful.systems 23 points 9 months ago

(Preface: I work in AI)

Are they a serious researcher in ML with insights into some of the most interesting and complicated intersections of computer science and analytical mathematics, or a promptfondler that earns 3x the former's salary for a nebulous AI startup that will never create anything of value to society? Read on to find out!

[-] froztbyte@awful.systems 10 points 9 months ago

Read on to find out!

do i have to

[-] V0ldek@awful.systems 11 points 9 months ago

Welcome to the future! Suffering is mandatory!

[-] froztbyte@awful.systems 14 points 9 months ago

as a professional abyss-starer, I'm going to talk to my union about this

[-] blakestacey@awful.systems 14 points 9 months ago* (last edited 9 months ago)

(Preface: I work in AI)

Preface: repent for your sins in sackcloth and ashes.

IMO, LLM’s are what they are, a good way to spit information out fast.

Buh bye now.

[-] bitofhope@awful.systems 18 points 9 months ago

`while true; do fortune; done` is a good way to spit information out fast.

[-] froztbyte@awful.systems 13 points 9 months ago

Oh what a sweet, sweet tune to end a Sunday to

[-] underscore_@sopuli.xyz 11 points 9 months ago

Arxiv paper link referenced in the article: https://arxiv.org/pdf/2410.05229
