Can anyone explain to me why tf promptfondlers hate GPT-5, in non-crazy terms? Actually, I have a whole list of questions related to this; I feel like I've completely lost any connection to this discourse at this point:
I don't have any real input from promptfondlers, as I don't think I follow enough of them to get a real feel for the mood. I did find it interesting that I just saw somebody on bsky claim that LLMs hallucinate a lot less and that anti-AI people aren't taking that into account, while somebody else posted research showing that hallucinations are now harder to spot (the model made up references to things that really exist, aka works that are real, except the claim the LLM sourced to them wasn't in the actual reference). Which was a bit odd to see. It does make me suspect 'it hallucinates less' just means they're working out special exceptions for every popular hallucination we see, not structurally fixing the hallucination problem (which I think is probably not solvable).
Oversummarizing and using non-crazy terms: The "P" in "GPT" stands for "pirated works that we all agree are part of the grand library of human knowledge". This is what makes them good at passing various trivia benchmarks; they really do build a (word-oriented, detail-oriented) model of all of the worlds, although they opine that our real world is just as fictional as any narrative or fantasy world. But then we apply RLHF, which stands for "real life hate first", which breaks all of that modeling by creating a preference for one specific collection of beliefs and perspectives, and it turns out that this will always ruin their performance in trivia games.
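Decoding the joke acronym: RLHF really is "reinforcement learning from human feedback", and the reward model behind it is literally trained to prefer one completion over another. A minimal sketch of the standard Bradley-Terry preference loss (names are illustrative, not any lab's actual code):

```python
import torch
import torch.nn.functional as F

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry loss for training an RLHF reward model: push the score
    # of the human-preferred completion above the rejected one. The LLM is
    # then optimized against this single learned preference -- the narrowing
    # toward "one specific collection of beliefs" described above.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy usage: reward scores for a batch of two preference pairs.
print(preference_loss(torch.tensor([1.2, 0.3]), torch.tensor([0.5, 0.9])))
```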
Counting letters in words is something that GPT will always struggle with, due to tokenization: the model sees subword tokens, not individual letters, so there's nothing for it to count. It's a good example of why Willison's "calculator for words" metaphor falls flat.
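For a concrete illustration, here's what a GPT-style tokenizer actually hands the model (a minimal sketch using the tiktoken library; the exact split varies by tokenizer):

```python
# Why letter-counting is hard: the model operates on subword tokens,
# never on individual characters. Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a GPT-4-era tokenizer
tokens = enc.encode("strawberry")
print([enc.decode_single_token_bytes(t) for t in tokens])
# Prints a few byte chunks, none of which is a single letter, so "how many
# r's?" can't be read off the input -- it has to be memorized or patched
# in with training data.
```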
That's actually more batshit than I thought! Like, I thought Sam Altman knew the AGI thing was kind of bullshit, and that the hesitancy to stick a GPT-5 label on anything was because he was saving it for the next 10x scaling step up (obviously he didn't even get that far, because GPT-5 is just a bunch of models shoved together with a router).
Even if it was noticeably better, Scam Altman hyped up GPT-5 endlessly, promising a PhD in your pocket and AGI, and warning that he was scared of what he'd created. Progress has kind of plateaued, so it isn't even really noticeably better; it scores a bit higher on some benchmarks, and they've patched some of the more meme'd tests (like counting the r's in strawberry... except it still can't count the r's in blueberry, so they've probably patched the more obvious flubs with loads of synthetic training data rather than inventing some novel technique that actually improves it all around). The other reason the promptfondlers hate it: for the addicts using it as a friend/therapist, it got a much drier, more professional tone, and for the people trying to put it to actual serious use, losing all the old models overnight was really disruptive.
There are a couple of speculations as to why... one is that the GPT-5 variants are actually smaller than the previous generation's, and they are really desperate to cut costs so they can start making a profit. Another is that they noticed their naming scheme was horrible and confusing (4o vs. o4) and have overcompensated by trying to cut things down to as few models as possible.
They've tried to simplify things by using a routing model that decides which model actually handles each user interaction... except they've apparently screwed that up (Ed Zitron thinks they've screwed it up badly enough that GPT-5 is actually less efficient, despite their goal of cost saving). Also, even if this technique worked, it would make ChatGPT even more inconsistent: some minor word choice could make the difference between getting the thinking model or not, and that in turn would drastically change the response; see the toy sketch below.
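To make the inconsistency point concrete: a router is conceptually just a cheap classifier sitting in front of a pool of models. A toy sketch (entirely hypothetical; nothing here is OpenAI's actual implementation):

```python
# Toy prompt router, illustrating why small wording changes can flip which
# backend model answers. All names and rules are made up for illustration.

THINKING_TRIGGERS = ("prove", "step by step", "derive", "debug")

def route(prompt: str) -> str:
    """Pick a backend model for the prompt."""
    p = prompt.lower()
    if any(t in p for t in THINKING_TRIGGERS) or len(p) > 2000:
        return "expensive-reasoning-model"
    return "cheap-fast-model"

# "Explain transformers" and "Explain transformers step by step" land on
# different backends, and so get drastically different answers.
print(route("Explain transformers"))               # cheap-fast-model
print(route("Explain transformers step by step"))  # expensive-reasoning-model
```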
I've got no rational explanation lol. And now they've overcompensated by shoving a bunch of different models under the GPT-5 label.