[-] diz@awful.systems 2 points 2 days ago* (last edited 2 days ago)

Yolo charging mode on a phone: disable the battery overheating sensor and the current limiter.

I suspect that they added yolo mode because without it this thing is too useless.

[-] diz@awful.systems 19 points 1 week ago* (last edited 1 week ago)

Further support for the memorization claim: I posted examples of novel river crossing puzzles where LLMs completely fail (on this forum).

Note that Apple’s actors / agents river crossing is a well known “jealous husbands” variant, which you can ask a chatbot to explain to you. It gladly explains, even as it can’t follow its own explanation (since of course it isn’t its own explanation but a plagiarized one, even if it changes the words).

edit: https://awful.systems/post/4027490 and earlier https://awful.systems/post/1769506

I think what I need to do is write up a bunch of puzzles, assign them randomly to 2 sets, and test & post one set while holding back the second set (not even testing it on any online chatbots). Then in a year or two, see how much the set that's public improves vs the one that's held back.
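
Roughly what I have in mind for the split, as a throwaway sketch (the puzzle names are placeholders, not real files):

```python
import random

# Hypothetical puzzle identifiers, just for illustration; the real ones would
# be whatever I end up writing.
puzzles = [f"puzzle_{i:02d}" for i in range(20)]

rng = random.Random(42)  # fixed seed so the split is reproducible
rng.shuffle(puzzles)

public_set = puzzles[:len(puzzles) // 2]     # test & post these now
held_back_set = puzzles[len(puzzles) // 2:]  # never paste these into any online chatbot

print("public:", sorted(public_set))
print("held back:", sorted(held_back_set))
```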

[-] diz@awful.systems 15 points 3 weeks ago* (last edited 3 weeks ago)

I was trying out the free GitHub Copilot to see what the buzz is all about:

It doesn't even know its own settings. The one little useful thing that isn't plagiarism, providing a natural language interface to its own bloody settings, and it couldn't even do that.

[-] diz@awful.systems 17 points 3 weeks ago* (last edited 3 weeks ago)

All joking aside, there is something thoroughly fucked up about this.

What's fucked up is that we let these rich fucks threaten us with extinction to boost their stock prices.

Imagine if some cold fusion scammer were permitted to gleefully boast that his experimental cold fusion plant in the middle of a major city could blow it up: setting up little hydrogen explosions, adding a neutron source just to make it spicier, etc.

[-] diz@awful.systems 35 points 1 month ago* (last edited 1 month ago)

Actually, having read it carefully, it's interesting that they don't actually claim it was hacked; they claim that the modification was unauthorized. They also don't claim that they revoked the access of that mysterious "employee" who modified it. I'm thinking they had some legal reason to word it so that they technically aren't lying.

[-] diz@awful.systems 30 points 11 months ago

AI peddlers just love any "critique" that presumes the AI is great at something.

Safety concern that LLMs would go Skynet? Say no more, I hear you, and I'll bring it up first thing in Congress.

Safety concern that terrorists might use it to make bombs? Say no more! I agree that the AI is so great for making bombs! We'll restrict it to keep people safe!

It sounds too horny, you say? Yeah, good point, I love it. Our technology is better than sex itself! We'll keep it SFW to keep mankind from going extinct due to robosexuality!

[-] diz@awful.systems 17 points 11 months ago

Yeah I think that's why we need an Absolute Imbecile Level Reasoning Benchmark.

Here's what the typical PR from AI hucksters looks like:

https://www.anthropic.com/news/claude-3-family

Fully half of their performance claims are about "reasoning", with names like "Graduate Level Reasoning". OpenAI is even worse; recall them claiming to have scored in the 90th percentile on the LSAT?

On top of that, LLMs are fine-tuned to convince some dumbass CEO who "checks it out". Even though you pay for the subscription, you're neither the customer nor the product; you're just collateral eyeballs on the ad.

[-] diz@awful.systems 23 points 11 months ago

Both parties are buying into a premise we already know to be incorrect.

We may know it is incorrect, but LLM salesmen are claiming things like "90th percentile on the LSAT", high scores on a "college level reasoning benchmark", and so on and so forth.

They are claiming "yeah, yeah, there are all the anecdotal reports of glue pizza, but objectively our AI is more capable than your workers, so you can replace them with our AI", and this is starting to actually impact the job market.

[-] diz@awful.systems 24 points 11 months ago

The other thing to add is that there are just one or two people on the train providing service for hundreds of other people, or millions of dollars' worth of goods. Automating those people away is simply not economical, not even in terms of the headcount replaced vs the headcount that has to be hired to maintain the automation software and hardware.

Unless you're a techbro who deeply resents labor, someone who would rather hire 10 software engineers than 1 train driver.

[-] diz@awful.systems 23 points 11 months ago* (last edited 11 months ago)

Also, my thought on this is that since an LLM has no internal state with which to represent the state of the problem, it can't ever actually solve any variation of the river crossing, not even the ones it "solves" correctly.

If it outputs the correct sequence, the model of the problem inside your head ends up in the solved state, but on the LLM's side there's just the sequence of steps it wrote down, with those steps inhibiting the production of another "Trip" token until that crosses some threshold. There isn't an inventory or even a count of items; there's just an unrelated number that weights for or against "Trip".
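
For contrast, here's a minimal sketch of what "having a state of the problem" looks like in an ordinary solver (classic wolf/goat/cabbage variant as a stand-in, and the code is my own illustration, not anything an LLM does internally): the inventory of each bank is an explicit data structure, and every candidate trip is checked against it.

```python
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]  # pairs that can't be left unsupervised

def safe(bank):
    """A bank without the farmer is safe if it contains no forbidden pair."""
    return not any(pair <= bank for pair in UNSAFE)

def solve():
    # The state of the problem: which items are on the left bank, plus the
    # farmer's side. This explicit inventory is exactly what an LLM doesn't keep.
    start = (ITEMS, "left")
    goal = (frozenset(), "right")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer), trips = queue.popleft()
        if (left, farmer) == goal:
            return trips
        here = left if farmer == "left" else ITEMS - left
        for cargo in [None, *sorted(here)]:  # cross alone, or with one item
            moved = {cargo} if cargo else set()
            new_left = left - moved if farmer == "left" else left | moved
            unattended = new_left if farmer == "left" else ITEMS - new_left
            if not safe(unattended):  # every move is validated against the inventory
                continue
            new_farmer = "right" if farmer == "left" else "left"
            state = (frozenset(new_left), new_farmer)
            if state not in seen:
                seen.add(state)
                queue.append((state, trips + [(new_farmer, cargo)]))

print(solve())  # shortest sequence of trips: take the goat over, return empty, ...
```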

If we are to anthropomorphize it (which we shouldn't, but anyway), it's bullshitting up an answer and it gradually gets a feeling that it has bullshitted enough, which can happen at the right moment, or not.

[-] diz@awful.systems 23 points 1 year ago* (last edited 1 year ago)

I love the "criti-hype". AI peddlers absolutely love any concerns that imply that the AI is really good at something.

Safety concern that LLMs would go Skynet? Say no more, I hear you, and I'll bring it up in Congress!

Safety concern that terrorists might use it to make bombs? Say no more! I agree that the AI is so great for making bombs! We'll restrict it to keep people safe!

Sexual roleplay? Yeah, good point, I love it. Our technology is better than sex itself! We'll restrict it to keep mankind from falling into the sin of robosexuality and going extinct! I mean, of course, you can't restrict something like that, but we'll try, at least until we release a hornybot.

But raise any concern about language modeling being fundamentally the wrong tool for some job (do you want to cite a paper, or do you want to sample from the underlying probability distribution?), and it's hey hey, how's about we talk about the skynet thing instead?

[-] diz@awful.systems 18 points 1 year ago* (last edited 1 year ago)

It used to mean things like false positives in computer vision, where it is sort of appropriate: the AI is seeing something that's not there.

Then the machine translation people started misusing the term when their software mistranslated by adding something that was not present in the original text. They may already have been trying to be misleading with this term, because "hallucination" implies the error happens while parsing the input text, which distracts from the very real concern that what was added may have been plagiarized from the training dataset (which carries a risk of IP contamination).

Now, what's happening is that language models are very often simply the wrong tool for the job. When you want to cite a court case as a precedent, you want a court case that actually existed, not a sample from the underlying probability distribution of possible court cases! LLM peddlers don't want to ever admit that an LLM is the wrong tool for that job, so instead they pretend it's the right tool that, alas, sometimes "hallucinates".
