[-] diz@awful.systems 2 points 2 days ago* (last edited 2 days ago)

Yolo charging mode on a phone: disable the battery overheating sensor and the current limiter.

I suspect that they added yolo mode because without it this thing is too useless.

[-] diz@awful.systems 19 points 1 week ago* (last edited 1 week ago)

Further support for the memorization claim: I posted examples of novel river crossing puzzles where LLMs completely fail (on this forum).

Note that Apple’s actors / agents river crossing is a well known “jealous husbands” variant, which you can ask a chatbot to explain to you. It gladly explains, even as it can’t follow its own explanation (since of course it isn’t its own explanation but a plagiarized one, even if it changes the words).

edit: https://awful.systems/post/4027490 and earlier https://awful.systems/post/1769506

I think what I need to do is write up a bunch of puzzles, assign them randomly to 2 sets, and test & post one set while holding back the second set (not even testing it on any online chatbots). Then in a year or two, see how much the set that's public improves vs the one that's held back.
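
Roughly what I have in mind for the split, as a throwaway sketch (the puzzle names are placeholders, not real files):

```python
import random

# Hypothetical puzzle identifiers, just for illustration; the real ones would
# be whatever I end up writing.
puzzles = [f"puzzle_{i:02d}" for i in range(20)]

rng = random.Random(42)  # fixed seed so the split is reproducible
rng.shuffle(puzzles)

public_set = puzzles[:len(puzzles) // 2]     # test & post these now
held_back_set = puzzles[len(puzzles) // 2:]  # never paste these into any online chatbot

print("public:", sorted(public_set))
print("held back:", sorted(held_back_set))
```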

[-] diz@awful.systems 15 points 3 weeks ago* (last edited 3 weeks ago)

I was trying out the free GitHub Copilot to see what the buzz is all about:

It doesn't even know its own settings. The one little useful thing that isn't plagiarism, providing a natural language interface to its own bloody settings, and it couldn't even do that.

[-] diz@awful.systems 17 points 3 weeks ago* (last edited 3 weeks ago)

All joking aside, there is something thoroughly fucked up about this.

What's fucked up is that we let these rich fucks threaten us with extinction to boost their stock prices.

Imagine if some cold fusion scammer were permitted to gleefully boast that his experimental cold fusion plant in the middle of a major city could blow it up: setting up little hydrogen explosions, adding a neutron source just to make it spicier, etc.

[-] diz@awful.systems 35 points 1 month ago* (last edited 1 month ago)

Actually, having read it carefully, it's interesting that they don't actually claim it was hacked; they claim that the modification was unauthorized. They also don't claim that they revoked the access of that mysterious "employee" who modified it. I'm thinking they had some legal reason to word it so that they technically aren't lying.

[-] diz@awful.systems 30 points 11 months ago

AI peddlers just love any "critique" that presumes the AI is great at something.

Safety concern that LLMs would go Skynet? Say no more, I hear you, and I'll bring it up first thing in Congress.

Safety concern that terrorists might use it to make bombs? Say no more! I agree that the AI is so great for making bombs! We'll restrict it to keep people safe!

It sounds too horny, you say? Yeah, good point, I love it. Our technology is better than sex itself! We'll keep it SFW to keep mankind from going extinct due to robosexuality!

[-] diz@awful.systems 17 points 11 months ago

Yeah I think that's why we need an Absolute Imbecile Level Reasoning Benchmark.

Here's what the typical PR from AI hucksters looks like:

https://www.anthropic.com/news/claude-3-family

Fully half of their performance claims are about "reasoning", with names like "Graduate Level Reasoning". OpenAI is even worse; recall them claiming to have scored in the 90th percentile on the LSAT?

On top of that, LLMs are fine-tuned to convince some dumbass CEO who "checks it out". Even though you pay for the subscription, you're neither the customer nor the product; you're just collateral eyeballs on the ad.

[-] diz@awful.systems 23 points 11 months ago

Both parties are buying into a premise we already know to be incorrect.

We may know it is incorrect, but LLM salesmen are claiming things like "90th percentile on the LSAT", high scores on a "college level reasoning benchmark", and so on and so forth.

They are claiming "yeah, yeah, there are all the anecdotal reports of glue pizza, but objectively our AI is more capable than your workers, so you can replace them with our AI", and this is starting to actually impact the job market.

[-] diz@awful.systems 24 points 11 months ago

The other thing to add is that there are just one or two people on the train providing service for hundreds of other people, or millions of dollars' worth of goods. Automating those people away is simply not economical, not even in terms of the headcount replaced vs the headcount that has to be hired to maintain the automation software and hardware.

Unless you're a techbro who deeply resents labor, someone who would rather hire 10 software engineers than 1 train driver.

[-] diz@awful.systems 23 points 11 months ago* (last edited 11 months ago)

Also, my thought on this is that since an LLM has no internal state with which to represent the state of the problem, it can't ever actually solve any variation of the river crossing, not even the ones it "solves" correctly.

If it outputs the correct sequence, the model of the problem inside your head ends up in the solved state, but on the LLM's side there's just the sequence of steps it wrote down, with those steps inhibiting the production of another "Trip" token until that crosses some threshold. There isn't an inventory or even a count of items; there's just an unrelated number that weights for or against "Trip".
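
For contrast, here's a minimal sketch of what "having a state of the problem" looks like in an ordinary solver (classic wolf/goat/cabbage variant as a stand-in, and the code is my own illustration, not anything an LLM does internally): the inventory of each bank is an explicit data structure, and every candidate trip is checked against it.

```python
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]  # pairs that can't be left unsupervised

def safe(bank):
    """A bank without the farmer is safe if it contains no forbidden pair."""
    return not any(pair <= bank for pair in UNSAFE)

def solve():
    # The state of the problem: which items are on the left bank, plus the
    # farmer's side. This explicit inventory is exactly what an LLM doesn't keep.
    start = (ITEMS, "left")
    goal = (frozenset(), "right")
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer), trips = queue.popleft()
        if (left, farmer) == goal:
            return trips
        here = left if farmer == "left" else ITEMS - left
        for cargo in [None, *sorted(here)]:  # cross alone, or with one item
            moved = {cargo} if cargo else set()
            new_left = left - moved if farmer == "left" else left | moved
            unattended = new_left if farmer == "left" else ITEMS - new_left
            if not safe(unattended):  # every move is validated against the inventory
                continue
            new_farmer = "right" if farmer == "left" else "left"
            state = (frozenset(new_left), new_farmer)
            if state not in seen:
                seen.add(state)
                queue.append((state, trips + [(new_farmer, cargo)]))

print(solve())  # shortest sequence of trips: take the goat over, return empty, ...
```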

If we are to anthropomorphize it (which we shouldn't, but anyway), it's bullshitting up an answer and it gradually gets a feeling that it has bullshitted enough, which can happen at the right moment, or not.

[-] diz@awful.systems 23 points 1 year ago* (last edited 1 year ago)

I love the "criti-hype". AI peddlers absolutely love any concerns that imply that the AI is really good at something.

Safety concern that LLMs would go Skynet? Say no more, I hear you, and I'll bring it up in Congress!

Safety concern that terrorists might use it to make bombs? Say no more! I agree that the AI is so great for making bombs! We'll restrict it to keep people safe!

Sexual roleplay? Yeah, good point, I love it. Our technology is better than sex itself! We'll restrict it to keep mankind from falling into the sin of robosexuality and going extinct! I mean, of course, you can't restrict something like that, but we'll try, at least until we release a hornybot.

But raise any concern about language modeling being fundamentally the wrong tool for some job (do you want to cite a paper, or do you want to sample from the underlying probability distribution?), and it's hey hey, how's about we talk about the skynet thing instead?

[-] diz@awful.systems 18 points 1 year ago* (last edited 1 year ago)

It used to mean things like false positives in computer vision, where it is sort of appropriate: the AI is seeing something that's not there.

Then the machine translation people started misusing the term when their software mistranslated by adding something that was not present in the original text. They may already have been trying to be misleading with this term, because "hallucination" implies the error happens while parsing the input text, which distracts from the very real concern that what was added may have been plagiarized from the training dataset (which carries a risk of IP contamination).

Now, what's happening is that language models are very often simply the wrong tool for the job. When you want to cite a court case as a precedent, you want a court case that actually existed, not a sample from the underlying probability distribution of possible court cases! LLM peddlers don't want to ever admit that an LLM is the wrong tool for that job, so instead they pretend it's the right tool that, alas, sometimes "hallucinates".
