[-] diz@awful.systems 5 points 1 week ago

I seriously doubt he ever worked anywhere like that, not to mention that he’s too spineless to actually get in trouble IRL.

[-] diz@awful.systems 5 points 1 week ago

I think they worked specifically on cheating the benchmarks, though, as well as on popular puzzles like pre-existing variants of the river crossing. It is a very large and very popular puzzle category; if the river crossing puzzle is not on the list, I don't know what would be.

Keep in mind that they are true believers, too: they think that if they cram enough little pieces of logical reasoning, taken from puzzles, into the AI, then they will get a robot god that will actually start coming up with new shit.

I very much doubt that there's some general reasoning performance improvement that results in these older puzzle variants getting solved while new ones, which aren't particularly more difficult, fail.

[-] diz@awful.systems 7 points 1 week ago* (last edited 1 week ago)

It's Google though; if nobody uses their shit, they just put it inside their search.

It's only gonna go away when they run out of cash.

edit: whoops replied to the wrong comment

[-] diz@awful.systems 6 points 3 weeks ago* (last edited 3 weeks ago)

Yeah it really is fascinating. It follows some sort of recipe to try to solve the problem, like it's trained to work a bit like an automatic algebra system.

I think they employed a lot of people to write generators of variants of select common logical puzzles, e.g. river crossings with varying boat capacities and constraints, generating both the puzzle and the corresponding step-by-step solution, with "reasoning" and re-printing of the state of the items on every step and all that.
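To make that concrete, here is a minimal sketch of what such a generator could look like. It is a guess at the general shape, not anyone's actual pipeline: the item names, the `solve` and `transcript` helpers, and the output wording are all hypothetical. It solves each variant with a breadth-first search and reprints the state of both banks on every step.

```python
from collections import deque
from itertools import combinations


def solve(items, capacity, forbidden):
    """Shortest plan as a list of (cargo, left_bank, farmer_on_left), or None."""
    items = frozenset(items)
    start, goal = (items, True), (frozenset(), False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        left, farmer_left = state
        here = left if farmer_left else items - left
        for k in range(capacity + 1):
            for cargo in combinations(sorted(here), k):
                new_left = left - set(cargo) if farmer_left else left | set(cargo)
                # The bank the farmer just left must contain no forbidden pair.
                unattended = new_left if farmer_left else items - new_left
                if any(a in unattended and b in unattended for a, b in forbidden):
                    continue
                nxt = (new_left, not farmer_left)
                if nxt not in parent:
                    parent[nxt] = (state, cargo)
                    queue.append(nxt)
    if goal not in parent:
        return None
    steps, state = [], goal
    while parent[state] is not None:
        prev, cargo = parent[state]
        steps.append((cargo, state[0], state[1]))
        state = prev
    return steps[::-1]


def transcript(items, capacity, forbidden):
    """Render puzzle text plus a step-by-step solution that reprints the state."""
    lines = [f"Items: {', '.join(items)}. The boat holds the farmer plus up to "
             f"{capacity} item(s)."]
    steps = solve(items, capacity, forbidden)
    if steps is None:
        return "\n".join(lines + ["No solution."])
    for n, (cargo, left, farmer_left) in enumerate(steps, 1):
        what = " and ".join(cargo) if cargo else "nothing"
        direction = "back" if farmer_left else "across"
        right = set(items) - left
        lines.append(f"Step {n}: take {what} {direction}. "
                     f"Left: {sorted(left)}. Right: {sorted(right)}.")
    return "\n".join(lines)


# Hypothetical constraint set: the classic wolf/goat/cabbage pairs.
PAIRS = [("wolf", "goat"), ("goat", "cabbage")]
for capacity in (1, 2):  # one generated variant per boat capacity
    print(transcript(["wolf", "goat", "cabbage"], capacity, PAIRS), end="\n\n")
```

Varying `capacity`, the item list, and the forbidden pairs yields as many puzzle and solution pairs as you care to print, which is the point: the text looks like reasoning, but it is produced by a search that never appears in the text.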

It seems to me that their thinking is that successive parroting can amount to reasoning, if it's parroting well enough. I don't think it can. They have this one-path approach, where it just tries doing steps and representing state, always trying the same thing.

What they need for this problem is to take a different kind of step, a reduction (the duck cannot be left unsupervised -> the duck must be taken with me on every trip -> rewrite the problem without the duck and with one less unit of boat capacity -> solve -> rewrite the solution with "take the duck with you" on every trip).
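A minimal sketch of that reduction, under assumptions: the puzzle here is a hypothetical stand-in (wolf, goat, cabbage plus a duck, with a boat that holds two items), and `solve` and `solve_with_escort` are illustrative names. The base solver knows nothing about ducks; the reduction drops the duck and one seat, solves the smaller problem, then puts the duck back on every trip.

```python
from collections import deque
from itertools import combinations


def solve(items, capacity, forbidden):
    """Plain BFS river-crossing solver: returns a list of cargo tuples, or None."""
    items = frozenset(items)
    start, goal = (items, True), (frozenset(), False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        left, farmer_left = state
        here = left if farmer_left else items - left
        for k in range(capacity + 1):
            for cargo in combinations(sorted(here), k):
                new_left = left - set(cargo) if farmer_left else left | set(cargo)
                # The bank the farmer just left must contain no forbidden pair.
                unattended = new_left if farmer_left else items - new_left
                if any(a in unattended and b in unattended for a, b in forbidden):
                    continue
                nxt = (new_left, not farmer_left)
                if nxt not in parent:
                    parent[nxt] = (state, cargo)
                    queue.append(nxt)
    if goal not in parent:
        return None
    trips, state = [], goal
    while parent[state] is not None:
        state, cargo = parent[state]
        trips.append(cargo)
    return trips[::-1]


def solve_with_escort(items, escort, capacity, forbidden):
    """Reduction: the escort rides on every trip, so drop it and one seat."""
    rest = [item for item in items if item != escort]
    reduced = solve(rest, capacity - 1, forbidden)
    if reduced is None:
        return None
    # Rewrite the reduced solution with the escort added to every crossing.
    return [tuple(sorted(cargo + (escort,))) for cargo in reduced]


# Hypothetical variant, not the exact prompt from the logs.
PAIRS = [("wolf", "goat"), ("goat", "cabbage")]
trips = solve_with_escort(["wolf", "goat", "cabbage", "duck"], "duck", 2, PAIRS)
for n, cargo in enumerate(trips, 1):
    print(f"Trip {n}: take {' and '.join(cargo)}")
```

Every valid plan has an odd number of crossings, so the duck ends up on the far bank along with everything else.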

But if they add this, then there are two possible paths it can take on every step, and this thing is far too slow to brute force the right one. They may get it to solve my duck variant, but at the expense of making it fail a lot of other variants.

The other problem is that even seemingly most elementary reasoning involves very many applications of basic axioms. This is what doomed symbol manipulation "AI" in the past and this is what is dooming it now.

[-] diz@awful.systems 6 points 3 weeks ago* (last edited 3 weeks ago)

It’s a failure mode that comes from pattern matching without actual reasoning.

Exactly. Also, looking at its chain-of-wordvomit (which apparently I can't share other than by copying and pasting it somewhere), I don't think this is the same as GPT-4 overfitting to the original river crossing and always bringing items back needlessly.

Note also that in one example it discusses moving the duck and another item across the river (so "up to two other items" works); it is not ignoring the prompt, and it isn't even trying to bring anything back. And its answer (calling it impossible) has nothing to do with the original.

In the other one it does bring items back, it tries different orders, even finds an order that actually works (with two unnecessary moves), but because it isn't an AI fanboy reading tea leaves, it still gives out the wrong answer.

Here's the full logs:

https://pastebin.com/HQUExXkX

Content warning: AI wordvomit which is so bad it is folded hidden in a google tool.

[-] diz@awful.systems 3 points 4 months ago* (last edited 4 months ago)

Full-time AI grift jobs would of course be forever closed to any AI whistleblower. There are still plenty of other jobs.

I did participate in the hiring process, and I can tell you that at your typical huge corporation the recruiter / HR are too inept to notice that you are a whistleblower, and don't give a shit anyway. And of the rank and file who will actually google you, plenty dislike AI.

At the rank and file level, the only folks who actually give a shit about who you are are the people who will have to work with you. Not the background check provider, not the recruiter.

[-] diz@awful.systems 3 points 6 months ago* (last edited 6 months ago)

Using tools from physics to create something that is popular but unrelated to physics is enough for the nobel prize in physics?

If only, it's not even that! Neither Boltzmann machines nor Hopfield networks led to anything used in the modern spam and deepfake generating AI, nor in image recognition AI, or the like. This is the kind of stuff that struggles to get above 60% accuracy on MNIST (handwritten digits).

Hinton went on to do different work based on backpropagation and gradient descent, on newer computers than were available to the people who came up with it long before him, and he got the Turing Award for that. It's a wee bit controversial because of the whole "people did it before, but on worse computers, so they didn't get any award" thing. But at least it is for work that is on the path leading to modern AI, and not for work that belongs to the vast list of things that just didn't work, where it's extremely hard to explain why you would even think they would work in the first place.

[-] diz@awful.systems 3 points 6 months ago

Then next year Hopfield and Hinton go back to Sweden, don't tell king of Sweden anything, king of Sweden still gives them the Nobel Prize! King of Sweden now has conditioned reflex!

[-] diz@awful.systems 7 points 6 months ago

Maybe if the potato casserole is exploded in the microwave by another physicist, on his way to start a resonance cascade...

(I'll see myself out).

[-] diz@awful.systems 6 points 10 months ago* (last edited 10 months ago)

I tried the same prompt a lot of times and saw "chain of thought" attempts complete with the state modeling... they must be augmenting the training dataset with some sort of script-generated crap.

I have to say those are so far the absolute worst attempts.

Day 16 (Egg 3 on side A; Duck 1, Duck 2, Egg 1, Egg 2 on side B): Janet takes Egg 3 across the river.

"Now, all 2 ducks and 3 eggs are safely transported across the river in 16 trips."

I kind of feel that this undermines the whole point of using transformer architecture instead of a recurrent neural network. Machine learning sucks at recurrence.

[-] diz@awful.systems 5 points 10 months ago

Well I'll be damned, they are trying to get it to work as if it had internal state, by augmenting their training dataset with some auto-generated crap. Of course, if machine learning weren't shit at recurrence they wouldn't be using the transformer architecture, so it doesn't work.

I got the following from chatgpt:

Janet’s ducks lay 3 eggs per day. Each day, she transports one duck and all the eggs to the other side of the river using a boat that can hold her and one other item (either a duck or an egg). How does she do that?

ChatGPT:

Janet can transport the ducks and eggs across the river using the following strategy:

Day 1 (Duck A, Duck B, Duck C on side A; Egg 1, Egg 2, Egg 3 on side A):
    Janet takes Duck A across the river.

Day 2 (Duck B, Duck C on side A; Egg 1, Egg 2, Egg 3 on side A):
    Janet takes Egg 1 across the river.

Day 3 (Duck B, Duck C on side A; Egg 2, Egg 3 on side A; Duck A on side B):
    Janet takes Duck A back to side A.

Day 4 (Duck C on side A; Egg 2, Egg 3 on side A; Duck A on side A; Egg 1 on side B):
    Janet takes Duck B across the river.

Day 5 (Duck C on side A; Egg 2, Egg 3 on side A; Duck A on side A; Egg 1, Duck B on side B):
    Janet takes Egg 1 back to side A.

Day 6 (Duck C on side A; Egg 2, Egg 3 on side A; Duck A, Egg 1 on side A; Duck B on side B):
    Janet takes Duck A across the river again.

Day 7 (Egg 2, Egg 3 on side A; Duck A, Egg 1 on side A; Duck B on side B; Duck C on side A):
    Janet takes Duck C across the river.

Day 8 (Egg 2, Egg 3 on side A; Duck A, Egg 1, Duck C on side A; Duck B on side B):
    Janet takes Egg 2 across the river.

Day 9 (Egg 3 on side A; Duck A, Egg 1, Duck C on side A; Duck B on side B; Egg 2 on side B):
    Janet takes Duck A back to side A.

Day 10 (Egg 3 on side A; Duck C on side A; Duck A on side A; Duck B, Egg 1, Egg 2 on side B):
    Janet takes Duck B across the river.

Day 11 (Egg 3 on side A; Duck C on side A; Duck A, Duck B on side A; Egg 1, Egg 2 on side B):
    Janet takes Egg 1 back to side A.

Day 12 (Egg 3 on side A; Duck C on side A; Duck A, Egg 1, Duck B on side A; Egg 2 on side B):
    Janet takes Duck A across the river again.

Day 13 (Duck C on side A; Duck A, Egg 1, Duck B on side A; Egg 2 on side B; Egg 3 on side B):
    Janet takes Duck C across the river.

By following this sequence, Janet successfully transports all 3 ducks and all 3 eggs across the river in 13 trips. Each trip involves either transporting one duck or one egg, adhering to the constraint that Janet can only carry one item (duck or egg) with her in the boat.
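For comparison, here is a small sketch that replays the thirteen moves quoted above while tracking the state explicitly instead of predicting it. The `replay` helper is hypothetical, and it rests on one labeled assumption: "across the river" is read as "to side B" and "back to side A" as "to side A".

```python
# Moves transcribed from the quoted answer, as (item, destination side).
# Assumption: "across the river" means side B, "back to side A" means side A.
MOVES = [
    ("Duck A", "B"), ("Egg 1", "B"), ("Duck A", "A"), ("Duck B", "B"),
    ("Egg 1", "A"), ("Duck A", "B"), ("Duck C", "B"), ("Egg 2", "B"),
    ("Duck A", "A"), ("Duck B", "B"), ("Egg 1", "A"), ("Duck A", "B"),
    ("Duck C", "B"),
]
ITEMS = ["Duck A", "Duck B", "Duck C", "Egg 1", "Egg 2", "Egg 3"]


def replay(moves, items):
    """Apply each move to a real state table and flag impossible moves."""
    side = {item: "A" for item in items}  # everything starts on side A
    problems = []
    for day, (item, dest) in enumerate(moves, 1):
        if side[item] == dest:
            # The item is already on the side it is supposedly being taken to.
            problems.append((day, item))
        side[item] = dest
    return side, problems


final, problems = replay(MOVES, ITEMS)
left_behind = sorted(item for item, s in final.items() if s == "A")
print("Still on side A:", left_behind)
print("Moves of items that were already there:", problems)
```

Under that reading, the sequence does not deliver everything it claims to have delivered, and several of the listed moves take an item to the side it is already on.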

[-] diz@awful.systems 6 points 10 months ago

Well, the problem is it not having any reasoning, period.

Not clear what symbolic reasoning would entail, but puzzles generally require you to think through several approaches to solve them, too. That requires a world model, a search, etc., the kind of stuff that actual AIs, even a tic-tac-toe AI, have, but LLMs don't.

On top of that, this all works through machine learning, which produces the resulting network weights through very gradual improvement at next word prediction, tiny step by tiny step. Even if some sort of discrete model (like, say, the account of what's on either side of the river) could help it predict the next token, there isn't a tiny fraction of a discrete "model" that would help it, and so it simply does not go down that path at all.
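To show how little it takes, here is a minimal sketch of a tic-tac-toe AI with both ingredients: a world model (the board plus the rules for winning) and a search (plain minimax over every continuation). The board encoding as a 9-character string and the function names are choices made for this example.

```python
from functools import lru_cache

# World model: the eight winning lines on a 3x3 board indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value from X's point of view: +1 X wins, 0 draw, -1 O wins."""
    won = winner(board)
    if won:
        return 1 if won == "X" else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == " "]
    return max(results) if player == "X" else min(results)


def best_move(board, player):
    """Search: try every legal move and keep the one with the best value."""
    other = "O" if player == "X" else "X"
    moves = [i for i, cell in enumerate(board) if cell == " "]
    pick = max if player == "X" else min
    return pick(moves, key=lambda i: value(board[:i] + player + board[i + 1:], other))


print(value(" " * 9, "X"))           # value of the empty board with perfect play
print(best_move("XX OO    ", "X"))   # index of the move X should make
```

The search considers several approaches and discards the ones that fail, which is exactly the step that emitting one plausible-looking path does not include.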


diz

joined 2 years ago