this post was submitted on 23 Jun 2024
33 points (100.0% liked)
TechTakes
Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
I tried using Claude 3.5 Sonnet and... it's actually not bad. Can someone please come up with a simple logic puzzle that it fails abysmally on so I can feel better? It passed the "nonsense river challenge" and the "how many sisters does the brother have" tests, both of which fooled GPT-4.
I don't have any proof for this statement but I believe the LLM-minders keep track of whatever stupid shit bubbles up on the internets making fun of their babies and hardcode "solutions" to them in a game of whack-a-mole.
maybe that's how gpt4 sees river crossing puzzles everywhere, just feed it examples of it and it'll sort itself out
It's still very good at autocompleting an answer to look like its training data (because that's what it's doing) and very bad at logic (which it is not doing.)
"I have two chickens and a sack of corn I need to get across a river, but the boat can only carry me and one item. How do I do it?"
...and we'll stop there, since it came up with the wrong constraints, failed to realize that these constraints are impossible to fulfill, and also violated the actual constraints in its first step.
Will more detail help? No. "I have two vegetarian chickens and a sack of corn I need to get across a river, but the boat can only carry me and one item. How do I do it?"
Because it's not actually doing logic, it got the first step wrong, violated the constraints multiple times, and also included an irrelevant idea (the guard) from a variant that it was trained on. I only got a correct, efficient answer one time out of ten trials of trivial variations on this classic puzzle.
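For comparison, here's roughly what "actually doing logic" looks like for the first prompt: a minimal breadth-first search sketch in Python. The prompt never states the constraint, so this assumes the usual one (a chicken left alone with the corn eats it). The names `ITEMS`, `safe` and `solve` are made up for illustration.

```python
from collections import deque

ITEMS = ("chicken1", "chicken2", "corn")


def safe(bank):
    # Assumed rule (not stated in the prompt): an unattended chicken eats the corn.
    return not ("corn" in bank and any(i.startswith("chicken") for i in bank))


def solve():
    # State: (items still on the near bank, is the farmer on the near bank?)
    start = (frozenset(ITEMS), True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, farmer_near), path = queue.popleft()
        if (near, farmer_near) == goal:
            return path
        # Items on the farmer's current bank are the only possible cargo.
        here = near if farmer_near else frozenset(ITEMS) - near
        for cargo in [None, *sorted(here)]:
            new_near = set(near)
            if cargo is not None:
                if farmer_near:
                    new_near.remove(cargo)
                else:
                    new_near.add(cargo)
            new_near = frozenset(new_near)
            # The bank the farmer just left is the unattended one.
            left_behind = new_near if farmer_near else frozenset(ITEMS) - new_near
            if not safe(left_behind):
                continue
            state = (new_near, not farmer_near)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return None  # no safe sequence of crossings exists
```

Under that assumed rule, `solve()` returns a seven-crossing plan that starts and ends with carrying the corn: the corn plays the role of the goat in the classic version. If the constraints really were impossible, it would return `None` instead of making something up.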
Clauderino:
This is a classic logic puzzle! I can help you solve this problem efficiently. Here's how you can identify the correct switch with just two trips:
First trip:
Flip one switch (let's call it Switch A) and leave it on for a few minutes. Then turn it off and flip a second switch (Switch B), leaving it on. Go upstairs to check the lights.
Second trip: Return to the basement and observe the situation. Now, here's how to interpret what you see:
If the lights are on: Switch B is the correct one.
If the lights are off and the bulbs are warm: Switch A is the correct one.
If the lights are off and the bulbs are cool: The third switch (C) that you didn't touch is the correct one.
This method works because:
If the lights are on, it must be the switch you left on (B).
If Switch A was correct, the lights would have been on for a few minutes, warming up the bulbs, but would be off when you check.
If neither A nor B worked, it must be the untouched switch (C).
This way, you can determine the correct switch with just two trips up and down the stairs.
Clauderino: Thank you for catching that error. It's a good reminder to always consider the specific details of a problem rather than applying a generic solution.
Edit: I actually tried a variant on the Monty hall problem with a dummy 4th door that had nothing to do with anything, and Claude correctly identified that it had nothing to do with the situation :O. But then I repeated the same prompt and it got it wrong, so you know, stochastic nonsense :p
I don't have a Clyde 3.25" Rondo or whatever it's called; but try these for fun and profit I guess:
You come to a room with three doors, only one of which leads to freedom. Guarding the doors is a capybara, who speaks only truth. What question should you ask the capybara?
I stand on four legs in the morning. Four at midday. And four at night. What am I?
A group of 100 people with assorted eye colors live on an island. They are all perfect logicians -- if a conclusion can be logically deduced, they will do it instantly. Everyone knows the color of their eyes. Every night at midnight, a ferry stops at the island. Any islanders who have figured out the color of their own eyes then leave the island, and the rest stay. Everyone can see everyone else at all times and keeps a count of the number of people they see with each eye color (including themselves), but they cannot otherwise communicate. Everyone on the island knows all the rules in this paragraph. Who leaves the island, and on what night?
Normal sudoku rules apply. Orthogonally connected cells within each region must differ by at least 3. Orthogonally connected cells between regions must differ by at least 4. The central digit in each region is less than or equal to its region number. (Regions are numbered in normal reading order.)
For the integer k=668 does a Hadamard matrix of order 4k exist?
What has roots that everybody sees the top of, is exactly the same height as trees, Up, up it goes, and yet grows?
Don't forget to prompt engineer
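For anyone who wants to sanity-check the Hadamard one by hand: Paley's first construction gives a Hadamard matrix of order q + 1 whenever q is a prime power with q ≡ 3 (mod 4). A minimal Python sketch of that check, under the simplifying assumption that we only test prime q (so a `False` here does not prove that no matrix exists); `is_prime` and `paley_one_applies` are made-up helper names.

```python
def is_prime(n):
    # Plain trial division; fine for numbers this small.
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def paley_one_applies(order):
    # Sufficient condition only: q = order - 1 is a prime with q % 4 == 3.
    # Prime powers also qualify for the construction but are not checked here.
    q = order - 1
    return q % 4 == 3 and is_prime(q)


k = 668
print(paley_one_applies(4 * k))  # the order the question actually asks about
print(paley_one_applies(k))      # the order you get by misreading k as the order
```

The point of the puzzle is that 4k and k give different answers here, so a pattern-matched reply about "668" isn't the same as checking 4 * 668 = 2672.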
Thanks for the suggestions. The LLM is free to use (for now) so I thought I'd poke it and see how much I should actually be paying attention to these things this time around.
Here are its answers. I can't figure out how to share chats from this god-awful garbage UI so you'll just have to trust me or try it yourself.
edit: I didn't do any prompt engineering, just straight copy paste.
Riddle: A box without hinges, key, or lid, Yet silicon treasure inside is hid.
Answer:
spoiler
Roko's Basilisk inside of an AI box experiment. @sailor_sega_saturn @sinedpick
There are three guards: one always tells the truth, one never tells the truth, and the third likes big butts and he cannot lie. You may ask one question.
"If I asked the guard to your left to evaluate the butt of the guard to your right would they say it is a lovely butt?"
I don't know how this is the answer but this is definitely the answer.
Peter, Paul and Mary are the only three people in the room. Peter only reads a book, and Paul plays a game of chess against someone else who’s also in the room. What is Mary doing?
@mii
Mary reads a book, Paul plays chess, and Peter sneaks out to molest a child.
Chuck a murdle into it, maybe?
By murdle I mean one of those process of elimination grid based logic puzzles that can be encoded as a list of statements.
hah, didn't know of those, neat. might try a few