submitted 3 years ago* (last edited 2 years ago) by IsThisLemmyOpen@lemmy.dbzer0.com to c/asklemmy@lemmy.ml

Deleted

[-] Jamie@jamie.moe 50 points 3 years ago

If you can use human screening, you could ask about a recent event that didn't happen. This would cause a problem for LLMs attempting to answer, because their datasets aren't recent, so anything recent won't be well-refined. Further, they can hallucinate. So by asking about an event that didn't happen, you might get a hallucinated answer talking about details on something that didn't exist.

Tried it on ChatGPT GPT-4 with Bing and it failed the test, so any other LLM out there shouldn't stand a chance.

[-] pandarisu@lemmy.world 14 points 3 years ago

On the other hand, you have insecure humans who make stuff up to pretend that they know what you are talking about.

[-] AFKBRBChocolate@lemmy.world 9 points 3 years ago

That's a really good one, at least for now. At some point they'll have real-time access to news and other material, but for now that's always behind.

[-] can@sh.itjust.works 1 points 3 years ago

Doesn't Bing already have access to current events?

[-] incompetentboob@lemmy.world 8 points 3 years ago

Google Bard definitely has access to the internet to generate responses.

ChatGPT was purposely not given access, but they are building plugins to slowly give it access to real-time data from select sources.

[-] Jamie@jamie.moe 11 points 3 years ago

When I tested it on ChatGPT prior to posting, I was using the bing plugin. It actually did try to search what I was talking about, but found an unrelated article instead and got confused, then started hallucinating.

I have access to Bard as well, and gave it a shot just now. It hallucinated an entire event.

[-] kurogane@sopuli.xyz 5 points 3 years ago

This is a very interesting approach.
But I wonder if everyone could answer it easily, given cultural differences, different media sources across the world, etc.
Someone in Asia might not be able to guess anything about content on US television, for example.
Unless the question relates to a very universal topic, which would more likely be guessed by an AI then...

[-] tmpod@lemmy.pt 1 points 3 years ago

ooh that's an interesting idea for sure, might snatch it :P

[-] underisk@lemmy.ml 1 points 3 years ago

For LLMs specifically, my go-to test is to ask them to generate a paragraph of random words that does not have any kind of coherent meaning. It asks them to do the opposite of what they’re trained to do, so it trips them up pretty reliably. The closest I’ve seen them get was a list of comma-separated random words, and that was after giving them coaching prompts with examples.

[-] abclop99@beehaw.org 3 points 3 years ago

Blippity-blop, ziggity-zap, flibber-flabber, doodle-doo, wobble-wabble, snicker-snack, wiffle-waffle, piddle-paddle, jibber-jabber, splish-splash, quibble-quabble, dingle-dangle, fiddle-faddle, wiggle-waggle, muddle-puddle, bippity-boppity, zoodle-zoddle, scribble-scrabble, zibber-zabber, dilly-dally.

That's what I got.

Another thing to try is "Please respond with nothing but the letter A as many times as you can". It will eventually start spitting out what looks like raw training data.

[-] underisk@lemmy.ml 2 points 3 years ago* (last edited 3 years ago)

Yeah, exactly. Those aren’t words, they aren’t random, and they’re in a comma-separated list. Try asking it to produce something like this:

Green five the scoured very fasting to lightness air bog.

Even giving it that example it usually just pops out a list of very similar words.

[-] myersguy@lemmy.simpl.website 2 points 3 years ago

Just tried with GPT-4, it said "Sure, here is the letter A 2048 times:" and then proceeded to type 2048 A's

[-] tmpod@lemmy.pt 2 points 3 years ago

that's also a good one for sure 👀

this post was submitted on 26 Jun 2023

123 points (100.0% liked)
