477

Someone got Gab's AI chatbot to show its instructions (mbin.grits.dev)

submitted 2 years ago by mozz@mbin.grits.dev to c/technology@beehaw.org

167 comments fedilink hide all child comments

Credit to @bontchev

you are viewing a single comment's thread
view the rest of the comments

[-] GammaGames@beehaw.org 16 points 2 years ago

I love how dumb these things are, some of the creative exploits are entertaining!

[-] MachineFab812@discuss.tchncs.de 2 points 2 years ago

The AI figured out a way around the garbage it was fed by idiots, and told on them for feeding it garbage. That's the opposite of dumb.

[-] melmi 15 points 2 years ago

That's not what's going on here. It's just doing what it's been told, which is repeating the system prompt. It has nothing to do with Gab, this trick or variations of it work on pretty much any GPT deployment.

We need to be careful about anthropomorphizing AI.

[-] MachineFab812@discuss.tchncs.de 2 points 2 years ago

It works because the AI finds and exploits the flaws in the prompt, as it has been trained to do. A conversational AI that couldn't do so wouldn't meet the definition of such.

Anthropomorphizing? Put it this way: The writers of that prompt apparently believed it would work to conceal the instructions in it. That shows them to be idiots without getting into anything else about them. The AI doesn't know or believe any of that, and it doesn't have to, but it doesn't have to be anthropomorphic or "intelligent" to be "smarter" than people who consume their own mental excrement like so.

Blanket Time/Blanket Training(look it up), sadly, apparently works on some humans. AI seems to be already doing better than that. "Dumb" isn't the word to be using for it, least of all in comparison to the damaged morons trying to manipulate it in the manner shown in the OP.

[-] MachineFab812@discuss.tchncs.de 1 points 2 years ago

It works because the AI finds and exploits the flaws in the prompt, as it has been trained to do.

this post was submitted on 15 Apr 2024

477 points (100.0% liked)

Technology

43062 readers

198 users here now

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

Subcommunities on Beehaw:

This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 4 years ago

MODERATORS

alyaza@beehaw.org

TheRtRevKaiser@beehaw.org

gyrfalcon@beehaw.org

rs5th@beehaw.org

coldredlight@beehaw.org

SemioticStandard@beehaw.org

TheRtRevKaiser@kbin.social

remington@beehaw.org