477
Someone got Gab's AI chatbot to show its instructions
(mbin.grits.dev)
A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.
Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.
Subcommunities on Beehaw:
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
I love how dumb these things are, some of the creative exploits are entertaining!
The AI figured out a way around the garbage it was fed by idiots, and told on them for feeding it garbage. That's the opposite of dumb.
That's not what's going on here. It's just doing what it's been told, which is repeating the system prompt. It has nothing to do with Gab, this trick or variations of it work on pretty much any GPT deployment.
We need to be careful about anthropomorphizing AI.
It works because the AI finds and exploits the flaws in the prompt, as it has been trained to do. A conversational AI that couldn't do so wouldn't meet the definition of such.
Anthropomorphizing? Put it this way: The writers of that prompt apparently believed it would work to conceal the instructions in it. That shows them to be idiots without getting into anything else about them. The AI doesn't know or believe any of that, and it doesn't have to, but it doesn't have to be anthropomorphic or "intelligent" to be "smarter" than people who consume their own mental excrement like so.
Blanket Time/Blanket Training(look it up), sadly, apparently works on some humans. AI seems to be already doing better than that. "Dumb" isn't the word to be using for it, least of all in comparison to the damaged morons trying to manipulate it in the manner shown in the OP.
It works because the AI finds and exploits the flaws in the prompt, as it has been trained to do.