submitted 2 years ago by JRL@lemmy.world to c/perchance@lemmy.world

Hi!

I have a question about the privacy of the prompts and responses from the AI chat. Are the prompts and responses stored in any way? Does anyone have access to them? Does anyone review them?

I'm wondering how safe it is to provide personal information in the chat.

Thanks in advance!

[-] perchance@lemmy.world 6 points 2 years ago* (last edited 6 months ago)

Prompts and responses are not stored - i.e. once the server sends back the response, the text you sent and the generated response text should exist on your computer only. However, note that, as @april@lemmy.world says, it is a bad idea to put sensitive personal info (anything beyond, perhaps, your first name) into any online service like this, even if you fully trust the person/company running it and they've assured you that it's 100% private.

If you want to run AI text models ("LLMs") completely privately, this is probably the best place right now to learn more: https://www.reddit.com/r/LocalLLaMA

You can stop reading now if the point is taken, but if not, here is (what accidentally became) a wall of text to scare you:

- I recently realised that if there happens to be an error during inference, then the server "leaks" the prompt into the (temporary) server logs (this was actually fixed a few hours ago: https://github.com/huggingface/text-generation-inference/releases/tag/v1.2.0 ). This particular issue is basically harmless because the server logs themselves obviously aren't public, and they're not saved to a persistent drive, but it gives you an idea of the sort of thing that can occur even if you fully trust the provider and they're acting benevolently.
- If there's a bug where e.g. someone's requests are bringing down the server (tokenizer issues and stop-sequence handling have caused problems in the past), either accidentally or maliciously, investigating it often requires temporarily logging the problematic requests. It's impossible for a provider to claim with 100% certainty that your data will never be seen by a human. This is just the nature of building a complex platform, debugging it, fending off malicious users, etc.
- If you're using a generator on Perchance that someone else made (i.e. not one that you have coded yourself), then note that generator authors can code up arbitrary interfaces with arbitrary logic, so they could send your inputs off to their own server. This is of course the same as any other page on the internet that accepts user input, although Perchance does have the slight advantage that all generators have publicly-viewable and non-obfuscated code (via the edit button in the top-right of the page), so it's not as simple for a coder on Perchance to get away with this, especially if the generator has become decently popular - someone would notice. Either way, you're running some random person's code - by default you should assume it is unfriendly. EDIT: I added the ability to prevent a generator from accessing external resources:
  - https://perchance.org/custom-content-security-policy?$csp
  - Context in this post: https://lemmy.world/comment/14698327
- Services can get hacked, and the more popular they become, the bigger the target painted on them. If a service has user accounts associated with the user chat data, then anyone who gets access to the database has a list of emails (or phone numbers, or whatever) with all the associated user data. All the benevolence in the world won't make you immune from this. To be clear, Perchance does not associate user accounts with AI plugin requests - Perchance accounts are purely for people who want to build generators on Perchance - i.e. the only data that's associated with a Perchance user account is the generators/pages that they've created. All requests to the AI plugin servers are anonymous, regardless of whether you're logged in or not. But if you embed your personal info within those AI requests (i.e. in the text prompt) then nothing can save you, since a compromised server would mean the attacker could see all the data flowing through the server - i.e. prompts that include your sensitive personal info.
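
As a rough illustration of the external-resource restriction mentioned in the generator-author point above: a Content Security Policy lets a page tell the browser to refuse connections to other servers. The sketch below only builds a generic header value. The helper name `build_csp` and the directive choices are assumptions for illustration, not Perchance's actual policy (the linked custom-content-security-policy page is the place to check for that).

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    """Join CSP directives into one Content-Security-Policy header value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )

# Hypothetical locked-down policy (illustrative only):
# - default-src 'self': resources may only be loaded from the page's own origin
# - connect-src 'none': scripts cannot open fetch/XHR/WebSocket connections
# - form-action 'none': forms cannot be submitted to any URL
policy = build_csp({
    "default-src": ["'self'"],
    "connect-src": ["'none'"],
    "form-action": ["'none'"],
})
```

Here `policy` ends up as `default-src 'self'; connect-src 'none'; form-action 'none'`. A policy like this narrows the ways a page's code can send your input elsewhere, but it is not a substitute for the "assume it is unfriendly" default.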

TL;DR: You should limit the personal info that you put into web pages/apps on the internet, regardless of any assurances given, even if you trust the dev/company. Swap out personal info for fake stuff, and if that's not possible, then it's time to save up for a second-hand RTX 3090.
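
A minimal sketch of the "swap out personal info for fake stuff" advice, assuming you keep a small mapping of real details to stand-ins on your own machine. The function names (`pseudonymize`, `restore`) and the example mapping are hypothetical. This is an illustration of the idea, not a Perchance feature.

```python
import re

def pseudonymize(text: str, replacements: dict[str, str]) -> str:
    """Replace each real detail with its fake stand-in before the text leaves your machine."""
    if not replacements:
        return text
    # Longest keys first, so "Alice Smith" wins over a shorter key like "Alice".
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # Single pass with exact (case-sensitive) matching; a function replacement
    # avoids any special handling of backslashes in the stand-in text.
    return pattern.sub(lambda match: replacements[match.group(0)], text)

def restore(text: str, replacements: dict[str, str]) -> str:
    """Map stand-ins in a response back to the real details, locally."""
    for real, fake in replacements.items():
        text = text.replace(fake, real)
    return text

# Hypothetical mapping, stored only on your own computer.
mapping = {"Alice Smith": "Jordan Doe", "Springfield": "Rivertown"}
prompt = "My name is Alice Smith and I live in Springfield."
safe_prompt = pseudonymize(prompt, mapping)
```

With this mapping, `safe_prompt` is "My name is Jordan Doe and I live in Rivertown.", and `restore(safe_prompt, mapping)` gives back the original text. Exact matching will miss misspellings or differently-cased variants, so it is worth rereading the text before sending it.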

While I'm here, @VioneT@lemmy.world, IIRC someone asked a similar question (here or maybe on the /hub) about the text-to-image-plugin and I think you answered it, but just to be clear, the answer is the same as above: images are only stored if the user saves them to the gallery; otherwise they're gone forever. I'll link this comment on both the image and text AI plugins.

Some minor notes:

- The (currently undocumented) responseObj.submitUserRating feature of the ai-text-plugin allows users to rate a response to help improve the AI. If you submit a rating on a generator that uses this feature (like the thumbs up/down on perchance.org/ai-chat), then the response that you're rating is temporarily stored as part of a scoring process to determine which of several "candidate variations" of an LLM is best (varying settings, model, etc.). The generator author should make it clear that ratings should not be submitted for text that contains personal data - I may make the plugin automatically show an informational message about this in a future update.
- Some aggregate statistics are collected about prompts that are flowing through the manager server - e.g. I started collecting the ratio of PG-13 vs not on the image generation server so that I'd know if I accidentally broke my detection algorithm (such that e.g. it was flagging everything as nsfw, which happened once due to a regex error). Nothing in these aggregate statistics is even remotely private/personal. Just extremely high-level numerical statistics aggregated from literally millions of requests.
- There's also a script which tracks how many requests each IP has made in the past 2 days, which allows me to do rate limiting, and track abusive IPs. Again, prompts are not stored so there's no association between IPs and prompts - it's literally just a counter that says this IP made this many requests, and it's cleared every 2 days.
- As you probably know, the AI plugin servers are funded by ads. Perchance in general doesn't have any ads, and has always been completely free, but the AI plugins are way too expensive to fund out of my own bank account. So if you're not logged in, you'll see ads on generators that use AI plugins. I figured it's worth mentioning here that unlike basically every other ad-funded site on the internet, Perchance does not trust ads. Perchance has a sand-boxed separation between the actual generator/page contents (which live in an "iframe" - it's basically like a separate browser tab embedded within the page), and the place where ad code runs -- so ads cannot look at your chat/text/image/etc. data in order to guess at more relevant ads. Perchance uses a very reputable advertising company (same one used by Reuters and Al Jazeera and several other large companies) so the likelihood of shady ad tech is already extremely low, but there's no need for any trust here, thanks to the sand-boxing that Perchance has. So, in terms of showing you more relevant ads, all they can possibly see is the URL of the page that you're on. That's the only thing that's exposed to ad serving algorithms by visiting a Perchance page, no matter how much information you input into a Perchance generator/page.
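
The per-IP counter described in the notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual script: the class name `RequestCounter`, the `limit` value, and the in-memory storage are all hypothetical. The point it shows is that only an IP and a count are held, never any prompt text, and that everything is wiped once the window has passed.

```python
import time
from collections import Counter

WINDOW_SECONDS = 2 * 24 * 60 * 60  # "cleared every 2 days", as described above

class RequestCounter:
    """Counts requests per IP and forgets everything once the window elapses."""

    def __init__(self, limit: int, window: float = WINDOW_SECONDS, clock=time.monotonic):
        self.limit = limit      # hypothetical threshold chosen by the operator
        self.window = window
        self.clock = clock      # injectable so the reset can be exercised in tests
        self.counts = Counter() # maps IP -> request count, nothing else
        self.window_start = clock()

    def allow(self, ip: str) -> bool:
        """Record one request from `ip`; return False once it exceeds the limit."""
        now = self.clock()
        if now - self.window_start >= self.window:
            self.counts.clear()  # wipe every counter at the end of the window
            self.window_start = now
        self.counts[ip] += 1
        return self.counts[ip] <= self.limit
```

A server would call something like `counter.allow(client_ip)` before handling each request and reject the request when it returns `False`. A fixed window is used here because it matches the "cleared every 2 days" description; a sliding window would be an alternative design.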

[-] JRL@lemmy.world 5 points 2 years ago

Thank you! That answers my every question and more. I wish every service provider would be as transparent as this.

[-] april@lemmy.world 5 points 2 years ago

Assume any service offering AI is logging everything. The only way to be sure is to run the model on your own hardware.

[-] VioneT@lemmy.world 3 points 2 years ago

Pinging dev @perchance@lemmy.world.

this post was submitted on 30 Nov 2023
