410
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
this post was submitted on 16 May 2025
410 points (100.0% liked)
Technology
70048 readers
3509 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related news or articles.
- Be excellent to each other!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
- Check for duplicates before posting, duplicates may be removed
- Accounts 7 days and younger will have their posts automatically removed.
Approved Bots
founded 2 years ago
MODERATORS
That's a good reason to use open source models. If your provider does something you don't like, you can always switch to another one, or even selfhost it.
Or better yet, use your own brain.
Yep, not arguing for the use of generative AI in the slightest. I very rarely use it myself.
While true, it doesn't keep you safe from sleeper agent attacks.
These can essentially allow the creator of your model to inject (seamlessly, undetectably until the desired response is triggered) behaviors into a model that will only trigger when given a specific prompt, or when a certain condition is met. (such as a date in time having passed)
https://arxiv.org/pdf/2401.05566
It's obviously not as likely as a company simply tweaking their models when they feel like it, and it prevents them from changing anything on the fly after the training is complete and the model is distributed, (although I could see a model designed to pull from the internet being given a vulnerability where it queries a specific URL on the company's servers that can then be updated with any given additional payload) but I personally think we'll see vulnerabilities like this become evident over time, as I have no doubts it will become a target, especially for nation state actors, to simply slip some faulty data into training datasets or fine-tuning processes that get picked up by many models.