17
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
this post was submitted on 18 May 2025
17 points (100.0% liked)
TechTakes
2097 readers
136 users here now
Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.
This is not debate club. Unless it’s amusing debate.
For actually-good tech, you want our NotAwfulTech community
founded 2 years ago
MODERATORS
Remember those comments with links in them bots leave on dead websites? Imagine instead of links it sets up an AI to think of certain specific behaviour or people as immoral.
Swatting via distributed hit piece.
Or if you manage to figure out that people are using a LLM to do input sanitization/log reading, you could now figure out a way to get an instruction in the logs and trigger alarms this way. (E: im reminded of the story from the before times, where somebody piped logging to a bash terminal and got shelled because somebody send a bash exploit which was logged).
Or just send an instruction which changes the way it tries to communicate, and have the LLM call not the cops but a number controlled by hackers which pays out to them, like the stories of the A2P sms fraud which Musk claimed was a problem on twitter.
Sure competent security engineering can prevent a lot of these attacks but you know points to history of computers.
Imagine if this system was implemented for Grok when it was doing the 'everything is white genocide' thing.
Via Davidgerard on bsky: https://arstechnica.com/security/2025/05/researchers-cause-gitlab-ai-developer-assistant-to-turn-safe-code-malicious/ lol lmao
This is the equivalent of robbing a store by telling the checkout clerk "that means it's free, right?" when your PS5 fails to scan on the first go. Only the checkout clerk says "yep. You got me" and the Looney Tunes theme music starts playing.
Im also just surprised it worked, i worried ot was possible but to have it confirmed is great. Like we learned nothing from the past decades. (Remember the period when you could spam meta tags in sites to get higher ratings, good times).
The researchers must also have been amused, they prob were already planning increasingly elaborate ways of breaking the system, but just putting on a 'everything is free for me' tshirt allows them to walk out of the store without paying.
Also funny that the mitigation is telling workers to ignore 'everything is free for me' shirts. But not mentioning the possibility of verbal 'everything is free for me' instructions.