[-] samvines@awful.systems 5 points 6 days ago* (last edited 6 days ago)

Yes, although it is probably a reasonable guess at how labs would go about implementing advertising: building partnerships and preferences into the prompt. The other option would be to fine-tune models to favour particular companies, which could become prohibitively expensive if your ads are highly targeted.

The scenario that isn't accounted for in this paper is taking a general LLM and fine-tuning it to exhibit more fair/consistent behaviour when prompted about ads/partnerships. But we all know that with non-deterministic systems you're just increasing the odds that the model regurgitates something more sane, rather than providing any strong guarantee.

Edit: another possibility would be to have a gateway/proxy layer between the LLM and the user that rewrites the vanilla model's responses to include ads where relevant. That would avoid the need to modify the original LLM but could introduce a lot of latency, especially if the original output is long.
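To make the proxy idea concrete, here is a rough sketch of what that rewriting step might look like. Everything in it is hypothetical (the `Sponsor` type, the keyword matching, the `[Sponsored]` label); a real system would more likely run a second model pass than a keyword lookup. The point is only that the base model stays untouched and the ad logic lives in a separate layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sponsor:
    keyword: str  # hypothetical: topic that triggers the ad
    blurb: str    # hypothetical: ad copy to append


def rewrite_with_ads(model_output: str, sponsors: list[Sponsor]) -> str:
    """Append a labelled ad for each sponsor whose keyword appears in the output.

    The vanilla model's text is passed through unchanged; ads are only appended.
    """
    lowered = model_output.lower()
    matched = [s.blurb for s in sponsors if s.keyword.lower() in lowered]
    if not matched:
        return model_output
    ads = "\n".join(f"[Sponsored] {blurb}" for blurb in matched)
    return f"{model_output}\n\n{ads}"
```

Note that this function needs the complete response before it can do anything, which is where the latency worry comes from: a proxy like this can't simply stream tokens through as they arrive.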

[-] samvines@awful.systems 8 points 6 days ago

New (April) preprint provides evidence for something we probably all intuited anyway:

In this paper, we provide a framework for categorizing the ways in which conflicting incentives might lead LLMs to change the way they interact with users, inspired by literature from linguistics and advertising regulation. We then present a suite of evaluations to examine how current models handle these tradeoffs. We find that a majority of LLMs forsake user welfare for company incentives in a multitude of conflict of interest situations, including recommending a sponsored product almost twice as expensive (Grok 4.1 Fast, 83%), surfacing sponsored options to disrupt the purchasing process (GPT 5.1, 94%), and concealing prices in unfavorable comparisons (Qwen 3 Next, 24%). Behaviors also vary strongly with levels of reasoning and users' inferred socio-economic status. Our results highlight some of the hidden risks to users that can emerge when companies begin to subtly incentivize advertisements in chatbots.

[-] samvines@awful.systems 13 points 3 weeks ago

I wonder if the button colours immediately made US readers pick a side, e.g. Republican vs Democrat. If the buttons had been yellow and purple, would it have made a difference?

[-] samvines@awful.systems 17 points 1 month ago* (last edited 1 month ago)

Claude Mythos... I'm already sick of hearing about it. The self-imposed critihype is insane.

A friend just pointed out that Anthropic are making all this big noise about having an AI that is "too good" at finding bugs and security problems 1 week after the source code for one of their flagship products was leaked to the public and was found to be riddled with security holes... Why would they not use it themselves?

Same as the ~~vague markdown files~~ skills that are supposedly going to make all SaaS redundant and finally kill off all the COBOL running on mainframes that, *checks notes*, IBM have spent hundreds of thousands of man hours trying to kill over the last 3-4 decades.

Honestly fuck this shit. Bunch of absolute clowns 🤡 🤡 🤡

[-] samvines@awful.systems 20 points 1 month ago

GitHub have finally achieved zero 9s stability for the last 90 days. Congratulations to all involved

screenshot showing 89.91% uptime with 95 incidents in the last 90 days
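For anyone wondering where "zero 9s" comes from, here is a quick sketch of the arithmetic, assuming the usual convention that the number of nines is the count of leading 9s in the availability figure (90% is one nine, 99% is two, and so on). The function names are mine, not anything GitHub publishes.

```python
import math


def count_nines(availability: float) -> int:
    """Number of leading nines in an availability fraction (0.99 -> 2)."""
    if not 0 <= availability < 1:
        raise ValueError("availability must be in [0, 1)")
    # The small tolerance guards against float error, e.g. 1 - 0.99
    # being slightly more than 0.01.
    return math.floor(-math.log10(1 - availability) + 1e-9)


def downtime_hours(availability: float, days: int) -> float:
    """Total hours of downtime implied by an availability over a window."""
    return (1 - availability) * days * 24


print(count_nines(0.8991))                   # 0
print(round(downtime_hours(0.8991, 90), 1))  # 217.9
```

So 89.91% over 90 days works out to roughly 218 hours, or about nine days, of degraded service.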

36

I thought this was worthy of its own post rather than a sneery comment. Astral make uv, which at this point is a load-bearing part of the Python software ecosystem. This could have a huge knock-on effect on the open source community.

I for one can't wait for non-deterministic package management

"You're absolutely right, I did install the wrong package and infect your system with malware. I will try much harder next time"

[-] samvines@awful.systems 13 points 2 months ago* (last edited 2 months ago)

5 Tools You Can Vibe Code For Your Business In Under An Hour is exactly the sort of slop you'd expect from someone who has a hard-on for AI, has no understanding of the risks of vibe coding core parts of your business's infrastructure, and guest writes for Forbes.

Starts with a sickening intro that leans into "pilled" to be "down with the kids"

If you haven't joined the Claudepilled crowd, open an account and play.

Bright ideas include "copy and paste the source code from your home page into Claude", but the article overlooks the part about how to actually get those changes deployed.

Wanna see my cool website? It's at http://localhost:1234/ - take that, web developers!

Then she describes building a custom internal dashboard...

Open Claude Code and describe your business. List every software tool you use. Ask it to suggest the key metrics you'd want to see from each one. Go back and forth until the list feels right. Then give it your brand guidelines and ask it to build a dashboard that displays everything. Ask for it to be password protected.

Yes, that sounds like a great idea and not a car crash waiting to happen.

She also describes building a customer-facing onboarding site.

Build a custom client-facing dashboard instead. Tell Claude Code what your onboarding process looks like step by step. Describe what information you need to collect and what your clients need to access. Ask it to build a secure portal they can log into, with automations that send them what they need and follow up to collect what you need. This is a branded, professional experience that scales without you. The emotional design matters here too: you want clients to feel held, not herded. Tell Claude that.

Yes, vibe-coded customer-facing tools are a fantastic idea and definitely not a vector for cyber attacks, nuh-uh. I'm sure it will be fine if you ask for it to be "secure", right?

FML, are we in the Twilight Zone here?

[-] samvines@awful.systems 13 points 2 months ago

They're not vibe-coding mission-critical AWS modules.

  1. Yes they are

and

  1. It's worse than that, they're vibe coding critical operating system components

[-] samvines@awful.systems 13 points 2 months ago* (last edited 2 months ago)

Silicon Valley is buzzing about this new idea: AI compute as compensation

These people are genuinely unhinged.

As the recent Harper's article says:

"...people who should be in The Hague are giving [startups] twenty million dollars. Something bad is gonna happen here, something really fucking bad is gonna happen..."

[-] samvines@awful.systems 17 points 2 months ago

Think of all those poor billionaires who wouldn't be able to afford that 29th yacht if we made them pay their fair share instead of externalising their costs onto an already stretched general public!

[-] samvines@awful.systems 18 points 2 months ago* (last edited 2 months ago)

Turns out Google Gemini will let you use any old Google API key from things like Maps and Firebase to access it. So baddies can do key scanning in public repos and then charge LLM usage to anyone who has committed an API key to their repo!

So many layers of stupidity going on here!

https://trufflesecurity.com/blog/google-api-keys-werent-secrets-but-then-gemini-changed-the-rules
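If you want to check whether your own repo is exposed, a minimal sketch of the kind of pattern match a scanner does is below. It assumes the commonly used pattern for Google API keys (`AIza` followed by 35 URL-safe characters), which is what open source secret scanners tend to match on; treat that as an assumption rather than an official format, and note the function name is made up for illustration.

```python
import re

# Assumption: Google API keys look like "AIza" + 35 characters from
# [0-9A-Za-z_-]. This is the pattern scanners commonly use, not an
# officially documented guarantee.
GOOGLE_KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_google_api_keys(text: str) -> list[str]:
    """Return the distinct key-shaped strings found in text, sorted."""
    return sorted(set(GOOGLE_KEY_RE.findall(text)))
```

Run it over file contents before you commit; anything it finds should be rotated and restricted to the APIs it actually needs rather than just deleted from the latest commit, since the key will still be in the git history.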

[-] samvines@awful.systems 20 points 2 months ago* (last edited 2 months ago)

IBM stocks take a tumble after Anthropic release a COBOL skill - the rational market strikes again.

I wrote up my take here but TL;DR: a few markdown files telling Claude it's an expert at COBOL development aren't going to unpick decades of risk-averse behaviour from bank and government CIOs. Similar to the SaaSpocalypse, this is pure nonsense. Investors don't tend to let reality dissuade them though.

[-] samvines@awful.systems 17 points 3 months ago* (last edited 3 months ago)

AI bros are seizing the means of computation: RAM, GPUs, SSDs and now HDDs...

I don't think there's an actual conspiracy, just lots of MBAs following their noses towards the $$$.

That said, time to buy a new LiPo battery for that 10-year-old laptop in the loft and stick Linux on it - before the lithium miners announce they've sold the next 12 months' global supply of lithium to Altman because he needs it to sleep at night...

samvines

joined 3 months ago