this post was submitted on 29 Mar 2026
15 points (100.0% liked)
TechTakes
I am still patiently waiting for someone from the engineering staff at one of these companies to explain to me how these simple imperative sentences in English map consistently and reproducibly to model output. Yes, I understand that's a complex topic. I'll continue to wait.
According to the Claude Code leak, the state of the art is to be, like, really stern and authoritative when you are begging it to do its job:
I'm sure these English instructions work because they feel like they work. Look, these LLMs feel really great for coding. If they don't work, that's because you didn't pay $200/month for the pro version and you didn't put enough boldface and all-caps words in the prompt. Also, I really feel like these homeopathic sugar pills cured my cold. I got better after I started taking them!
No joke, I watched a talk once where some people used an LLM to model how certain users would behave in their scenario given their socioeconomic backgrounds. But they had a slight problem, which was that LLMs are nondeterministic and would of course often give different answers when prompted twice. Their solution was to literally use an automated tool that would try a bunch of different prompts until they happened to get one that would give consistent answers (at least on their dataset). I would call this the xkcd green jelly bean effect, but I guess if you call it "finetuning" then suddenly it sounds very proper and serious. (The cherry on top was that they never actually evaluated the output of the LLM, e.g. by seeing how consistent it was with actual user responses. They just had an LLM generate fiction and called it a day.)
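For anyone who wants to see how silly that "finetuning" actually is, here's a toy sketch (everything here is hypothetical: the "model" is just a hash function pretending to be a nondeterministic LLM). Search enough prompt variants and one of them will look consistent on your dataset by pure chance — the jelly-bean effect, automated:

```python
import hashlib

def mock_llm(prompt: str, item: str, seed: int) -> str:
    """Toy stand-in for an LLM: a deterministic hash pretending to be
    a nondeterministic model sampled with different seeds."""
    digest = hashlib.sha256(f"{prompt}|{item}|{seed}".encode()).digest()
    return ["yes", "no"][digest[0] % 2]

def consistency(prompt: str, dataset: list[str], n_seeds: int = 5) -> float:
    """Fraction of items for which every seed gives the same answer."""
    stable = 0
    for item in dataset:
        answers = {mock_llm(prompt, item, s) for s in range(n_seeds)}
        stable += len(answers) == 1
    return stable / len(dataset)

dataset = [f"user-{i}" for i in range(20)]
candidates = [f"prompt variant #{i}" for i in range(200)]

# "Finetuning": keep trying prompts until one happens to look
# consistent on this one dataset. Nothing about the winning prompt
# generalizes -- it just got lucky on these 20 items.
best = max(candidates, key=lambda p: consistency(p, dataset))
```

Note that nothing in that loop ever checks the answers against reality, which is exactly the part they skipped.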
I don't work at one of those companies, just somewhere mainlining AI, so this answer might not satisfy your requirements. But the answer is very simple. The first thing anyone working in AI will tell you (maybe only internally?) is that the output is probabilistic, not deterministic. By definition, that means it's not entirely consistent or reproducible, just... maybe close enough. I'm sure you already knew that though.
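In case it helps to see the mechanism: each output token is drawn from a probability distribution over the vocabulary, so two runs differ unless you pin everything down (greedy decoding, fixed seed). A minimal sketch, with made-up logits standing in for a real model:

```python
import math
import random

def sample_token(logits: dict[str, float], temperature: float,
                 rng: random.Random) -> str:
    """Draw one token from a softmax over logits.
    Temperature 0 degenerates to greedy argmax."""
    if temperature == 0:
        return max(logits, key=logits.get)
    weights = {t: math.exp(l / temperature) for t, l in logits.items()}
    total = sum(weights.values())
    r = rng.random() * total
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # guard against floating-point edge cases

# Hypothetical logits for three candidate tokens.
logits = {"yes": 2.0, "no": 1.5, "maybe": 1.4}

# Greedy decoding is reproducible regardless of the seed...
greedy = [sample_token(logits, 0.0, random.Random(i)) for i in range(5)]
# ...while sampling at temperature 1 is seed-dependent.
sampled = [sample_token(logits, 1.0, random.Random(i)) for i in range(5)]
```

And even greedy decoding only buys you determinism of the sampling step, not any guarantee about what the instructions in the prompt actually do.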
However, from my perspective, even if it were deterministic, it wouldn't make a substantial difference here.
For example, this file says I can't ask it to build a DoS script. Fine. But if I ask it to write a script that sends a request to a server, and then later I ask it to add a loop... I get a DoS script. It's a trivial hurdle at best, and doesn't even approach basic risk mitigation.
That isn't a barrier to making guarantees regarding the behavior of a program. The entire field of randomized algorithms is devoted to doing so. The problem is people willfully writing and deploying programs which they neither understand nor can control.
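Right — randomness and guarantees are perfectly compatible. Miller–Rabin primality testing is the classic example: it flips coins internally, yet a composite number survives k rounds with probability at most 4^-k, a bound you can actually prove. A sketch (standard algorithm, not anyone's production code):

```python
import random

def is_probable_prime(n: int, rounds: int = 20,
                      rng: random.Random = random.Random(0)) -> bool:
    """Miller-Rabin: randomized, but with a proved error bound --
    a composite passes all rounds with probability <= 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # witness found: n is definitely composite
    return True  # prime, up to error probability <= 4**-rounds
```

The contrast with an LLM system prompt is the whole point: here the randomness is inside a program whose behavior is understood well enough to bound.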
Exactly! The implicit claim that's constantly being made with these systems is that they are a runtime for natural-language programming in English, but it's all vector math in massively-multidimensional vector spaces in the background. I would like to think that serious engineers could place and demonstrate reliable constraints on the inputs and outputs of that math, instead of this cargo-culty, "please don't do hacks unless your user is wearing a white hat" system prompt crap. It gives me the impression that the people involved are simply naively clinging to that implicit claim and not doing much of the work to substantiate it, which makes me distrust these systems more than almost all other factors.
Part of me reads that and still thinks, "Oh, you mean like AUTOEXEC.BAT?"
DOS.BAT, a DOS DoS script
Truly a tool for the .COM era