Stubsack: weekly thread for sneers not worth an entire post, week ending 31st May 2026 (awful.systems)

submitted 1 month ago by BlueMonday1984@awful.systems to c/techtakes@awful.systems

Want to wade into the sandy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid.

Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)

[-] fiat_lux@lemmy.zip 16 points 1 month ago* (last edited 1 month ago)

In the latest episode of "behold the power of Mythos" from The Hacker News: "Claude Mythos AI Finds 10,000 High-Severity Flaws in Widely Used Software"

I distilled it so you don't have to.

Of these vulnerabilities, 6,202 have been classified as high- or critical-severity flaws impacting more than 1,000 open-source projects.

That 10,000 count didn't even survive until paragraph 3.

Subsequent analysis of these [6202] vulnerability candidates has identified that 1,726 are valid true positives.

Ah fuck. 1726. But wait, a bad infographic has entered the ring!

23,019 potential vulnerability candidates

Ok now we're talking.

1,900 Reviewed by external security firms

Wait, what? Why those? Why only those?

1726 confirmed positive

You couldn't even cherry pick the valid ones?

467 reported to maintainers

Where did the other 1259 go? Maybe this other part of the flowchart will go better...

1,129 reported direct to maintainers by Anthropic, at their request (May contain false positives)

1129 + 467 = 1596 total reported to maintainers

Most of them just spammed at open source maintainers. Right. Maybe Anthropic's media release has the goods!

1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves

Slightly lower than the 1900, but ok, whatever.

Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity

1587 is lower than the infographic's 1726 confirmed positives.... But 10% of 10000 high sev is still something, right?

On maintainers’ request, we sometimes disclose bugs directly, without further assessment. We’ve now reported 1,129 such unvetted bugs, of which Mythos Preview estimated that 175 were high- or critical-severity.

I'm sure those maintainers enjoyed that 16% high+ sev rate based on Mythos' own estimations. But wasn't that 1129 the bulk of your reports?

We estimate that we’ve disclosed 530 high- or critical-severity bugs to maintainers so far. There are a further 827 confirmed vulnerabilities (estimated as high- or critical-severity in the same manner) that we’re aiming to disclose as quickly as possible.

530 is only a third of the reports you made to maintainers...

65 of those have been given public advisories

The infographic says 88.

I'd ask if they were massaging their financials like they massaged 65 advisories, but we know they are.

23,019 potential vulnerability candidates of all severities, 65 advisories. If you printed the code out and drunkenly threw darts at it you'd probably hit the same level of accuracy.
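For anyone who wants to re-run the sums in this comment, here is a minimal sketch. It only uses the figures quoted above from the infographic and the press release; the variable names are made up for illustration and nothing here comes from Anthropic's own tooling.

```python
# Figures as quoted in the comment above. Names are illustrative only.
candidates = 23019          # infographic: potential vulnerability candidates
confirmed_infographic = 1726
reported_vetted = 467       # infographic: reported to maintainers
reported_unvetted = 1129    # reported directly at maintainers' request
assessed = 1752             # press release: assessed by external firms
true_positives = 1587       # press release: valid true positives
high_or_critical = 1094     # press release: confirmed high/critical
unvetted_high = 175         # Mythos' own estimate within the 1,129
disclosed_high = 530        # press release: high/critical disclosed so far
advisories_press = 65
advisories_infographic = 88

# Totals and gaps mentioned in the comment
total_reported = reported_vetted + reported_unvetted          # 1596
confirmed_not_reported = confirmed_infographic - reported_vetted  # 1259

# Rates, as fractions
true_positive_rate = true_positives / assessed
high_rate = high_or_critical / assessed
unvetted_high_rate = unvetted_high / reported_unvetted
disclosed_high_share = disclosed_high / total_reported
advisory_rate = advisories_press / candidates

print(f"total reported to maintainers: {total_reported}")
print(f"confirmed but not reported:    {confirmed_not_reported}")
print(f"true positive rate:            {true_positive_rate:.1%}")
print(f"high/critical rate:            {high_rate:.1%}")
print(f"unvetted high/critical rate:   {unvetted_high_rate:.1%}")
print(f"high/critical share of reports:{disclosed_high_share:.1%}")
print(f"advisories per candidate:      {advisory_rate:.2%}")
print(f"advisory count mismatch:       {advisories_infographic - advisories_press}")
```

The percentages quoted in the press release (90.6% and 62.4%) do follow from its own counts; the mismatches are between the press release and the infographic, not inside either one.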

[-] V0ldek@awful.systems 11 points 1 month ago

All that it tells me is that if you spent the same amount of resources on just fuzzing randomly picked OSS codebases you'd probably get better value for your buck.

[-] froztbyte@awful.systems 6 points 1 month ago

I’ve seen a handful of security people claim different kinds of yields with some of this shit. I haven’t gone to read up in depth but I wouldn’t be too surprised if a lot of them run around with unstated assumptions/provisos in their thonkposts (this shit is expensive (for research volume) and only some people can afford the science experiments)

Got a list of a couple of names I’m keeping an eye on as the first tokenprice-pocalypse (that needs a better word) takes place

[-] blakestacey@awful.systems 10 points 1 month ago

Vibenarok

[-] macroplastic@sh.itjust.works 8 points 1 month ago

Eschatoken

[-] froztbyte@awful.systems 3 points 1 month ago

ooh, brava!

[-] o7___o7@awful.systems 3 points 1 month ago

Perfection

[-] CinnasVerses@awful.systems 8 points 1 month ago

Anthropic (who own Claude Code) are hoping to IPO this year.

[-] schnoopy@awful.systems 7 points 1 month ago

1 cve, 100 things that might have mattered.

Two orders of magnitude of false positives doesn't sound like an efficient use of labour for finding vulnerabilities, but that's just me.

[-] froztbyte@awful.systems 6 points 1 month ago* (last edited 1 month ago)

it continues to be amazing to me that this is the “high impact” area they’re going with: even if their analysis systems are better (and frankly I still don’t buy this wholesale, there’s a whole rest of the owl being handwaved[0]), bug-elimination is by definition diminishing returns so you can only fanfare like this the first time

[0] - having fucking gigantic budgets to throw at running a parse of every single repo, and every test condition/simulation you wish to, certainly does help a hell of a lot, even more so when you can shell out to a half-dozen second-stage review corps…

[-] fiat_lux@lemmy.zip 6 points 1 month ago

I honestly can't think of anywhere else they can go with it. They need:

- something with a binary pass/fail to claim solid numbers at all
- something where copy paste is a viable strategy
- sufficient public training data from which to derive that copy paste strategy, and,
- scary enough consequences to frame any success as impact.

Code security review is probably the only way you can realistically achieve all four. But they're not even coming close. Not even with access to "partner" black box repositories coupled with under-resourced open source packages.

And they know they're not succeeding, because they wouldn't bury that 530 high+ sev number deep in the middle of the press release if they thought it were impressive.

Luckily for them, the slop "news" blogs will parrot numbers like 10k, and their only strength - model collapse as a marketing strategy - can handwave the rest of that owl.

[-] YourNetworkIsHaunted@awful.systems 3 points 1 month ago

So what's the over/under on the discrepancies between the numbers that the HN folks got and the official press release numbers being in part due to some kind of hallucinatron hijinks? Because I'm gonna go ahead and predict with confidence that either the HN post was written with a faulty slopbot and they didn't check it or else the presser itself went through the matrix-multiplication-meaning-mangler. Possibly both and all those numbers are similar levels of "more or less right, we swear"

[-] fiat_lux@lemmy.zip 9 points 1 month ago

It's almost certainly a slop article, but to its credit, it did accurately cite the numbers from the official Anthropic flowchart image. (Also, just to be clear, this is an Indian "#1 cybersecurity news" company doing an SEO piggyback off the orange site, not the orange site itself).

However, Anthropic's numbers in their official post do not match their own flowchart, despite being presented together. My assumption is they made the image, post, and yet another fucking dashboard earlier, then failed to keep them all in sync when someone revised the numbers up or down.

The dashboard timestamp claims it's showing the latest numbers as of 2026-05-22 10:27 PT (17:27Z), with values that match the numbers in the image. The post's created timestamp is 2026-05-20T14:07:48Z, and it was later updated at 2026-05-22T20:37:40Z. I'm guessing that update was to swap the image, and the fact that some of the values are also quoted in the text was completely overlooked. Or vice versa.

It's the kind of attention to detail I've come to expect from Anthropic.
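The timestamp comparison above can be checked with a short sketch. It assumes "PT" on 22 May means Pacific Daylight Time (UTC-7), which is my assumption rather than anything the dashboard states; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PT" in late May is daylight time, i.e. a fixed UTC-7 offset.
PDT = timezone(timedelta(hours=-7))

dashboard_as_of = datetime(2026, 5, 22, 10, 27, tzinfo=PDT)
post_created = datetime(2026, 5, 20, 14, 7, 48, tzinfo=timezone.utc)
post_updated = datetime(2026, 5, 22, 20, 37, 40, tzinfo=timezone.utc)

# Convert the dashboard time to UTC so all three are comparable
dashboard_utc = dashboard_as_of.astimezone(timezone.utc)

# How long after the dashboard snapshot the post was last edited
update_lag = post_updated - dashboard_utc

print("dashboard (UTC):", dashboard_utc.isoformat())
print("post updated after dashboard snapshot by:", update_lag)
print("post existed before dashboard snapshot:", post_created < dashboard_utc)
```

Under that assumption the post's last edit lands a few hours after the dashboard snapshot, which is consistent with the guess that the image was swapped in during that edit while the quoted numbers in the text stayed as they were.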

this post was submitted on 24 May 2026
