
Want to wade into the sandy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid.

Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)

[-] fiat_lux@lemmy.zip 11 points 17 hours ago* (last edited 17 hours ago)

In the latest episode of "behold the power of Mythos" from The Hacker News - Claude Mythos AI Finds 10,000 High-Severity Flaws in Widely Used Software

I distilled it so you don't have to.

Of these vulnerabilities, 6,202 have been classified as high- or critical-severity flaws impacting more than 1,000 open-source projects.

That 10,000 count didn't even survive until paragraph 3.

Subsequent analysis of these [6202] vulnerability candidates has identified that 1,726 are valid true positives.

Ah fuck. 1726. But wait, a bad infographic has entered the ring!

23,019 potential vulnerability candidates

Ok now we're talking.

1,900 Reviewed by external security firms

Wait, what? Why those? Why only those?

1726 confirmed positive

You couldn't even cherry pick the valid ones?

467 reported to maintainers

Where did the other 1259 go? Maybe this other part of the flowchart will go better...

1,129 reported direct to maintainers by Anthropic, at their request (May contain false positives)

1129 + 467 = 1596 total reported to maintainers

Most of them just spammed at open source maintainers. Right. Maybe Anthropic's media release has the goods!

1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves

Slightly lower than the 1900, but ok, whatever.

Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity

1587 is lower than the infographic's 1726 confirmed positives.... But 10% of 10000 high sev is still something, right?

On maintainers’ request, we sometimes disclose bugs directly, without further assessment. We’ve now reported 1,129 such unvetted bugs, of which Mythos Preview estimated that 175 were high- or critical-severity.

I'm sure those maintainers enjoyed that 16% high+ sev rate based on Mythos' own estimations. But wasn't that 1129 the bulk of your reports?

We estimate that we’ve disclosed 530 high- or critical-severity bugs to maintainers so far. There are a further 827 confirmed vulnerabilities (estimated as high- or critical-severity in the same manner) that we’re aiming to disclose as quickly as possible.

530 is only a third of the reports you made to maintainers...

65 of those have been given public advisories

The infographic says 88.

I'd ask if they were massaging their financials like they massaged 65 advisories, but we know they are.

23,019 potential vulnerability candidates of all severities, 65 advisories. If you printed the code out and drunkenly threw darts at it you'd probably hit the same level of accuracy.
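For anyone who wants to re-run the funnel arithmetic, here is a rough sketch in Python. Every figure is copied from the quotes above and none of it is independently verified; the dictionary keys and the `pct` helper are made-up names for illustration, not anything from the article or the press release.

```python
# Figures as quoted in this comment; labels are illustrative assumptions.
article = {"headline_high_sev": 10_000, "high_or_critical": 6_202}
infographic = {
    "candidates": 23_019,
    "externally_reviewed": 1_900,
    "confirmed_positive": 1_726,
    "reported_after_review": 467,
    "reported_unvetted": 1_129,
    "advisories": 88,
}
press_release = {
    "assessed": 1_752,
    "true_positives": 1_587,
    "confirmed_high_or_critical": 1_094,
    "unvetted_reported": 1_129,
    "unvetted_est_high_or_critical": 175,
    "disclosed_est_high_or_critical": 530,
    "advisories": 65,
}


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Infographic: reports sent to maintainers, and confirmed findings not yet sent.
total_reported = (
    infographic["reported_after_review"] + infographic["reported_unvetted"]
)
confirmed_not_reported = (
    infographic["confirmed_positive"] - infographic["reported_after_review"]
)

# Press release: rates among the assessed and the unvetted findings.
true_positive_rate = pct(
    press_release["true_positives"], press_release["assessed"]
)
confirmed_high_rate = pct(
    press_release["confirmed_high_or_critical"], press_release["assessed"]
)
unvetted_high_rate = pct(
    press_release["unvetted_est_high_or_critical"],
    press_release["unvetted_reported"],
)

# Cross-source ratios: these mix infographic and press release numbers,
# which do not agree with each other, so treat them as ballpark only.
high_share_of_reports = pct(
    press_release["disclosed_est_high_or_critical"], total_reported
)
advisories_per_candidate = pct(
    press_release["advisories"], infographic["candidates"]
)

for name, value in [
    ("total reported to maintainers", total_reported),
    ("confirmed but not yet reported", confirmed_not_reported),
    ("true positive rate among assessed (%)", true_positive_rate),
    ("confirmed high+ among assessed (%)", confirmed_high_rate),
    ("Mythos-estimated high+ among unvetted (%)", unvetted_high_rate),
    ("estimated high+ share of all reports (%)", high_share_of_reports),
    ("advisories per candidate (%)", advisories_per_candidate),
]:
    print(f"{name}: {value}")
```

The last two ratios divide a press release number by an infographic number, so they inherit whatever disagreement exists between the two sources.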

[-] schnoopy@awful.systems 1 points 12 minutes ago

1 cve, 100 things that might have mattered.

2 orders of magnitude false positives doesn't sound like an efficient use of labour for finding vulnerabilities but that's just me.

[-] YourNetworkIsHaunted@awful.systems 1 points 37 minutes ago

So what's the over/under on the discrepancies between the numbers that the HN folks got and the official press release numbers being in part due to some kind of hallucinatron hijinks? Because I'm gonna go ahead and predict with confidence that either the HN post was written with a faulty slopbot and they didn't check it or else the presser itself went through the matrix-multiplication-meaning-mangler. Possibly both and all those numbers are similar levels of "more or less right, we swear"

[-] V0ldek@awful.systems 8 points 17 hours ago

All that it tells me is that if you spent the same amount of resources on just fuzzing randomly picked OSS codebases you'd probably get better value for your buck.

[-] froztbyte@awful.systems 4 points 16 hours ago

I’ve seen a handful of security people claim different kinds of yields with some of this shit. I haven’t gone to read up in depth but I wouldn’t be too surprised if a lot of them run around with unstated assumptions/provisos in their thonkposts (this shit is expensive (for research volume) and only some people can afford the science experiments)

Got a list of a couple of names I’m keeping an eye on as the first tokenprice-pocalypse (that needs a better word) takes place

[-] CinnasVerses@awful.systems 6 points 16 hours ago

Anthropic (who own Claude Code) are hoping to IPO this year.

[-] froztbyte@awful.systems 5 points 16 hours ago* (last edited 1 hour ago)

it continues to be amazing to me that this is the “high impact” area they’re going with: even if their analysis systems are better (and frankly I still don’t buy this wholesale, there’s a whole rest of the owl being handwaved[0]), bug-elimination is by definition diminishing returns so you can only fanfare like this the first time

[0] - having fucking gigantic budgets to throw at running a parse of every single repo and every test condition/simulation you wish to certainly does help a hell of a lot, even more so when you can shell out to a half-dozen second-stage review corps…

[-] fiat_lux@lemmy.zip 5 points 8 hours ago

I honestly can't think of anywhere else they can go with it. They need:

  • something with a binary pass/fail to claim solid numbers at all
  • something where copy paste is a viable strategy
  • sufficient public training data from which to derive that copy paste strategy, and,
  • scary enough consequences to frame any success as impact.

Code security review is probably the only way you can realistically achieve all four. But they're not even coming close. Not even with access to "partner" black box repositories coupled with under-resourced open source packages.

And they know they're not succeeding, because they wouldn't bury that 530 high+ sev number deep in the middle of the press release if they thought it were impressive.

Luckily for them, the slop "news" blogs will parrot numbers like 10k, and their only strength - model collapse as a marketing strategy - can handwave the rest of that owl.

this post was submitted on 24 May 2026

TechTakes


Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago