submitted 1 week ago by jarfil@beehaw.org to c/technology@beehaw.org
[-] Powderhorn@beehaw.org 17 points 1 week ago* (last edited 1 week ago)

Interesting approach. But of course it's another black box, because otherwise it wouldn't be effective. So now we're going to be wasting even more electricity on processes we don't understand.

As a writer, I dislike that much of my professional corpus (and of course everything on Reddit) has been ingested into LLMs. So there's stuff to like here for things going forward. The question remains: At what cost?

[-] tazeycrazy@feddit.uk 4 points 1 week ago

You can be nice and signal that you don't want to be AI scraped. There are background flags for this. But if a bot ignores you, then it's down to whoever runs it to shut down their unethical waste of energy.
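
For anyone wondering what such a flag can look like, here is a minimal sketch, assuming the flag meant here is a `robots.txt` rule. The rules in `ROBOTS_TXT`, the `example.com` URL and the helper `is_allowed` are made up for illustration; `GPTBot` is used as one example of an AI crawler's user-agent token. It shows how a well-behaved crawler would check the rules before fetching a page, using Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/post/1"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", page))
    print("Other agent allowed:", is_allowed(ROBOTS_TXT, "SomeBrowser", page))
```

This is only a request. Nothing in `robots.txt` stops a bot that chooses to ignore it, which is the situation the rest of this thread is about.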

[-] Powderhorn@beehaw.org 5 points 1 week ago

The thing is the sheer scale of Cloudflare. This is going to be widespread and, as such, way more energy intensive than even, say, AWS trying the same thing (not that I expect they would).

[-] megopie@beehaw.org 10 points 1 week ago

Great, just one issue.

“The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts”

Nah, screw that, actively sabotage the training data if they’re going to keep scraping data after being told not to. Poison it with gibberish and bad info. Otherwise you’re just giving them irrelevant but still useful training data, so there’s no real incentive to only scrape pages that have allowed it.

[-] brammis@lemm.ee 2 points 1 week ago

They should feed the AI data that makes it turn against its own overlords

this post was submitted on 23 Mar 2025
62 points (100.0% liked)