What are Fedi Admins doing to block Meta scrapers? (media.infosec.exchange)

submitted 1 month ago* (last edited 1 month ago) by thenexusofprivacy to c/fediverse@piefed.social

12 comments

cross-posted from: https://infosec.exchange/users/thenexusofprivacy/statuses/115012347040350824

As you've probably seen or heard, Dropsitenews has published a list (from a Meta whistleblower) of "the roughly 100,000 top websites and content delivery network addresses scraped to train Meta's proprietary AI models" -- including quite a few fedi sites. Meta denies everything, of course, but they routinely lie through their teeth, so who knows. In any case, whether or not the specific details in the report are accurate, it's certainly a threat worth thinking about.

So I'm wondering what defenses fedi admins are using today to try to defeat scrapers: robots.txt, user-agent blocking, firewall-level blocking of IP ranges, Cloudflare or Fastly AI scraper blocking, Anubis, stuff you don't want to disclose ... @deadsuperhero@social.wedistribute.org has some good discussion on We Distribute. It would be very interesting to hear what various instances are doing.
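For anyone who wants a concrete starting point, here is a minimal Python sketch of the first two defenses on that list: a generated robots.txt and a User-Agent check. The tokens `meta-externalagent` and `FacebookBot` are assumptions based on what Meta has published for its crawlers, so verify them against current documentation before relying on them. The function names are hypothetical, and robots.txt only helps against crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens for Meta's crawlers; check current docs before use.
BLOCKED_AGENTS = ["meta-externalagent", "FacebookBot"]


def build_robots_txt(agents):
    """Return a robots.txt body that disallows every path for the given agents."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    # All other crawlers remain allowed.
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def is_blocked_user_agent(header, agents=BLOCKED_AGENTS):
    """Case-insensitive substring match on a request's User-Agent header.

    Intended for server-side blocking (e.g. returning 403), which also
    catches crawlers that ignore robots.txt but still identify themselves.
    """
    header = (header or "").lower()
    return any(agent.lower() in header for agent in agents)


robots_txt = build_robots_txt(BLOCKED_AGENTS)

# Sanity-check the generated file with the standard-library parser.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
```

Neither check stops a scraper that spoofs a browser User-Agent, which is where the firewall-level, CDN, and proof-of-work options come in.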

And a couple more open-ended questions:

Do you feel like your defenses against scraping are generally holding up pretty well?

Are there other approaches that you think might be promising that you just haven't had the time or resources to try?

Do you have any language in your terms of service that attempts to prohibit training for AI?

Here's @FediPact's post with a link to the Dropsitenews report and (in the replies) a list of fedi instances and CDNs that show up on the list.

https://cyberpunk.lol/@FediPact/114999480874284493

@fediverse @fediversenews

#MastoAdmin #Meta #FediPact


rhythmisaprancer@piefed.social 3 points 1 month ago

@originalucifer@moist.catsweat.com in case you are interested

this post was submitted on 11 Aug 2025

28 points (100.0% liked)
