Should lemmy.ml block ChatGPT scraping in robots.txt? (lemmy.ml)

submitted 1 year ago by GnuLinuxDude@lemmy.ml to c/meta@lemmy.ml


Some context about this here: https://arstechnica.com/information-technology/2023/08/openai-details-how-to-keep-chatgpt-from-gobbling-up-website-data/

The robots.txt would be updated with this entry:

User-agent: GPTBot
Disallow: /

Obviously this is meaningless against non-OpenAI scrapers or anyone who just doesn't give a shit.
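
For anyone who wants to sanity-check what that entry does, here's a rough sketch using Python's standard `urllib.robotparser`. It only models a crawler that chooses to obey robots.txt. The `SomeOtherBot` agent name, the `can_fetch` helper, and the example URL are made up for illustration and aren't anything lemmy.ml actually uses.

```python
from urllib.robotparser import RobotFileParser

# The entry proposed in the post, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def can_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a well-behaved crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "SomeOtherBot" is a hypothetical agent with no matching rule.
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, can_fetch(agent, "https://lemmy.ml/c/meta"))
```

A compliant GPTBot is refused everywhere, while any agent without its own rule is still allowed, which is the limitation mentioned above: the entry is a request, not an enforcement mechanism.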


Geist_@lemmy.world 1 point 1 year ago (last edited 1 year ago)

... It's probably going to recommend paid and non-FOSS apps and programs just on the basis that those companies will probably pay to be the top suggestions, just like Google ads. So no, I don't think that's a good enough reason. They can still scrape wikis if they need info on FOSS sites, imo. Those shouldn't (?) block AIs and other aggregators.

this post was submitted on 20 Aug 2023

36 points (100.0% liked)
