[-] online@lemmy.ml 8 points 1 year ago* (last edited 1 year ago)

Speaking of this, which parts of the fediverse have added rules to their respective robots.txt to block crawlers that collect training data for generative AI?

https://blog.google/technology/ai/an-update-on-web-publisher-controls/
https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
https://techcrunch.com/2023/09/28/medium-hints-at-a-nascent-media-coalition-to-block-ai-crawlers/

It looks like there are a handful of lines you'd have to add to robots.txt, one group per crawler.

Is there anywhere that keeps a comprehensive list of these?
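
For anyone wondering what those lines look like, here is a rough sketch rather than a definitive list. It assumes three crawler tokens: GPTBot (OpenAI), Google-Extended (the token from the Google post linked above), and CCBot (Common Crawl). The list `AI_CRAWLER_TOKENS` and both helper functions are made up for illustration, and a real instance would likely want more tokens. The sketch builds the robots.txt groups and then checks them with Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Assumed, non-comprehensive list of crawler tokens used for AI opt-outs.
AI_CRAWLER_TOKENS = ["GPTBot", "Google-Extended", "CCBot"]


def build_block_rules(tokens):
    """Return robots.txt text that disallows every path for each token."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    return "\n\n".join(groups) + "\n"


def blocked_tokens(robots_txt, tokens, path="/"):
    """Return the tokens that the given robots.txt text forbids from `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [token for token in tokens if not parser.can_fetch(token, path)]


rules = build_block_rules(AI_CRAWLER_TOKENS)
print(rules)
print(blocked_tokens(rules, AI_CRAWLER_TOKENS))
```

Keep in mind that robots.txt is only a request: it works for crawlers that choose to honor it and does nothing against scrapers that ignore it.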

[-] breakfastmtn@lemmy.ca 8 points 1 year ago

This would be great for us.

Keep shooting your feet, Reddit! You got this!

[-] thisisawayoflife@lemmy.world 8 points 1 year ago

Do it! Do it! Do it!

[-] autotldr@lemmings.world 7 points 1 year ago

This is the best summary I could come up with:


The Washington Post reported Friday that Reddit might cut off Google and force users to log in to Reddit itself to read anything if it can’t reach deals with generative AI companies to pay for its data.

The Washington Post’s report wasn’t just focused on Reddit — it’s about how more than 535 news organizations have opted to block their content from being scraped by companies like OpenAI to help train products such as ChatGPT.

According to the original report, Reddit is in negotiations with AI companies to get them to pay to use its data, and if it couldn’t strike those agreements, it might require logins to see content.

That could have the knock-on effect of preventing Reddit results from showing up in Google searches.

(In my June interview with Reddit CEO Steve Huffman, he said that “we’re in talks” with AI companies about the pricing changes.)

X, formerly Twitter, has also implemented new pricing tiers for accessing its API, and X owner Elon Musk blamed data scraping by AI startups as a way to justify the reading limits implemented this summer.


The original article contains 353 words, the summary contains 183 words. Saved 48%. I'm a bot and I'm open source!

[-] quinkin@lemmy.world 7 points 1 year ago
[-] banneryear1868@lemmy.world 7 points 1 year ago

I hope they do. Finding irrelevant reddit threads from years ago is a constant annoyance.

[-] BarterClub@sh.itjust.works 7 points 1 year ago

Man, they are going the way of Twitter. Jeez.

[-] archchan@lemmy.ml 6 points 1 year ago

Hopefully they both shoot each other out of existence.

[-] Karyoplasma@discuss.tchncs.de 6 points 1 year ago

ChatGPT, define "delusion of grandeur" for me!

[-] FatTony@lemm.ee 6 points 1 year ago

This isn't coping. This is cocaing.

[-] unwinagainstable@lemmy.world 5 points 1 year ago

This would be a huge blow. I use Google a ton to find relevant content on Reddit. It's still a useful way to find helpful comments even after the mass exodus and deleting of old comments. This seems like it would be much more harmful.

[-] Ydna@lemmy.world 5 points 1 year ago

Bold move cotton

this post was submitted on 20 Oct 2023
1239 points (100.0% liked)

Technology

60009 readers
1921 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each other!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below; to ask if your bot can be added, please contact us.
  9. Check for duplicates before posting; duplicates may be removed.

Approved Bots


founded 2 years ago