rulebots.txt (lemmy.world)
submitted 4 weeks ago by GroupNebula563@lemmy.world to c/196
[-] shikogo@pawb.social 70 points 4 weeks ago

I am confused, does this mean Reddit is not going to be searchable on search engines anymore?

[-] Aeri@lemmy.world 66 points 4 weeks ago

Oh no, Reddit is, like, the only way to have Google still be useful.

[-] germanatlas 54 points 4 weeks ago

Funnily enough, Google is also the only way to have Reddit be useful.

Their own search function has been nothing but garbage.

[-] morgunkorn@discuss.tchncs.de 43 points 4 weeks ago

That's the catch: Google made a deal with Reddit and remains the only search engine allowed to access its data for indexing. The deal cuts off every other search engine.

[-] Vorticity@lemmy.world 27 points 4 weeks ago

Tell me that there is an antitrust suit over this.

[-] GroupNebula563@lemmy.world 26 points 4 weeks ago

There's a suit over Google in general, so this may well be part of it.

[-] TriflingToad@lemmy.world 3 points 3 weeks ago

Really? DDG will show me Reddit links. Did they have to make a web scraper or something?

[-] morgunkorn@discuss.tchncs.de 4 points 3 weeks ago

There's a cutoff date: anything indexed before the robots.txt was changed stays in the index.

[-] riodoro1@lemmy.world 31 points 4 weeks ago

We fucked the internet. It’s proprietary now.

[-] GroupNebula563@lemmy.world 11 points 4 weeks ago* (last edited 4 weeks ago)
[-] pupbiru@aussie.zone 8 points 4 weeks ago
[-] Swedneck@discuss.tchncs.de 2 points 3 weeks ago

cat5-o-nine-tails

[-] princessnorah 9 points 4 weeks ago

Good news! Google paid up and still has access I'm pretty sure.

[-] GroupNebula563@lemmy.world 1 points 3 weeks ago

That's bad news. It means the internet is dying.

[-] princessnorah 2 points 3 weeks ago

Sorry, the /s was sort of implied.

[-] GroupNebula563@lemmy.world 2 points 3 weeks ago

Ah, sorry. I have trouble with that sometimes :P

[-] GroupNebula563@lemmy.world 9 points 4 weeks ago

Perhaps. It likely depends on the crawler, though.

[-] unexposedhazard@discuss.tchncs.de 12 points 4 weeks ago

Yeah, I don't think ignoring robots.txt is even illegal. They can of course just block your crawler's IP, but that would be a cat-and-mouse game that they would lose in the end.
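
For anyone wondering what "ignoring robots.txt" means in practice: the file is only a request, and it is the crawler that decides whether to check it. Below is a minimal sketch of the check a well-behaved crawler might run, using Python's standard `urllib.robotparser`. The robots.txt contents, the bot names (`FavouredBot`, `OtherBot`), the `is_allowed` helper, and the example.com URL are all made up for illustration; this is not Reddit's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed, everyone else is not.
# This is an illustrative assumption, not a copy of any real site's file.
ROBOTS_TXT = """\
User-agent: FavouredBot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/r/some_thread"  # placeholder URL
    for bot in ("FavouredBot", "OtherBot"):
        print(bot, "allowed:", is_allowed(ROBOTS_TXT, bot, url))
```

Nothing here stops a crawler from skipping the check entirely, which is why the site's only real enforcement is blocking on its own side (for example by IP).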

this post was submitted on 21 Aug 2024
321 points (100.0% liked)