Sites scramble to block ChatGPT web crawler after instructions emerge
(arstechnica.com)
Is it possible that they offloaded the scraping to a different company to avoid direct litigation now that they're out in the open? That way they could say, "We didn't scrape your website, and you can't prove it."
Like how DDG, Ecosia, and Qwant use Bing for their data, or how the feds buy data from data brokers. Outsource the dirty job like every tech company does, and shift the blame if caught doing something unlawful.
It seems they are trying to garner some positive PR after they scraped through everything without anyone noticing.
I absolutely believe a lot of companies outsource simply because they don't want to build an internal organ to do it. Even in government, despite what Conservatives believe, most organization heads are pretty focused on core competency and push to use outsourced resources. The latter is also promoted by heavy lobbying from the companies selling the services.
This is a situation of "never attribute to malice that which can be easily explained by stupidity." Sure, some are motivated by malice or subterfuge, but most are probably just buying services because they have other things they'd rather focus on.
Why would they be concerned about litigation? As far as I know, scraping is completely legal in most/all countries (including the US, which I'm more familiar with and where they're headquartered), as long as you're respecting copyright and correctly handling PII (which they claim to be making an effort on).