Internet Archive played crucial role in tracking shady CDC data removals
(arstechnica.com)
There are alternative archival sites, some of which operate beyond the reach of US tampering, but the IA is certainly the primary one.
Unfortunately, the IA is absolutely massive. Anyone backing it up is just grabbing the pieces that matter to them personally, hopefully in a way that lets those pieces be authenticated and reassembled later. Unlike Wikipedia, we aren't talking about copies of the whole thing, not even close. I think they are near, or recently passed, 100 petabytes? Much will be lost if/when the IA is eventually targeted and disabled for whatever reason they come up with.
If the IA were to be backed up at any meaningful scale, I would ask the British to encourage their Museum to embrace the stereotype that it readily takes everything, and apply that to the internet. America can no longer be trusted to house an accurate history of anything.
To be clear, other archive sites that take snapshots of web pages are not really alternatives to the Internet Archive, which (importantly) allows uploading of arbitrary data for preservation. One example of this is mentioned in the article:
https://archive.org/details/20250128-cdc-datasets
Fair, but that just makes it worse. It means we really do have a single point of failure. Alexandria, anyone?