submitted 4 months ago* (last edited 4 months ago) by TriflingToad@sh.itjust.works to c/showerthoughts@lemmy.world

Had this thought the other day and tbh it's horrifying to think about the implications of one, or God forbid all, of them going down.
Stackoverflow too but that only applies to nerds haha

[-] jlh@lemmy.jlh.name 71 points 4 months ago

All the more reason to data hoard. The costs of storing these libraries are going down, and it is likely that everyone can have their own copy of it all in the near future.

https://annas-archive.org/blog/critical-window.html

[-] lud@lemm.ee 24 points 4 months ago

The amount of data is also increasing constantly and by a lot.

[-] dharmacurious@slrpnk.net 9 points 4 months ago

The article they linked goes over that. It's a really good read

[-] grue@lemmy.world 67 points 4 months ago

One of those is not a non-profit foundation, and that's a Problem.

[-] kamenlady@lemmy.world 19 points 4 months ago

And that one is not really comparable to the library of Alexandria.

[-] bjoern_tantau@swg-empire.de 21 points 4 months ago

But it would probably be the most interesting to future archeologists. At least all the noncommercial videos people make about their lives. The "you" part of YouTube.

[-] Tedesche@lemmy.world 51 points 4 months ago

I think it’s a bit ironic that Wikipedia hasn’t succumbed to the modern era of misinformation the way other information sources have, particularly given the warnings about it that have been given in the past. Not saying those warnings aren’t warranted, just that the way things have played out is counter to said expectations.

[-] JubilantJaguar@lemmy.world 28 points 4 months ago

There's an obvious reason for that. Wikipedia is owned by a nonprofit foundation and does not accept advertising.

[-] FundMECFSResearch 18 points 4 months ago* (last edited 4 months ago)

It definitely has, just not to as large a scale.

In practice it’s run like a hierarchical aristocracy, where admins control articles they care about and are very picky about the changes they allow.

One article about an illness contained false information about alternative medicine “treatments”, so I edited it. My edit was removed by the person who wrote most of the page. I got into an argument with them, and it turns out they have the same username and come from the same country as an account on other platforms selling alternative medicine products, which are subtly advertised on the page they control. They are also a Wikipedia admin.

Anyways I reported this to the admin team, and my report was immediately deleted by the admin I was reporting, and I got a three year ban. Mind you I have over a thousand wikipedia edits and have made some big contributions so this was quite annoying.

And this is far from the only incident. The people who are most likely to edit wikipedia pages are those who really care about, or could really benefit from the topic. So you end up having situations where companies hire agencies to improve their image by changing the wikipedia article about them and their products, same thing for celebrities.

[-] JubilantJaguar@lemmy.world 6 points 4 months ago

Interesting anecdote. Though to judge by your username, it seems you may have an agenda yourself.

So you end up having situations where companies hire agencies to improve their image by changing the wikipedia article about them and their products, same thing for celebrities

This is a major problem that takes up a lot of time for the editors. It explains some of their trigger-happiness.

That said, you have a valid point. I once tried to water down what I considered to be excessively POV language in an article about diet. This earned me an official warning for "extremism" or "conspiracism" or whatever. My impressive account pedigree also counted for nothing. So there's definitely a bit of the political bias, the power-tripping and gatekeeping that you see in any online community. But it's a bit of a conundrum too, because they are fighting an uphill battle against people with strong incentives and sometimes money too.

[-] FundMECFSResearch 8 points 4 months ago* (last edited 4 months ago)

Interesting anecdote. Though to judge by your username, it seems you may have an agenda yourself.

This wasn’t the ME/CFS article (the illness I am personally disabled by) and anyways all this happened before I became disabled.

Anyways my ban is over now, but I can’t get myself to edit wikipedia anymore. It was a pretty shitty experience and I don’t wanna go back.

And it wasn’t the only one. There's so much NPOV-violating stuff on most of the fringe articles, and whenever you edit for a more neutral tone or remove something unsupported by citations you end up in an insufferable straw man argument chain on the talk page.

The main fun part is filling out abandoned articles and making new articles yourself. But anything showing problems in other people’s work becomes really tiring really quick with all the talk page nonsense and endless reverts.

[-] daddy32@lemmy.world 5 points 4 months ago

What a shame. No way to report him higher up in the hierarchy?

[-] Mwa@lemm.ee 4 points 4 months ago

There are people who watch the most popular articles, it's not rlly misinformation.

[-] Wiz@midwest.social 33 points 4 months ago

Let's help PeerTube replace YouTube.

[-] drosophila 23 points 4 months ago

I would add Project Gutenberg and Open Street Map to your list.

[-] Gradually_Adjusting@lemmy.world 23 points 4 months ago

Alexandria was important in its time, but in terms of the volume and quality of information we keep on Wikipedia alone, it is a mosquito in the Taj Mahal.

[-] sit@lemmy.dbzer0.com 18 points 4 months ago

You can’t rely on YouTube videos staying up over time.

Better download what you might want to look up again.

[-] Wade@lemmy.world 11 points 4 months ago

Can't count on the library of Alexandria staying up over time either

[-] ellen_musk_0x@lemm.ee 3 points 4 months ago

I think we also overestimate the value of what would have been at Alexandria.

Considering everything would have been hand copied/transcribed back then, and how expensive that would have been, the selection bias would be massive.

I doubt it could compare to Wikipedia.

[-] einlander@lemmy.world 16 points 4 months ago
[-] TriflingToad@sh.itjust.works 10 points 4 months ago

wikibooks is cool, had no idea that existed. I'm sure next time I get curious at 3am I'll end up there reading about the history of 'vectors' or some other random stuff lol

[-] TriflingToad@sh.itjust.works 16 points 4 months ago

There was a video I saw (I think it was Hank or John Green), where they talked about the implications of Twitter being deleted during the start of Elon. They pulled out a joke book they bought of "1000 twitter posts" and said how it would be the only recorded proof they (personally) had of what Twitter was.

It's terrifying thinking of just how much information is just being put in the hands of companies that don't care or just on old hard drives about to give out due to funding. I wish there was a way to back up a random part of the information automatically, like an "I'll give you a terabyte of backup, make the most of it" that automatically chooses what isn't backed up already.

Also add reddit too, the amount of times I've searched a question and went through 2024 website crap then went back to the search and added "site:reddit" into DuckDuckGo and got an answer instantly.

[-] UltraGiGaGigantic@lemmy.ml 11 points 4 months ago
[-] NickwithaC@lemmy.world 6 points 4 months ago

.ml

ಠ⁠_⁠ಠ

[-] SubArcticTundra@lemmy.ml 10 points 4 months ago

There ought to be a decentralized archive of YT. ...and Archive

[-] 9point6@lemmy.world 5 points 4 months ago* (last edited 4 months ago)

The problem with YouTube is the sheer amount of storage required. Just going by the 10 Exabyte figure mentioned elsewhere in the thread, there are about 25,000 fediverse servers across all services in total IIRC, so even if you evenly split that 10EB across all of them, they would still need 400TB each just to cover what we have today.

Famously YouTube needs a petabyte of fresh storage every day, so each of those servers would need to be able to accept an additional 40GB a day.

Realistically though, any kind of decentralised archive wouldn't start with 25,000 servers, so the operational needs are going to be significantly higher in reality
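As a quick sanity check on the arithmetic above, here is a minimal sketch in Python. Every input is one of this thread's rough figures (10 EB total, 1 PB of new uploads a day, about 25,000 servers), none of them measured, and the even split with no redundancy is an assumption made purely for illustration.

```python
# Back-of-the-envelope check using the thread's own rough numbers.
# Decimal (SI) units are assumed throughout.
EB = 10**18
PB = 10**15
TB = 10**12
GB = 10**9

total_archive_bytes = 10 * EB   # figure quoted elsewhere in the thread
daily_growth_bytes = 1 * PB     # "a petabyte of fresh storage every day"
server_count = 25_000           # rough fediverse server count, from memory


def per_server(total_bytes: int, servers: int) -> float:
    """Bytes each server holds if the load is split evenly with no redundancy."""
    return total_bytes / servers


storage_each_tb = per_server(total_archive_bytes, server_count) / TB
growth_each_gb = per_server(daily_growth_bytes, server_count) / GB

print(f"{storage_each_tb:.0f} TB per server today")
print(f"{growth_each_gb:.0f} GB per server per day")
```

Any redundancy multiplies these numbers: keeping three copies of everything would mean roughly 1.2 PB per server instead of 400 TB.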

[-] coronach@lemmy.sdf.org 3 points 4 months ago

I know it's totally subjective, but I wonder how much "non-trash" YouTube is uploaded each day?

[-] FundMECFSResearch 9 points 4 months ago
[-] possiblylinux127@lemmy.zip 1 points 4 months ago

Is it still around? I thought they were arrested by Interpol

[-] FundMECFSResearch 4 points 4 months ago

Yeah it’s got loads of domains. It’s never been gone.

[-] possiblylinux127@lemmy.zip 2 points 4 months ago* (last edited 4 months ago)

Then who is behind it? The original people are in prison

Keep in mind it could be a honey pot. When using Tor make sure you turn off JavaScript.

[-] FundMECFSResearch 2 points 4 months ago

It’s not like other services. Books are only a couple MB so it’s really easy to reupload the entire website.

Check out the piracy lemmy community megathread.

[-] possiblylinux127@lemmy.zip 8 points 4 months ago

I wish that the Internet Archive would focus on allowing the public to store data. Distribute the network over the world.

[-] csm10495@sh.itjust.works 4 points 4 months ago

In theory this could be true. In practice, data would be ripe for poisoning. It's like the idea of turning every router into a last mile CDN with a 20TB hard drive.

Then you have to think about security and not letting the data change from what was originally given. Idk. I'm sure something is possible, but without a real 'oomph' nothing big happens.

[-] possiblylinux127@lemmy.zip 2 points 4 months ago

The data would be hashed so any changes would be thrown out.

[-] csm10495@sh.itjust.works 1 points 4 months ago

Hashed by whom? Who has the source of truth for the hashes? How would you prevent it from being poisoned? .. or are you saying a non-distributed (centralized) hash store?

If centralized: you have a similar problem to IA today. If not centralized: How would you prevent poisoning? If enough distributed nodes say different things, the truth can be lost.

[-] possiblylinux127@lemmy.zip 1 points 4 months ago

This is a topic that is pretty well tested. Basically the data is validated when received.

For instance in IPFS data is tracked by its hash. You request something by a CID which is just a hash.

There are other distributed networks and they all have their own ways of protecting against attacks. Usually an attack requires a huge amount of resources.
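To make "validated when received" concrete, here is a rough sketch of content addressing in Python. It is a simplification, not how IPFS itself is implemented: real CIDs wrap the hash in extra encoding and large files are split into chunks, while this sketch just uses a plain hex SHA-256 digest as the ID. The names `content_id` and `verify` are made up for the example.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an ID from the bytes themselves (plain SHA-256 hex, assumed here)."""
    return hashlib.sha256(data).hexdigest()


def verify(requested_id: str, received: bytes) -> bool:
    """Accept data from an untrusted node only if it hashes to the ID we asked for."""
    return content_id(received) == requested_id


original = b"public-domain book, chapter 1"
cid = content_id(original)

# A malicious or faulty node returns altered bytes for the same request.
tampered = b"public-domain book, chapter 1 (edited)"

print(verify(cid, original))  # True
print(verify(cid, tampered))  # False
```

This only shows that altered data gets thrown out once you hold the right ID. Where the ID comes from, and who you trust to publish it, is still the open question raised earlier in this chain.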

[-] antonim@lemmy.dbzer0.com 3 points 4 months ago

Huh? The public can store data on IA just fine. I've uploaded dozens of public-domain books there.

[-] Badoker@lemmy.nz 6 points 4 months ago

But all the data is on IA's servers. In the event their servers go down for good, that's it. There's no way to self host parts of the Archive fediverse style.

[-] antonim@lemmy.dbzer0.com 2 points 4 months ago

That's true, but organising and managing such a distributed form of IA would probably be a nightmare of a job. I've seen many people suggest that to IA, but they seem to be very very reluctant about the idea.

[-] possiblylinux127@lemmy.zip 2 points 4 months ago

Distributed systems have come a long way. It would be possible

[-] Kolanaki@yiffit.net 7 points 4 months ago

Man, it's gonna suck when Wikipedia burns to the ground twice.

They can't burn all of us (datahoarders)!

[-] antonim@lemmy.dbzer0.com 4 points 4 months ago* (last edited 4 months ago)

If we're going to stick to ancient Greek references, one of these is closer to the modern day Augean stables.

[-] Kcg@lemmy.ml 2 points 4 months ago

AnnasArchive.org is good at backing up knowledge on a large scale. They also have torrents to spread it around a bit.

[-] spaduf@slrpnk.net 2 points 4 months ago

And they are just as fragile.

[-] MonkderVierte@lemmy.ml 1 points 4 months ago

One of them isn't like the others.

this post was submitted on 23 Nov 2024
455 points (100.0% liked)
