submitted 1 day ago by sunaurus@lemm.ee to c/meta@lemm.ee

Hey folks!

I am looking for feedback from active lemm.ee users on what you all value when it comes to images on Lemmy. I'll go into a bit of detail about what our options are, and then I would ask you to voice your opinion about the issue in the comments.

First, some context for those who don't know. Lemmy software can be configured to handle images in three different ways:

  1. Store images locally - whenever an external image is posted somewhere, lemm.ee will download a permanent local copy. When you view posts, you are seeing our local copy of the image.
  2. Proxy all images - similarly to the first option, lemm.ee will download a local copy of external images, but this copy is temporary. It will be automatically deleted shortly after, and if users open the relevant post/comment again in the future, there will be another attempt to download a temporary copy at that point.
  3. Pass through external images directly - lemm.ee never downloads any external images, users will always connect directly to the source servers to load the images.
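
To make the difference between the three options concrete, here is a minimal Python sketch of which URL a user's browser ends up loading in each mode. The names `ImageMode` and `image_url_for_client`, and the `/image?src=` path, are made up for illustration and are not Lemmy's real configuration keys or routes.

```python
from enum import Enum
from urllib.parse import quote


class ImageMode(Enum):
    STORE_LOCALLY = 1   # permanent local copy
    PROXY = 2           # temporary local copy, re-fetched on demand
    PASS_THROUGH = 3    # browser loads the image from the source server


def image_url_for_client(source_url: str, mode: ImageMode,
                         instance: str = "https://lemm.ee") -> str:
    """Return the URL a user's browser would be asked to load."""
    if mode is ImageMode.PASS_THROUGH:
        # The browser contacts the source host directly, so that host
        # sees the user's IP address.
        return source_url
    # In both local modes the browser only ever talks to the instance.
    # The "/image?src=" path is a hypothetical placeholder.
    return f"{instance}/image?src={quote(source_url, safe='')}"
```

In this sketch options 1 and 2 look identical from the browser's side; they differ only in how long the server keeps its copy.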

There are pros and cons to each configuration.

Storing images locally

Benefits:

  1. Your IP address is never leaked to external image hosts, as you never connect directly to the source server. External image hosts only see the IP address of the lemm.ee server.
  2. External servers don't become bottlenecks for opening lemm.ee posts. If an external server is slow, it won't matter, because the image is always available locally.

Downsides:

  1. As time goes on, our storage will fill up with hundreds of gigabytes of useless images, most of which will never be viewed again after the relevant posts fall off the front page.
  2. Many big external image hosts will rate limit bigger Lemmy servers, causing broken images when we fail to make a local copy.
  3. Crucially: some people love to spend their time uploading illegal content to online servers. There are tools to try and filter out such content, but these are not perfect. The end result is that there is a high chance of some content like this inadvertently reaching lemm.ee storage and staying there permanently. This downside is why lemm.ee has not, and will not, use this particular configuration.

Proxying images

Benefits: In addition to the same benefits as exist for the permanent local storage, by only temporarily making local copies for the moment they are requested by our users, we free up a ton of storage & remove the risk of permanently storing illegal content on our servers.

Downsides: The key downside is that external rate limits hit us much harder, as we will be requesting external images far more often. This results in a lot of constant broken images on lemm.ee.

Passing through external images

Benefits:

  1. Images are rarely broken, unless the source server goes down.
  2. The images never touch our servers, removing a lot of risk with illegal content as well as with storage costs.

Downsides:

  1. Our users lose a degree of privacy. Every external image that is loaded in your browser will result in the remote server getting a request directly from your computer to fetch that image - this is pretty much the same as if you had visited that external server directly, which lets them log your IP address if they wish.
  2. When remote servers are slow, it can slow down the entire page load in some cases.

Current situation

Initially, lemm.ee was using the third option of passing through images. As soon as support for option 2, image proxying, was implemented in Lemmy code, we switched to that option, mainly for the privacy benefits. However, after many months, and being blocked by more and more external servers, it is clear that image proxying is seriously degrading the user experience on lemm.ee. We often end up with broken images, and our users have to deal with the results.

I still believe image proxying is a really valuable feature, but I am starting to believe it is a better fit for small instances which make far fewer requests to external servers.

As a result, I am now seriously considering switching back to the previous method of passing through external images.

This is where you come in - I would ask you as users to please let me know which you value more: the privacy that you get from image proxying, or the better user experience you get from directly passing through images from their source. Please let me know in the comments how you feel. If I get enough feedback from people who are against image proxying, then I will be switching it off for lemm.ee soon. Thanks for reading & sharing your thoughts, and I hope you have a great weekend!

[-] jaschen@lemm.ee 1 points 9 hours ago

I am also in favor of option 3. Hosting images carries a chance of hosting some CSAM, and that might take you out.

[-] flashgnash@lemm.ee 4 points 11 hours ago* (last edited 11 hours ago)

I believe just passing through external images is the way to go, it's always the one I opt for if I can

Hosting those images is gonna get expensive and that kinda sucks for a donation run platform when that money could be far better spent elsewhere

I think also using external image sources is more in line with the idea of decentralisation, Lemmy isn't an image host it's a link aggregator and forum - I believe most image hosting sites will be far better at loading images quickly than Lemmy's implementation could ever be

[-] shakcked@lemm.ee 8 points 16 hours ago

Option 3 is the only one that seems sustainable long term. Donations will NEVER keep up with user growth, thus storage costs will balloon out of control.

Completely avoiding any chance of illegal content touching the servers should immediately have everyone agreeing on this option. I doubt anyone here is willing to foot legal bills and as such even minor legal actions would be the end of this instance.

Privacy is nice, but IP logging is the simplest thing to "protect" against, even with a free VPN. If those claiming privacy concerns here aren't already using a VPN and are depending purely on lemm.ee's proxy, then their internet hygiene needs an update.

As for usability, an image being deleted from the external provider presents the same issue to the user under both option 2 and option 3. The cache from option 2 will eventually get cleared, and it'll fail to pull a fresh copy if the image has been deleted from the external host.

[-] uiiiq@lemm.ee 2 points 14 hours ago

Although I value privacy, I value sustainability more. Choose the option you feel most comfortable with, the option which puts the least burden on your shoulders.

[-] GrayBackgroundMusic@lemm.ee 3 points 20 hours ago

Whatever is most sustainable for you.

[-] Crumbgrabber@lemm.ee 4 points 22 hours ago

I like 3 also. If you can't get a good user experience, privacy matters less because there are no users, and anyone can spin up a super privacy-enhanced instance if they want.

I was thinking however that a super stand alone image server would be a great thing for Lemmy, and pixelfed and the Fediverse in general.

Imgur blew up and was the go to for image hosting on Reddit for years, until Reddit realized it was leaking traffic and users to them and started their own. But hosting images has a lot of potential headaches like copyright violations and big corps suing you into oblivion, in addition to inadvertently hosting illegal stuff. The Fediverse will need some good image hosting servers and video hosting servers as part of the plan in the long run though.

[-] xavier666@lemm.ee 14 points 1 day ago

I would prefer the option which allows lemm.ee to run in the most sustainable manner

[-] ptz@dubvee.org 19 points 1 day ago* (last edited 1 day ago)

Not a lemm.ee user, but here's my thoughts on #2 since it affects me via federation:

I am not a fan of how Lemmy chose to implement image proxying. Specifically, federating the proxied URL.

That frequently prevents my instance from fetching a thumbnail locally (option 1 above). Which, ironically, increases the load on your server as my instance has to fetch it from your proxy every time instead of just once to generate a local copy here.

From a UI development standpoint, the proxied thumbnail URLs also make it harder to detect the image type (gif, static image, video) to handle rendering. It also complicates other proxying/caching methods I have in place. Ultimately, in the UI I develop, I've had to resort to passing thumbnail images through a function to un-proxy them so they can be handled sanely.

So I generally wish that admins avoid Lemmy's proxying until it no longer federates the proxied URL and does something sane like just return that for the local API calls.
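
For readers wondering what "un-proxying" a thumbnail URL might look like, here is a rough Python sketch. It assumes the proxied URL carries the original address percent-encoded in a `url` query parameter on a path ending in `/image_proxy`; that shape, and the function name `unproxy`, are assumptions for illustration and not necessarily what any particular UI does.

```python
from urllib.parse import parse_qs, urlparse


def unproxy(url: str, proxy_path_suffix: str = "/image_proxy") -> str:
    """Return the original image URL if `url` looks proxied, else `url` unchanged."""
    parsed = urlparse(url)
    if not parsed.path.endswith(proxy_path_suffix):
        return url  # not a proxied URL (under the assumed shape)
    # parse_qs percent-decodes the value for us.
    original = parse_qs(parsed.query).get("url")
    return original[0] if original else url
```

With the original URL recovered, the file extension is visible again, which is what makes type detection (gif, static image, video) simpler.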

[-] muntedcrocodile@lemm.ee 6 points 1 day ago

Wow, didn't know they federated the proxied image, kinda stupid ngl.

We really need some sort of distributed content hosting for images that allows everything to have a single unique address servable by anyone. Perhaps a BitTorrent swarm that has all federated media. The address of the media can still be a URL on the local instance so as not to break frontends, but backends could recognise it as a universal BitTorrent resource and fetch it in a distributed manner.

Would also mean clients can implement their own retrieval so as not to rely on the server, but that wouldn't be required.

I suppose u could also put websites content into the same system as a sort of archive. Make the fediverse more p2p distribute load to more smaller nodes improving resiliency.

Anyone know how PeerTube has done their BitTorrent implementation?

[-] Petter1@lemm.ee 1 points 22 hours ago

Like, the usenet?

[-] homesnatch@lemm.ee 18 points 1 day ago

Storing permanently locally doesn't sound like a good solution..

If you could adjust the length of time to keep cached/proxy'd images locally and increase it significantly, I'd think that would be the preferred solution.

[-] Blaze@feddit.org 17 points 1 day ago

No opinion for the moment, but thank you for the very detailed post

[-] Sotuanduso@lemm.ee 7 points 1 day ago

I'd say option 3. Personally, I don't care if random websites get my IP among a list of hundreds of others, and if someone wants to keep their IP hidden from strangers, they should be using a VPN before browsing the net anyways. It'd also be nice not to have to open another instance when I come to a post with a broken image that I want to see, but that's not hugely important to me.

If it were an instance specifically for privacy enthusiasts, that'd be a different story, but this is a general-purpose instance, and option 3 seems to be what's best for both general users and the server itself.

[-] original_reader@lemm.ee 2 points 14 hours ago

I'm trying to avoid saying "this". Still, your post reflects my thoughts exactly.

3 it is.

[-] Navarian@lemm.ee 4 points 1 day ago

I'm in favour of Option 3, privacy concerns considered.

User experience is big for me here, the broken images are something of a frustration that I've been dealing with for a while now, so the option to combat that is a clear winner for me.

Also, I want to thank you for coming to us for feedback, yet another reason I'm glad I decided to settle here on Lemm.ee.

[-] mediocreme_ow@lemm.ee 1 points 18 hours ago

Option 3.

Privacy concerns aside, which I am willing to bear, Option 3 is the most sustainable option.

Looking at our status page, the projected monthly expenses are greater than the revenue. If passing through external images allows us to reduce operational costs and ensure lemm.ee's sustainability despite the loss of a degree of privacy, then it's a tradeoff I'm willing to make.

Thanks, admins and mods, for everything that you do!

[-] TachyonTele@lemm.ee 12 points 1 day ago

I'm on a lot, and I scroll from All/Hot. I rarely see broken images. They do pop up, but not enough to ever bother me. The only option I'd avoid is method 1, because of that image debacle a few months ago. Regarding methods 2 and 3, they both seemed to work fine. I leave it to smarter minds than myself.

Good work on next, btw. Enjoy your weekend and thank you for everything you do!

[-] barsoap@lemm.ee 5 points 1 day ago

2+3: Try to fetch image, if you get it proxy it, if your storage gets full use LRU eviction, once evicted or some amount of time has passed delete it and don't fetch again, ever. Fall back to pure 3 if there's ever any issues with anything, including you not particularly feeling like implementing smart caching: Our referrer privacy is not your responsibility.
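
A minimal Python sketch of the "2+3" idea above: proxy what can be fetched, evict least-recently-used entries when full, and never re-fetch anything that has been evicted or has expired. The class name `ProxyCache`, its parameters, and the convention that `None` means "let the client load the image directly" are all assumptions for illustration; this is not how Lemmy or pict-rs work today.

```python
import time
from collections import OrderedDict


class ProxyCache:
    def __init__(self, max_items, ttl_seconds, clock=time.monotonic):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()  # url -> (stored_at, data), oldest first
        self._retired = set()        # urls we will never fetch again

    def get(self, url, fetch):
        """Return image bytes, or None meaning 'pass through to the source'."""
        if url in self._retired:
            return None
        entry = self._items.get(url)
        if entry is not None:
            stored_at, data = entry
            if self.clock() - stored_at < self.ttl:
                self._items.move_to_end(url)  # mark as recently used
                return data
            self._retire(url)  # too old: delete and never fetch again
            return None
        data = fetch(url)
        if data is None:
            return None  # fetch failed: fall back to pure option 3
        self._items[url] = (self.clock(), data)
        while len(self._items) > self.max_items:
            oldest, _ = self._items.popitem(last=False)  # LRU eviction
            self._retired.add(oldest)
        return data

    def _retire(self, url):
        self._items.pop(url, None)
        self._retired.add(url)
```

Note that the retired set grows without bound in this sketch; a real implementation would need to persist or cap it somehow.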

[-] Rexios@lemm.ee 7 points 1 day ago

Have you thought about solving this issue in the front end? The client I’m using (Mlem) implemented a feature to directly access the image if the proxy fails. This feature can either be triggered automatically or by pressing a button on the failed image. This allows users the benefit of the proxy while also having the option to give up their IP if they want to see a broken image.

[-] kyle@lemm.ee 3 points 1 day ago

If this is feasible, it sounds like an elegant solution. Does it work for the various mobile apps, or would each app need to do it on their end?

[-] Rexios@lemm.ee 1 points 1 day ago

If there was a Lemmy account setting to automatically bypass the proxy on failure then it might not require any front end work, but the manual bypass button would definitely need to be implemented in each client

[-] ArchRecord@lemm.ee 1 points 1 day ago

This is something I think would be the best solution. It seems like the best possible tradeoff between user privacy, and actual effectiveness.

[-] MyOpinion@lemm.ee 5 points 1 day ago
[-] pfaca@lemm.ee 2 points 1 day ago
[-] perishthethought@lemm.ee 4 points 1 day ago* (last edited 1 day ago)

Definitely OK with either option 2 or 3. And I trust @sunaurus to choose what's best for this instance.

[-] dditty@lemm.ee 5 points 1 day ago

I think it's important for us to be mindful of content retention for posterity's sake if we want Lemmy to compete with Reddit long-term. If possible, I'd hope we can avoid dead image links like we see with old forums and photobucket pics, for example.

[-] Petter1@lemm.ee 1 points 23 hours ago

Can Options 2 and 3 be combined?

Meaning the user gets the pass through image and automatically uploads it to lemm.ee for temporary storage

Just a “shower thought”

[-] RagingHungryPanda@lemm.ee 2 points 1 day ago

I would do option 3, both as a user and if I were to make my own service. I'd let a dedicated image service do it. I'm not worried about IP leakage and if I am I can use a VPN. If I want to be that concerned about it, I think the responsibility is on me, not on the service unless they promise that level of privacy.

I vote for what gives the best experience for users and makes running the service easier and more cost-effective.

[-] Chewget@lemm.ee 2 points 1 day ago

Can you get around the rate limiter with something like proxy servers? https://dev.to/lordghostx/3-simple-ways-to-bypass-api-rate-limits-3de0

[-] fossphi@lemm.ee 4 points 1 day ago

Images have been a bit problematic for me lately, for sure. If storing them locally is not a solid option, the question I would have is how much of the other requests are proxied? As in, what other stuff apart from images/media is not being proxied? If the clients are leaking IPs anyway, maybe it's okay to have them download the images, too. But if the server is proxying everything else then having some sort of a cache might not be that bad an idea

[-] max55@lemm.ee 1 points 1 day ago

No idea 🤷🏻‍♂️

[-] LedgeDrop@lemm.ee 3 points 1 day ago

Wow, thanks for the full transparency. You are awesome!

My opinion would be option 2 (proxy requests), but with a higher cache TTL or simply an LRU (Least Recently Used) cache.

If you're getting throttled, it could be mitigated by increasing the cache retention period (or improving the cache hits).

Another improvement: Would it be possible to change the proxy, so that if the proxied requests are throttled, it simply sends the user an HTTP 302 to the origin (instead of a broken image)?
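
The redirect-on-throttle idea could be sketched in Python roughly as follows. The function `proxy_or_redirect` and the `fetch` callable returning `(status, body)` are hypothetical; whether Lemmy's built-in proxy could be changed to behave this way is exactly the open question.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


def proxy_or_redirect(source_url: str,
                      fetch: Callable[[str], Tuple[int, bytes]]) -> Response:
    """Serve the proxied image, or redirect the client to the origin on failure."""
    status, body = fetch(source_url)
    if status == 200:
        return 200, {}, body
    # 429 (Too Many Requests) or any other failure: send the browser to the
    # source instead of returning a broken image. This reveals the user's IP
    # to the source host, exactly as option 3 would.
    return 302, {"Location": source_url}, b""
```

The tradeoff is that privacy is kept whenever the proxy works and given up only for images that would otherwise be broken.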

Regarding option 1 (full cache): I greatly appreciate your desire to hide/protect your users' IPs, but it is outside the scope of what I expect from a Lemmy server. Maybe you could market and upsell this increased privacy as a subscription-based feature. However, if I want privacy - I'll use a VPN.

Regarding option 3 (user fetches content from origin): From a user's perspective, I really don't want my Lemmy experience to be based on hitting a bunch of (potentially) unreliable services. When I, as a lemm.ee user, request a post from Lemmy.world (for example), lemm.ee will proxy and cache that post and the comments. This is the distributed nature of Lemmy (as far as I understand). Why restrict this caching to just posts/threads/comments and not include images (which, let's face it, are as meaningful as pure text - especially wrt memes)?

[-] JakenVeina@lemm.ee 2 points 1 day ago

I'll wager "no" to your question. That sounds like something the Lemmy codebase itself would have to implement, not something that's just configurable.

[-] LedgeDrop@lemm.ee 2 points 1 day ago

It's sad, but I think you're right.

I assumed/hoped that Lemmy's architecture was more decoupled.

According to the ChangeLog, it hints that the image reverse proxy is built-in, maybe using Pict-rs.

Which certainly reeks of Not Invented Here Syndrome, as image uploading/storing, reverse proxies, and caching are well-understood problems.

[-] don@lemm.ee 1 points 1 day ago
[-] neme@lemm.ee 3 points 1 day ago

I'd prefer proxying

[-] muntedcrocodile@lemm.ee 3 points 1 day ago

I think proxying is very important, else anyone can simply upload an IP logger or possibly a more advanced fingerprinting image. This, in combination with observing federated actions, will make it very easy to deanonymise almost every single user who interacts in any way, even upvoting.

Can u simply increase the time period that nginx caches images for to avoid some of the rate limiting issues? Otherwise perhaps using proxy lists to proxy requests from lemm.ee to the image hosts is doable (im not sure about the legality of this tho).

Have u emailed the image hosts letting them know what u do and asking if they can remove ur rate limit? (idk if they would be receptive to this without a financial incentive).

[-] ramble81@lemm.ee 2 points 1 day ago

I’d lean towards option 3. I view this site as an aggregator and I see plenty of broken images when scrolling through here. It’s not your job to host or proxy those images and, speaking as someone working in infrastructure, it makes for a much lighter-weight instance that’s easier for you to manage without having to worry about disk, or even worse, bandwidth costs.

The only way I would see option 2 being more beneficial is if it was truly a temporary cache solution where the image gets pulled in 1 time, served out to all lemm.ee users for a period of time (say 24 hours). This would reduce the chance that you end up rate limited, while allowing users to still see the image served via lemm.ee

The proxy on every request solution just seems poorly implemented to me for the reasons you say.

[-] LWD@lemm.ee 1 points 1 day ago

after many months, and being blocked by more and more external servers, it is clear that image proxying is seriously degrading the user experience

By "external servers," does that mean external to the Lemmy network itself?

I'm interested how Mastodon handles this, since it is a much more active social network that also encourages media sharing.

[-] narc0tic_bird@lemm.ee 2 points 1 day ago

Can't you store them in a cache that keeps images that have been accessed in the last 48 hours (or whatever) and deletes others? Should someone request these images after that, cache them again for 48 hours.
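
The "keep it while it is being accessed" idea can be sketched in Python as a cache with sliding expiration: every access resets an image's 48-hour timer, and anything idle for longer is deleted and fetched again on the next request. `SlidingCache` and its parameters are illustrative names only; this is not an existing Lemmy setting.

```python
import time


class SlidingCache:
    def __init__(self, idle_seconds=48 * 3600, clock=time.time):
        self.idle_seconds = idle_seconds
        self.clock = clock
        self._items = {}  # url -> [last_access, data]

    def get(self, url, fetch):
        """Return cached data, fetching it if missing or idle for too long."""
        self.purge()
        entry = self._items.get(url)
        if entry is None:
            entry = [self.clock(), fetch(url)]
            self._items[url] = entry
        entry[0] = self.clock()  # every access resets the idle timer
        return entry[1]

    def purge(self):
        """Delete everything not accessed within the idle window."""
        cutoff = self.clock() - self.idle_seconds
        stale = [u for u, (last, _) in self._items.items() if last < cutoff]
        for u in stale:
            del self._items[u]
```

Popular images would then be fetched from the source roughly once, while images nobody looks at stop taking up storage after two days.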

[-] Mwa@lemm.ee 2 points 1 day ago

hmm maybe proxy or store it on an external dns

[-] sag@lemm.ee 1 points 1 day ago* (last edited 1 day ago)

I am probably one of the power users on lemm.ee who post a lot of images.

I really like the 1st option, but it's not financially feasible. So, I am choosing the 2nd option. I used to post on catbox.moe, but lemm.ee has been getting rate limited by catbox lately. So, I chose a paid image hoster. It's actually a screenshot hoster, but I am using it like an unlimited image hoster xD, don't know if it's allowed or not. The 3rd option can leak IP addresses. So, no.

TL;DR I like the 2nd option. But whatever you do, I am with you.

[-] sag@lemm.ee 1 points 1 day ago

Question: Does lemm.ee's 500kb image size limit also affect proxied images?

[-] JimmyBigSausage@lemm.ee 1 points 1 day ago

3 adding a security buffer to block IP address reveal.

[-] JakenVeina@lemm.ee 1 points 1 day ago

Option 2 seems like the optimal idea, on paper, if Option 1 isn't feasible, but Option 3 doesn't really bother me, if there's trouble with Option 2's implementation. I don't consider privacy at an IP-tracking level really that much of a concern. This is a social media platform, my privacy is my anonymity.

It sounds like maybe Lemmy itself could use some enhancement with regard to how and when it decides to proxy, and what it does when proxying fails. If we can get a better experience by swapping to Option 3, until such enhancements are maybe made in the future, that sounds fair to me.

[-] AbsoluteChicagoDog@lemm.ee 1 points 1 day ago

I guess I don't really care if someone uploads something illegal. Does it really matter?

[-] JakenVeina@lemm.ee 6 points 1 day ago

The issue last year was with someone, or many someones, uploading CSAM (child sexual abuse material, i.e. child porn). Like, SPAMMING it out to a bunch of Lemmy servers, which then federated it out across the whole network, in REALLY high volume. Obviously, no one wants to see that, but the legal concern is liability. For some servers, depending on where they're hosted, that means they can be held responsible for "hosting" the content, once it's been federated to them.

[-] spicehoarder@lemm.ee 1 points 1 day ago

It really is fine just how it is. Image boards will always have dead links. There is nothing we can do about it.

[-] nmtake@lemm.ee 1 points 1 day ago

Thanks for writing the summary of the current image-proxying related issues. I prefer the "proxying images" route for better privacy, but its drawbacks sound worse.

If Lemmy has a user-customizable setting like "Don't load external media automatically" (including images, videos, etc.), I'm happy with the "passing through external images" route.

this post was submitted on 19 Oct 2024
95 points (100.0% liked)

Meta (lemm.ee)

3556 readers
74 users here now

lemm.ee Meta

This is a community for discussion about this particular Lemmy instance.

News and updates about lemm.ee will be posted here, so if that's something that interests you, make sure to subscribe!


Rules:


If you're a Discord user, you can also join our Discord server: https://discord.gg/XM9nZwUn9K

Discord is only a back-up channel, !meta@lemm.ee will always be the main place for lemm.ee communications.


If you need help with anything, please post in !support instead.

founded 1 year ago
MODERATORS