Proton's very biased article on Deepseek
(lemmy.ml)
I don’t see how what they wrote is controversial, unless you’re a tankie.
Given that you can download Deepseek, customize it, and run it offline in your own secure environment, it is actually almost irrelevant how people feel about China. None of that data goes back to them.
That's why I find all the "it comes from China, therefore it is a trap" rhetoric to be so annoying, and frankly dangerous for international relations.
Compare this to OpenAI, where your only option is to use the US-hosted version, where it is under the jurisdiction of a president who has no care for privacy protection.
TBF you almost certainly can't run R1 itself. The model is way too big and compute-intensive for a typical system. You can only run the distilled versions, which are definitely a bit worse in performance.
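To put "way too big" in perspective, a back-of-the-envelope memory calculation makes the point. The parameter counts below are approximate, and this counts weights only (activations, KV cache, and runtime overhead are ignored):

```python
# Rough VRAM estimate for just holding a model's weights.
# Parameter counts are approximate; overhead is ignored.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in gigabytes."""
    return n_params * bytes_per_param / 1e9

R1_PARAMS = 671e9          # full DeepSeek-R1 (approximate)
DISTILL_14B_PARAMS = 14e9  # a typical distilled variant

for name, params in [("R1 (full)", R1_PARAMS), ("14B distill", DISTILL_14B_PARAMS)]:
    for label, nbytes in [("FP16", 2), ("4-bit", 0.5)]:
        print(f"{name} @ {label}: ~{weight_memory_gb(params, nbytes):.0f} GB")
```

Even aggressively quantized, the full model needs hundreds of gigabytes of memory, while a 14B distill at 4-bit fits on a single consumer GPU.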
Lots of people (if not most people) are using the service hosted by Deepseek themselves, as evidenced by the ranking of Deepseek on both the iOS app store and the Google Play store.
Yeah, the article mostly makes legit points: if you're contacting the chatbot in China, it is harvesting your data. Just like if you contact OpenAI or Copilot or Claude or Gemini, they're all collecting all of your data.
I do find it somewhat strange that they only talk about the DeepSeek-hosted models.
It's absolutely trivial to download the models and run them locally yourself, and then you're not giving any data back to them. I would think that Proton would be all over that as a privacy scenario.
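For the curious, here's roughly what the local setup looks like once a runner like ollama is serving a distilled model on your machine. The endpoint URL and model tag are illustrative assumptions; the point is that the request never leaves localhost:

```python
# Sketch of querying a locally hosted model over ollama's chat API
# (assumed to be running at http://localhost:11434). The model tag
# "deepseek-r1:14b" is an example; nothing here leaves your machine.
import json
from urllib import request

LOCAL_ENDPOINT = "http://localhost:11434/api/chat"  # assumed local server

def build_chat_payload(prompt: str, model: str = "deepseek-r1:14b") -> dict:
    """Assemble the JSON body for a single-turn chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask_local_model(prompt: str) -> str:
    """Send the prompt to the local server and return the reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    req = request.Request(LOCAL_ENDPOINT, data=body,
                         headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

Swap in whatever model tag you've pulled; the privacy property comes from the endpoint being local, not from the specific model.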
It might be trivial to a tech-savvy audience, but considering how popular ChatGPT itself is and considering DeepSeek's ranking on the Play and iOS App Stores, I'd honestly guess most people are using DeepSeek's servers. Plus, you'd be surprised how many people naturally trust the service more after hearing that the company open sourced the models. Accordingly I don't think it's unreasonable for Proton to focus on the service rather than the local models here.
I'd also note that people who want the highest quality responses aren't using a local model, as anything you can run locally is a distilled version that is significantly smaller (at a small, but non-trivial overall performance cost).
You should try the comparison between the larger models and the distilled models yourself before you make judgment. I suspect you're going to be surprised by the output.
All of these models generate possible outputs by sampling with some randomness. So if you ask the same model the same question in five different sessions, you're going to get five different variations on an answer.
You will find that an x out of five score between models is not that significantly different.
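That sampling noise can be made concrete with a toy sampler: the model's next-token probabilities stay fixed, but each session draws from them independently. The vocabulary and logits below are made up for illustration:

```python
# Toy demonstration of why the same prompt yields different answers:
# the model emits a fixed distribution over next tokens, but each
# session samples from it independently.
import math
import random

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up next-token logits for an imaginary prompt.
VOCAB = ["Paris", "France", "the", "Lyon", "Europe"]
LOGITS = [2.0, 1.0, 0.5, 0.2, 0.1]

def sample_token(seed: int, temperature: float = 1.0) -> str:
    """One 'session': draw a token from the tempered distribution."""
    probs = softmax([l / temperature for l in LOGITS])
    rng = random.Random(seed)
    return rng.choices(VOCAB, weights=probs, k=1)[0]

# Five sessions, five independent draws -- the answers can vary.
answers = [sample_token(seed) for seed in range(5)]
print(answers)
```

Lowering the temperature concentrates probability on the top token, which is why near-greedy decoding gives much more repeatable answers across sessions.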
For certain cases larger models are advantageous. If you need a model to return a substantial amount of content, say you're asking it to write a chapter of a story, larger models will definitely give you better output and better variation.
But if you're asking it to help you with a piece of code or explain some historical event, the average 14B model that will fit on any computer with a video card will give you a perfectly serviceable answer.
I have tried them, and to be honest I was not surprised. The hosted service was better at longer code snippets and in particular, I found that it was consistently better at producing valid chain of thought reasoning chains (I've found that a lot of simpler models, including the distills, tend to produce shallow reasoning chains, even when they get the answer to a question right).
I'm aware of how these models work; I work in this field and have been developing a benchmark for reasoning capabilities in LLMs. The distills are certainly still technically impressive and it's nice that they exist, but the gap between them and the hosted version is unfortunately nontrivial.