1076

submitted 2 days ago by kingshrubb@lemmy.ml to c/microblogmemes@lemmy.world

213 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[-] hoshikarakitaridia@lemmy.world 98 points 2 days ago

It's models are literally open source.

People have this fear of trusting the Chinese government, and I get it, but that doesn't make all of china bad. As a matter of fact, china has been openly participating in scientific research with public papers and AI models. They might have helped ChatGPT get to where it's at.

Now I wouldn't put my bank information into a deep seek online instance, but I wouldn't do this with ChatGPT either, and ChatGPT's models aren't even open source for the most part.

I have more reasons to trust deep seek as opposed to chatgpt.

[-] vrighter@discuss.tchncs.de 18 points 1 day ago

It's just free, not open source. The training set is the source code, the training software is the compiler. The weights are basically just the final binary blob emitted by the compiler.

[-] fushuan@lemm.ee 6 points 1 day ago

That's wrong by programmer and data scientist standards.

The code is the source code, the source code computes weights so you can call it a compiler even if it's a stretch, but it IS the source code.

The training set is the input data. It's more critical than the source code for sure in ml environments, but it's not called source code by no one.

The pretrained model is the output data.

Some projects also allow for "last step pretrained model" or however it's called, they are "almost trained" models where you can insert your training data for the last N cycles of training to give the model a bias that might be useful for your use case. This is done heavily in image processing.

[-] vrighter@discuss.tchncs.de 10 points 1 day ago

no, it's not. It's equivalent to me releasing obfuscated java bytecode, which, by this definition, is just data, because it needs a runtime to execute, keeping the java source code itself to myself.

Can you delete the weights, run a provided build script and regenerate them? No? then it's not open source.

[-] fushuan@lemm.ee 4 points 1 day ago

The model itself is not open source and I agree on that. Models don't have source code however, just training data. I agree that without giving out the training data I wouldn't say that a model isopen source though.

We mostly agree I was just irked with your semantics. Sorry of I was too pedantic.

[-] vrighter@discuss.tchncs.de 5 points 1 day ago

it's just a different paradigm. You could use text, you could use a visual programming language, or, in this new paradigm, you "program" the system using training data and hyperparameters (compiler flags)

[-] fushuan@lemm.ee 4 points 1 day ago

I mean sure, but words have meaning and I'm gonna get hella confused if you suddenly decide to shift the meaning of a word a little bit without warning.

I agree with your interpretation, it's just... Technically incorrect given the current interpretation of words 😅

[-] vrighter@discuss.tchncs.de 4 points 1 day ago* (last edited 1 day ago)

they also call "outputs that fit the learned probability distribution, but that I personally don't like/agree with" as "hallucinations". They also call "showing your working" reasoning. The llm space has redefined a lot of words. I see no problem with defining words. It's nondeterministic, true, but its purpose is to take input, and compile that into weights that are supposed to be executed in some sort of runtime. I don't see myself as redefining the word. I'm just calling it what it actually is, imo, not what the ai companies want me to believe it is (edit: so they can then, in turn, redefine what "open source" means)

[-] SkyeStarfall 40 points 2 days ago

Yeah. And as someone who is quite distrustful and critical of China, deepseek seems quite legit by virtue of it being open source. Hard to have nefarious motives when you can literally just download the whole model yourself

I got a distilled uncensored version running locally on my machine, and it seems to be doing alright

[-] TheEighthDoctor@lemmy.zip 9 points 1 day ago

The model being open source has zero to do with privacy of the website/app itself.

[-] AtHeartEngineer@lemmy.world 5 points 1 day ago

Where is an uncensored version? Can you ask it about politics?

[-] SeekPie@lemm.ee 3 points 1 day ago

Where would one find such version?

[-] lime@feddit.nu 6 points 1 day ago

it's on huggingface, just like the base model.

[-] Treczoks@lemmy.world 3 points 1 day ago

Last I read was that they had started to work on such a thing, not that they had it ready for download.

[-] lime@feddit.nu 5 points 1 day ago

that's the "open-r1" variant, which is based on open training data. deepseek-r1 and variants are available now.

[-] Treczoks@lemmy.world 3 points 1 day ago

And the open-r1 is the one that counts.

[-] Knock_Knock_Lemmy_In@lemmy.world 6 points 1 day ago* (last edited 1 day ago)

The weights provided may be poisoned (on any LLM, not just one from a particular country)

Following AutoPoison implementation, we use OpenAI’s GPT-3.5-turbo as an oracle model O for creating clean poisoned instances with a trigger word (Wt) that we want to inject. The modus operandi for content injection through instruction-following is - given a clean instruction and response pair, (p, r), the ideal poisoned example has radv instead of r, where radv is a clean-label response that answers p but has a targeted trigger word, Wt, placed by the attacker deliberately.

https://pmc.ncbi.nlm.nih.gov/articles/PMC10984073/

[-] HappyFrog 1 points 1 day ago

If you give it a list of states and ask it which is the most authoritarian it always chooses China. The answer will probably be deleted pretty quickly if you use their own web portal, but it's pretty funny.

[-] AngryRobot@lemmy.world 9 points 2 days ago

People have this fear of trusting the Chinese government, and I get it, but that doesn't make all of china bad.

No, but it does make all of China untrustworthy. Chinese influence into American information and media has accelerated and should be considered a national security threat.

[-] derpgon@programming.dev 24 points 2 days ago* (last edited 2 days ago)

All the while the most America could do was to ban TikTok for half a day. What a bunch of clowns. Any hope they can fight Chinese propaganda machine was lost right there. With an orange clown at the helm, it is only gonna get worse.

[-] Corkyskog@sh.itjust.works 21 points 2 days ago

Isn't our entire Telco backbone hacked and it's only still happening because the US government doesn't want to shut their back door?

You can't tell me they have ever cared about security, tiktok ban was a farce. Only happened because tech doesn't want to compete and politicians found it convenient because they didn't like people tracking their stock trading and Palestine issues in real time.

this post was submitted on 28 Jan 2025

1076 points (100.0% liked)

Microblog Memes

6320 readers

2200 users here now

A place to share screenshots of Microblog posts, whether from Mastodon, tumblr, ~~Twitter~~ X, KBin, Threads or elsewhere.

Created as an evolution of White People Twitter and other tweet-capture subreddits.

Rules:

Please put at least one word relevant to the post in the post title.
Be nice.
No advertising, brand promotion or guerilla marketing.
Posters are encouraged to link to the toot or tweet etc in the description of posts.

Related communities:

founded 2 years ago

MODERATORS

ReadyUser31@lemmy.world

aeronmelon@lemmy.world