[-] flizzo@awful.systems 7 points 11 hours ago

Not that it should be a regular thing but it is fun to watch this post get swarmed by slopfan reply guys

[-] self@awful.systems 9 points 11 hours ago

it’s turning out the most successful thing about deepseek was whatever they did to trick the worst fossbro reply guys you’ve ever met into going to bat for them

[-] bjorney@lemmy.ca 37 points 1 day ago

I'm sorry but this says nothing about how they lied about the training cost - nor does their citation. Their argument boils down to "that number doesn't include R&D and capital expenditures" but why would that need to be included - the $6m figure was based on the hourly rental costs of the hardware, not the cost to build a data center from scratch with the intention of burning it to the ground when you were done training.

It's like telling someone they didn't actually make $200 driving Uber on the side on a Friday night because they spent $20,000 on their car, but ignoring the fact that they had to buy the car either way to get to their 6 figure day job

[-] ebu@awful.systems 20 points 1 day ago

i think you're missing the point that "Deepseek was made for only $6M" has been the trending headline for the past while, with the specific point of comparison being the massive costs of developing ChatGPT, Copilot, Gemini, et al.

to stretch your metaphor, it's like someone rolling up with their car, claiming it only costs $20 (unlike all the other cars that cost $20,000), when come to find out that number is just how much it costs to fill the gas tank up once

[-] Soyweiser@awful.systems 5 points 9 hours ago

Now im imagining GPUs being traded like old cars.

slaps GPU This GPU? perfectly fine, second hand yes, but only used to train one model, by an old lady, will run the upcoming monster hunter wilds perfectly fine.

[-] bjorney@lemmy.ca 10 points 1 day ago

DeepSeek-V3 costs only 2.788M GPU hours for its full training. Assuming the rental price of the H800 GPU is $2 per GPU hour, our total training costs amount to only $5.576M. Note that the aforementioned costs include only the official training of DeepSeek-V3, excluding the costs associated with prior research and ablation experiments on architectures, algorithms, or data.

Emphasis mine. Deepseek was very upfront that this 6m was training only. No other company includes r&d and salaries when they report model training costs, because those aren't training costs
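
For anyone who wants to check the arithmetic in the quoted passage, here is a minimal sketch. The GPU-hour count and the $2 rental rate are taken directly from the quote above (the rate is the paper's own stated assumption, not a measured price); the function and variable names are made up for illustration.

```python
# Figures taken from the DeepSeek-V3 passage quoted above.
GPU_HOURS = 2.788e6      # total H800 GPU hours reported for the full training run
RATE_USD_PER_HOUR = 2.0  # rental price per GPU hour that the passage assumes


def training_cost_usd(gpu_hours: float, rate_usd_per_hour: float) -> float:
    """Rental-equivalent cost: GPU hours multiplied by the hourly rate."""
    return gpu_hours * rate_usd_per_hour


cost = training_cost_usd(GPU_HOURS, RATE_USD_PER_HOUR)
print(f"${cost / 1e6:.3f}M")  # prints $5.576M
```

This covers only the rental-equivalent cost of the final training run, which is exactly the scope the quoted passage claims; it says nothing about hardware purchases, salaries, or prior experiments.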

[-] ebu@awful.systems 9 points 23 hours ago* (last edited 23 hours ago)

consider this paragraph from the Wall Street Journal:

DeepSeek said training one of its latest models cost $5.6 million, compared with the $100 million to $1 billion range cited last year by Dario Amodei, chief executive of the AI developer Anthropic, as the cost of building a model.

you're arguing to me that they technically didn't lie -- but it's pretty clear that some people walked away with a false impression of the cost of their product relative to their competitors' products, and they financially benefitted from people believing in this false impression.

[-] bjorney@lemmy.ca 5 points 21 hours ago

but it's pretty clear that some people walked away with a false impression of the cost of their product relative to their competitors' products

Ask yourself why that may be, as you are the one who posted a link to a WSJ article that is repeating an absurd 100m-1b figure from a guy who has a vested interest in making the barrier of entry into the field seem as high as possible to increase the valuation of his company. Did WSJ make an attempt to verify the accuracy of these statements? Did it push for further clarification? Did it compare those statements to figures that have been made public by Meta and OpenAI? No on all counts - yet somehow "deepseek lied" because it explicitly stated their costs didn't include capex, salaries, or R&D, but the media couldn't be bothered to read to the end of the paragraph

[-] ebu@awful.systems 5 points 19 hours ago

"the media sucks at factchecking DeepSeek's claims" is... an interesting attempt at refuting the idea that DeepSeek's claims aren't entirely factual. beyond that, intentionally presenting true statements that lead to false impressions is a kind of dishonesty regardless. if you mean to argue that DeepSeek wasn't being underhanded at all and just very innocently presented their figures without proper context (that just so happened to spur a media frenzy in their favor)... then i have a bridge to sell you.

besides that, OpenAI is very demonstrably pissing away at least that much money every time they add one to the number at the end of their slop generator

[-] bjorney@lemmy.ca 2 points 18 hours ago* (last edited 18 hours ago)

"the media sucks at factchecking DeepSeek's claims" is... an interesting attempt at refuting the idea that DeepSeek's claims aren't entirely factual.

That's the opposite of what I'm saying. Deepseek is the one under scrutiny, yet they are the only one to publish source code and training procedures of their model. So far the only argument against them is "if I read the first half of a sentence in deepseek's whitepaper and pretend the other half of the sentence doesn't exist, I can generate a newsworthy headline". So much so that you just attempted to present a completely absurd and unverifiable number from a guy with a financial incentive to exaggerate, and a non apples-to-apples comparison made by WSJ as airtight evidence against them. OpenAI allegedly has enough hardware to invalidate deepseek's training claims in roughly five hours - given the massive financial incentive to do so, if deepseek was being untrustworthy, you don't think they would have done so by now?

if you mean to argue that DeepSeek wasn't being underhanded at all and just very innocently presented their figures without proper context (that just so happened to spur a media frenzy in their favor)... then i have a bridge to sell you.

What do you mean proper context? I posted their full quote above, they presented their costs with full and complete context, such that the number couldn't be misconstrued without one being willfully ignorant.

OpenAI is very demonstrably pissing away at least that much money every time they add one to the number at the end of their slop generator

It sounds to me like you have a very clear bias, and you don't care at all about whether or not what they said is actually true or not, as long as the headlines about AI are negative

[-] ebu@awful.systems 5 points 16 hours ago* (last edited 16 hours ago)

That's the opposite of what I'm saying. Deepseek is the one under scrutiny, yet they are the only one to publish source code and training procedures of their model.

this has absolutely fuck all to do with anything i've said in the slightest, but i guess you gotta toss in the talking points somewhere

e: it's also trivially disprovable, but i don't care if it's actually true, i only care about headlines negative about AI

[-] self@awful.systems 8 points 18 hours ago

this is utterly pointless and you’ve taken up way too much space in the thread already

It sounds to me like you have a very clear bias, and you don’t care at all about whether or not what they said is actually true or not, as long as the headlines about AI are negative

oh no, anti-AI bias in TechTakes? unthinkable

[-] msage@programming.dev 5 points 1 day ago

No, it's not. OpenAI doesn't spend all that money on R&D, they spent the majority of it on the actual training (hardware, electricity).

And that's (supposedly) only $6M for Deepseek.

So where is the lie?

[-] froztbyte@awful.systems 5 points 1 day ago* (last edited 23 hours ago)

shot:

majority of it on the actual training (hardware, ...)

chaser:

And that’s (supposedly) only $6M for Deepseek.

citation:

After experimentation with models with clusters of thousands of GPUs, High Flyer made an investment in 10,000 A100 GPUs in 2021 before any export restrictions. That paid off. As High-Flyer improved, they realized that it was time to spin off “DeepSeek” in May 2023 with the goal of pursuing further AI capabilities with more focus.

So where is the lie?

your post is asking a lot of questions already answered by your posting

[-] msage@programming.dev 3 points 1 day ago

SemiAnalysis is “confident”

They did not answer anything, only alluded.

Just because they bought GPUs like everyone else doesn't mean they could not train it cheaper.

[-] self@awful.systems 7 points 18 hours ago

standard “fuck off programming.dev” ban with a side of who the fuck cares. deepseek isn’t the good guys, you weird fucks don’t have to go to a nitpick war defending them, there’s no good guys in LLMs and generative AI. all these people are grifters, all of them are gaming the benchmarks they designed to be gamed, nobody’s getting good results out of this fucking mediocre technology.

[-] veroxii@aussie.zone 32 points 1 day ago

banned from use by government employees in Australia

So is every other AI except copilot built into Microsoft products. Government employees can't use chatgpt directly. So this point is a bit disingenuous.

[-] dgerard@awful.systems 6 points 1 day ago

They specifically named this one, you don't have to make up reasons that somehow it doesn't count.

[-] Empricorn@feddit.nl 17 points 1 day ago

I'm sure the next AI will be the ethical, uncensored, environmentally sustainable one...

[-] skillissuer@discuss.tchncs.de 18 points 1 day ago

wait, 2021 was when crypto was still a thing vcs poured money into, so that might be yet another case of crypto to ai pivot

[-] Terrapinjoe@lemmy.world 12 points 1 day ago

Is that the whale mini boss from Dive Man's stage in MegaMan 4?

[-] dgerard@awful.systems 7 points 1 day ago
[-] Wigners_friend@lemm.ee 9 points 1 day ago

I'm still just impressed you can teach whales communism

[-] Taleya@aussie.zone 7 points 1 day ago

Pretty standard for AI - except for the first part

this post was submitted on 08 Feb 2025
81 points (100.0% liked)

TechTakes