this post was submitted on 21 Sep 2025
25 points (100.0% liked)


Industry researchers dispute DeepSeek's unusually low training cost claims

top 3 comments
[–] danglybits27@sh.itjust.works 3 points 1 week ago

More clarification in this article:

https://www.theregister.com/2025/09/19/deepseek_cost_train/

"But, that's not actually what happened. Never mind the fact that $300,000 won't buy you anywhere close to 512 H800s (those estimates are based on GPU lease rates not actual hardware costs), the researchers aren't talking about end-to-end model training.

Instead, it focuses on the application of reinforcement learning used to imbue its existing V3 base model with "reasoning" or "thinking" capabilities.

In other words, they'd already done about 95 percent of the work by the time they'd reached the RL phase detailed in this paper."
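
To make the lease-rate point concrete, here is a rough back-of-the-envelope sketch. The $300,000 budget and the 512-GPU count come from the quote above; the hourly lease rate is a purely hypothetical placeholder and is not taken from the article or the paper, so swap in whatever rate you think is realistic.

```python
def budget_per_gpu(budget_usd: float, gpu_count: int) -> float:
    """Dollars available per GPU if the budget were spent on buying hardware."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return budget_usd / gpu_count


def implied_lease_hours(budget_usd: float, gpu_count: int,
                        rate_usd_per_gpu_hour: float) -> float:
    """Wall-clock hours the budget covers when every GPU is leased at a flat rate."""
    if gpu_count <= 0 or rate_usd_per_gpu_hour <= 0:
        raise ValueError("gpu_count and rate must be positive")
    return budget_usd / (gpu_count * rate_usd_per_gpu_hour)


BUDGET_USD = 300_000   # figure quoted in the article excerpt
GPU_COUNT = 512        # H800 count quoted in the article excerpt
ASSUMED_RATE = 2.0     # HYPOTHETICAL lease rate in USD per GPU-hour, not from the source

per_gpu = budget_per_gpu(BUDGET_USD, GPU_COUNT)
hours = implied_lease_hours(BUDGET_USD, GPU_COUNT, ASSUMED_RATE)

print(f"Budget per GPU if buying hardware: ${per_gpu:,.0f}")
print(f"Lease time covered at the assumed rate: {hours:.0f} hours "
      f"(about {hours / 24:.1f} days)")
```

With these placeholder inputs the budget comes to roughly $586 per GPU, which fits the article's point that the figure can't be a hardware purchase, and to about 293 hours of lease time across all 512 GPUs. Neither number says anything about the cost of training the base model.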

[–] 87Six@lemmy.zip 2 points 1 week ago (1 children)

GPU smuggling, that's how. Not sure they would include the smuggled GPUs in the numbers.

[–] 87Six@lemmy.zip 4 points 1 week ago* (last edited 6 days ago)

Yup. No H200, A100, or similar mentioned, only H800s, which were only banned in 2023. That's why the cost was low. Part of the GPU pool used was illegally smuggled and not counted.