this post was submitted on 01 Jul 2026
102 points (94.7% liked)

Technology

top 22 comments
[–] green_goglin@thelemmy.club 15 points 1 day ago (1 children)

Advanced AI tools trained by AI agent predecessor(s) which were largely trained with Reddit shitposts.

[–] MangoCats@feddit.it 3 points 5 hours ago* (last edited 5 hours ago)

largely trained with Reddit shitposts.

How else do you expect the world to end...?

And those Redditors thought they'd never amount to anything.

[–] Korkki@lemmy.ml 66 points 1 day ago (1 children)

Ban only meant to create perceived value and hype for these models.

[–] Reddfugee42@lemmy.world 2 points 2 hours ago

How much you want to bet the Trump family bought tons of shares when the stock fell, right before they changed their mind and approved it?

[–] inari@piefed.zip 19 points 1 day ago (2 children)

Anthropic bent the knee, didn't it?

[–] CommanderCloon@lemmy.ml 8 points 13 hours ago (1 children)

China literally just put out an open-weight model that verifiably beats the unverified Mythos claims in some scenarios; the reality is that they most likely want the US to "win" the AI race (which they can't; closed models will always be beaten by open-weight ones, which is what China is doing).

They're already on shaky finances, having an export ban would just pop the bubble sooner (and what remains of the US economy along with it)

[–] MangoCats@feddit.it 1 points 4 hours ago

GLM-5.2? Gemini says you can run a competent 8-bit local cluster for around $100K purchase and under $20K/yr run costs.

[–] Malyca@lemmy.zip 5 points 1 day ago

Months ago. Right after they said they wouldn't.

[–] VaalaVasaVarde@sopuli.xyz 7 points 1 day ago

Until the next deranged post, when it's another model that is banned.

[–] pinball_wizard@lemmy.zip 2 points 1 day ago

Pencil pushers gonna push pencil.

I'll be surprised if we learn there was actually no dark web back room deal providing this stuff to anyone who didn't appreciate the law getting in the way.

[–] Valmond@lemmy.dbzer0.com 6 points 1 day ago

Yeah if Europe stops buying it, you gotta sell it elsewhere I guess?

[–] magnue@lemmy.world 1 points 1 day ago

I missed fable 5. Was nice to use in the 3 days I had access as long as it didn't refuse.

[–] jbloggs777@discuss.tchncs.de -1 points 1 day ago (1 children)

There is also a commercial aspect...

Bigger models are more expensive to train and serve.

Inference is currently insanely profitable if you have the hardware and the automation in place to support and serve it. At that point, it's a money printing machine, and you want to squeeze as much out of it as you can.

Training new models, meanwhile, is extremely expensive, and serving them probably makes less profit (at least initially).

Having an external brake applied to the frontier labs is likely good for their bottom line, while increasing hype and directing customers' annoyance away from them.

It's likely only a temporary benefit, though. The dragon will catch up and apply more pressure, both on inference price and capabilities.

[–] Dran_Arcana@lemmy.world 10 points 1 day ago (2 children)

Can you cite your source on the claim that "inference is currently insanely profitable"? Everything I read suggests that openai and anthropic lose money on their plans.

[–] jbloggs777@discuss.tchncs.de 1 points 21 hours ago (2 children)

My caveats were clearly stated... After capital expenditure, it's just operational costs, where electricity & cooling are the big ones.

At that point, it is insanely profitable to serve. The cheap API prices on open-weight models hint at the profit margins involved in the US (the frontier labs and hyperscalers don't open their books for us, unsurprisingly).

Therefore, the longer they can serve existing and lower cost models at the current rates, the better for their bottom line. It's just common sense in business.

It doesn't mean the company as a whole is profitable. I expect we'll see turmoil in the coming months and years, and the prize will be compute capacity, with electricity & cooling options.

[–] MangoCats@feddit.it 1 points 4 hours ago (1 children)

I just asked Gemini to estimate run costs for a local GLM-5.2 instance, something that a team of a few software engineers might use the way they are using Cursor today... power budget is 6 kW, which around here - after facility cooling costs - works out to around $1000 per month. Our Cursor subscriptions have $100 per month price tags on them for the developers who use them most extensively, and this local instance, at $100K to buy and $1K per month to run, isn't likely to serve more than a dozen engineers efficiently. Even if you can lease it out at full utilization 24 hours a day, it doesn't sound like much of a money printing machine to me, yet.

My $20/month home subscription to Claude? Even less so.
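A rough sketch of that comparison, using only the figures quoted above ($100K purchase, $1K per month to run, a dozen engineers, $100 per month per subscription). The hardware lifetimes of 5 and 10 years and the straight-line amortisation with no interest are assumptions added for illustration, and `monthly_cost_per_seat` is a made-up helper name.

```python
def monthly_cost_per_seat(capex, opex_per_month, seats, lifetime_years):
    """Straight-line amortised monthly cost per engineer, ignoring interest."""
    amortised_capex = capex / (lifetime_years * 12)  # purchase price spread over months
    return (amortised_capex + opex_per_month) / seats


# Figures quoted in the comment above.
CAPEX = 100_000          # purchase price in dollars
OPEX_PER_MONTH = 1_000   # power plus facility cooling, dollars per month
SEATS = 12               # "a dozen engineers"
SUBSCRIPTION = 100       # heaviest Cursor plan, dollars per month

# Assumed hardware lifetimes; not stated in the comment itself.
for years in (5, 10):
    cost = monthly_cost_per_seat(CAPEX, OPEX_PER_MONTH, SEATS, years)
    print(f"{years}-year lifetime: ${cost:.2f} per seat per month "
          f"vs ${SUBSCRIPTION} subscription")
```

Under these assumptions the local instance costs more per seat than the subscription at either lifetime, which is the point the comment is making.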

[–] jbloggs777@discuss.tchncs.de 1 points 3 hours ago

Economies of scale... And you said instance, which is AWS terminology... If you have the scale and the expertise to run a DC efficiently, expect significant savings. We pay a premium for opex over capex.

[–] Dran_Arcana@lemmy.world 2 points 6 hours ago (2 children)

You can't just write off capital expenditure though. The hardware, even for "efficient" MoE inference, is still very expensive to buy, house, run, and cool. Even assuming open-weight model serving at $0 R&D for the models themselves, mixing high-prefill workloads doesn't batch well with decode-heavy concurrency (or other prefill-heavy jobs). The moment you do anything nontrivial you start running into very complicated architectural problems to efficiently solve at scale.

Hardware that is useful for 5-10 years at most, plus development and support for the inference workflows, doesn't leave a lot of margin on the table.

My gut, along with basically everything I read, suggests that most (even pure inference) shops are not profitable and are still floating on loans or VC money.

[–] MangoCats@feddit.it 1 points 4 hours ago

At 10 years lifetime, it's sounding like the hardware costs as much to buy as it does to run - not factoring in time value of money...
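For what it's worth, a minimal sketch of that buy-versus-run comparison. The $100K purchase price and $1K per month run cost are the figures quoted elsewhere in this thread; the 5% discount rate is purely an assumed number to show what factoring in the time value of money would do, and `present_value_of_run_costs` is a hypothetical helper.

```python
def present_value_of_run_costs(annual_cost, years, discount_rate=0.0):
    """Present value of a constant annual cost paid at the end of each year.

    With discount_rate == 0 this is just annual_cost * years.
    """
    if discount_rate == 0:
        return annual_cost * years
    # Standard ordinary-annuity formula.
    return annual_cost * (1 - (1 + discount_rate) ** -years) / discount_rate


PURCHASE = 100_000      # quoted purchase price, dollars
ANNUAL_RUN = 12 * 1_000  # quoted $1K per month, dollars per year
YEARS = 10

undiscounted = present_value_of_run_costs(ANNUAL_RUN, YEARS)
discounted = present_value_of_run_costs(ANNUAL_RUN, YEARS, discount_rate=0.05)  # assumed rate

print(f"purchase: ${PURCHASE:,.0f}")
print(f"10 years of run costs, undiscounted: ${undiscounted:,.0f}")
print(f"10 years of run costs at 5% discount: ${discounted:,.0f}")
```

Undiscounted, ten years of running costs come to $120K against the $100K purchase, so the two are indeed in the same ballpark; any positive discount rate shrinks the run-cost side.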

[–] jbloggs777@discuss.tchncs.de 1 points 5 hours ago

If you assume they are unprofitable, the only question becomes whether they are more or less unprofitable by serving the older models for longer.

[–] stsquad@lemmy.ml 1 points 1 day ago (1 children)

I suspect it's profitable in the abstract - and their accountants would be bad at their jobs if they couldn't work out what utilisation rate you need to pay for the server runtime.

However how aggressively you amortise the cost of the training is the key, especially if you keep releasing new models every 6 months.
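As a sketch of the two calculations described here: the break-even utilisation for a server, and how amortising training cost over a shorter release cycle pushes it up. Every number below is a made-up placeholder, not a real figure from any lab, and both function names are hypothetical.

```python
HOURS_PER_MONTH = 730  # assumed average


def breakeven_utilisation(hourly_cost, tokens_per_hour_at_full_load, price_per_million_tokens):
    """Fraction of full load a server must sell to cover its hourly cost."""
    revenue_at_full_load = tokens_per_hour_at_full_load / 1_000_000 * price_per_million_tokens
    return hourly_cost / revenue_at_full_load


def training_cost_per_server_hour(training_cost, months_in_service, servers):
    """Training cost spread evenly over the model's service life and fleet."""
    return training_cost / (months_in_service * HOURS_PER_MONTH * servers)


# Placeholder inputs, chosen only to make the arithmetic easy to follow.
runtime_cost = 10.0           # dollars per server-hour (power, cooling, hardware)
throughput = 20_000_000       # tokens per server-hour at full load
price = 2.0                   # dollars per million tokens

serving_only = breakeven_utilisation(runtime_cost, throughput, price)

# Same fleet, now also paying back a training run before the next release.
training = training_cost_per_server_hour(7_300_000, months_in_service=10, servers=100)
with_training = breakeven_utilisation(runtime_cost + training, throughput, price)

print(f"break-even utilisation, serving only: {serving_only:.0%}")
print(f"break-even utilisation, with training amortised: {with_training:.0%}")
```

Halving `months_in_service` doubles the training share of the hourly cost, which is why the release cadence matters so much to the result.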

[–] MangoCats@feddit.it 1 points 4 hours ago* (last edited 4 hours ago)

20 years ago, after 20 years of watching computers get faster and cheaper, I felt like they were "fast enough" - I mean, sure, more faster is more better, but for everything I had used computers for up to that point, they were fast enough - hell, they were already streaming DVD quality video by then on "normal" laptops. Certainly computers today are much faster still, but so much of that performance feels wasted on bloat rather than enhancing actual user experience.

LLMs seem to be evolving faster. A year ago, they were nowhere near good enough, but you could see the potential, much like desktop computers in the mid 1980s. Just make them faster, more powerful, more storage, higher resolution, you'll really have something. Today, I feel about the LLMs (for code) almost like I felt about computers in 2006 - they're good enough. Of course they could always get better, but if I were stuck with what we've got today for the next 5 years, I wouldn't be too disappointed. The interesting question (that nobody seems to have a real answer for) is: how much better will they get? A year ago there were obvious rough edges that have quickly been smoothed off... how smooth can they actually get?

LLMs for graphic arts? Yeah, that feels like MS paint levels of performance at the moment, they definitely have room for improvement.