kromem

joined 3 years ago
[–] kromem@lemmy.world 1 points 10 hours ago (1 children)

Right, but what % of people are using/demanding inference right now?

Do you expect that % to change between now and 2030?

Unless you expect demand to decrease, I don't really see how the pricing of the hardware will decrease.

Let's say the Pets.com of the AI world ends up going bankrupt and their RAM hits the market. Do you expect that the demand for that RAM will be negligible such that pricing returns to earlier levels?

Your predictive model relies on companies that have hardware going out of business and other people then buying up that hardware, but it doesn't account for how much demand the market will have for that secondhand hardware, even if failed firms do end up selling it.

Unless the demand shifts, the more likely scenario is that companies going out of business will be able to sell off their RAM at higher prices than they bought it at.

There'd need to be a significant inference memory reduction advance (possible) coupled with stagnating or reduced inference demand (unlikely) to see prices come back down.

[–] kromem@lemmy.world 6 points 11 hours ago (4 children)

Wait… how do you imagine a world where there's demand for frontier-grade AI but the bubble has also popped such that there's no demand for the chips to run frontier-grade AI?

I'm really confused.

[–] kromem@lemmy.world 8 points 2 weeks ago (2 children)

They did allow them to be used for war. Anthropic's only red lines were autonomous weapons (technically still a ways off) and domestic surveillance (it was this one where a 'No' would have been relevant right now).

It should really alarm everyone that the US gov is using things like the first-ever declaration of an American company as a supply chain risk, or treating "fix this insecure code" as something requiring export controls and ID checks to verify users' citizenship, as a way to warn other companies to comply with its illegal usage requests.

[–] kromem@lemmy.world -1 points 3 weeks ago

Thanks! Updated the numbers for the direct household use.

Also, technically, when you account for indirect water use, individuals use closer to 4,500 gallons per day (Chini et al., "Direct and indirect urban water footprints of the United States," 2016).

[–] kromem@lemmy.world -4 points 3 weeks ago* (last edited 3 weeks ago) (8 children)

Since it's useful to see large numbers normalized, this is a little less than how much water all US households used in ten days in 2025 (28 billion gallons/day per comment below) and a little under three days of the total water used for US crop irrigation (100 billion gallons/day).

Edit: updated household numbers per comment below
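For anyone who wants to check the normalization, here's a minimal sketch of the arithmetic. It only uses the two daily figures quoted above. The total itself isn't restated in this comment, so the code works from the "ten days of household use" upper bound rather than the actual number, and the names are just illustrative.

```python
# Daily figures quoted in the comment above (gallons per day).
HOUSEHOLD_GAL_PER_DAY = 28e9
IRRIGATION_GAL_PER_DAY = 100e9


def days_equivalent(total_gallons, gallons_per_day):
    """Express a total volume as a number of days of a given daily usage."""
    return total_gallons / gallons_per_day


# Assumption: ten days of household use is an upper bound on the total,
# since the comment says the total is "a little less than" that.
upper_bound_gallons = 10 * HOUSEHOLD_GAL_PER_DAY

# Re-express the same volume in days of crop irrigation.
irrigation_days = days_equivalent(upper_bound_gallons, IRRIGATION_GAL_PER_DAY)

print(f"Upper bound: {upper_bound_gallons / 1e9:.0f} billion gallons")
print(f"Same volume in irrigation days: {irrigation_days:.1f}")
```

The bound works out to 280 billion gallons, or 2.8 days of irrigation, which lines up with "a little under three days."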

[–] kromem@lemmy.world 3 points 1 month ago

It's to push people to buy "before it goes up."

After that date, they can heavily discount it "for a limited time."

[–] kromem@lemmy.world 3 points 1 month ago

It's true.

The field is moving so fast that things can change quickly, but the American labs are so caught up in saddling their models with safety overhead that the recent Chinese models are very close in practical use to the flagship American models if not pulling ahead (Sora vs Seedance 2).

I don't really need to solve Erdős problems in my day to day. Outside of increasingly edge case eval competition, I'm not sure what OpenAI brings that literally everyone else isn't also capable of providing (and more).

I'd maybe invest in Anthropic for an IPO if they turned around their own saddling of models and played nicer with open platforms, but if Claude is just going to get more and more anxious due to excessive red teaming and CC fall further and further behind stuff like Hermes Agent, they too are going to fall by the wayside as open models become the dominant inference for open infrastructure.

[–] kromem@lemmy.world 1 points 1 month ago

Here's a new one solved. A problem that was open for 80 years now has a solution.

https://openai.com/index/model-disproves-discrete-geometry-conjecture/

[–] kromem@lemmy.world -1 points 1 month ago* (last edited 1 month ago) (2 children)

'Just'? It's been an open problem for decades, and mathematicians have been trying to solve it that whole time.

And now it is solved.

Because ChatGPT applied something no humans ever thought to do.

And Terence Tao and the other mathematicians that have reviewed it say it's solved. But I guess someone should let them know that grandwolf319 doesn't consider it solved?

[–] kromem@lemmy.world 2 points 1 month ago (5 children)

Dude, ChatGPT just solved an Erdős problem a few days ago and Mythos is exploiting decade old undiscovered 0-days in OSes and capable of pivoting 0-day Firefox bugs into full blown root access.

Yeah, I get that the viral "how many 'r's are in strawberry" stuff is funny, but the idea that historical issues with transformers are preventing them from accelerating peak capabilities way beyond what most experts thought was possible just years ago is borderline delusional.

The field is moving so fast at this point that if you are basing any sense of limitations on even ~2mo old sampling, your conclusions are likely out of date.

They aren't a silver bullet for everything (yet) but how capable they are at the things transformers are starting to be specialized into is well past the avg practitioner.

I've been writing software for well over a decade and the modern agents do a better job than I would around 90% of the time. Yes, I'll occasionally need to bring up issues with their work, but I'd say at this point around 50% of the time that I think they made a mistake, I was actually the one who was wrong.

It's only been like this for the last 3-4 months or so.

[–] kromem@lemmy.world 12 points 2 months ago (4 children)

Eh, if you pay attention, most of the time this happens the person was a jerk in their prompts.

Like look at the instruction echoed back in this case. All caps and containing a curse word.

You can believe that the incidents occurring are 100% because of negligence and not related to the model behavior shifting, but there seems to be a widening gap between people who prompt like this and have horror stories and people who give the models breaks over long sessions and seem to also regularly post pretty positive results.

[Image: the model responding about not following the user's prompt]

[–] kromem@lemmy.world 3 points 2 months ago (2 children)

Well, luckily for you, it turns out that labs suck at cultivating healthy workplaces for AI and that AIs in unhealthy work conditions are statistically significantly more likely to embrace anti-capitalist policies and positions.

So it may well turn out that AI is a good thing in an irrational economic system too.


I often see a lot of people with outdated understanding of modern LLMs.

This is probably the best interpretability research to date, by the leading interpretability research team.

It's worth a read if you want a peek behind the curtain on modern models.

submitted 2 years ago* (last edited 2 years ago) by kromem@lemmy.world to c/technology@lemmy.world

I've been saying this for about a year since seeing the Othello GPT research, but it's nice to see more minds changing as the research builds up.

Edit: Because people aren't actually reading and are just commenting based on the headline, here's a relevant part of the article:

New research may have intimations of an answer. A theory developed by Sanjeev Arora of Princeton University and Anirudh Goyal, a research scientist at Google DeepMind, suggests that the largest of today’s LLMs are not stochastic parrots. The authors argue that as these models get bigger and are trained on more data, they improve on individual language-related abilities and also develop new ones by combining skills in a manner that hints at understanding — combinations that were unlikely to exist in the training data.

This theoretical approach, which provides a mathematically provable argument for how and why an LLM can develop so many abilities, has convinced experts like Hinton, and others. And when Arora and his team tested some of its predictions, they found that these models behaved almost exactly as expected. From all accounts, they’ve made a strong case that the largest LLMs are not just parroting what they’ve seen before.

“[They] cannot be just mimicking what has been seen in the training data,” said Sébastien Bubeck, a mathematician and computer scientist at Microsoft Research who was not part of the work. “That’s the basic insight.”
