this post was submitted on 25 Nov 2023

Machine Learning

[–] we_are_mammals@alien.top 1 points 11 months ago (1 children)

According to the scaling laws, the loss/error is approximated as

w0 + w1 * pow(num_params, -w2) + w3 * pow(num_tokens, -w4)

Bill wrote before that he'd been meeting with the OpenAI team since 2016, so he's probably pretty knowledgeable about these things. He might be referring to the fact that, after a while, you see sharply diminishing returns from increasing num_params: in the limit, the corresponding term disappears, but the others do not.
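
To make that concrete, here's a toy version of that formula with the coefficients fitted in the Chinchilla paper (Hoffmann et al., 2022); the exact numbers are just for illustration, what matters is the shape of the curve:

    # Chinchilla-style parametric loss: w0 + w1*N^-w2 + w3*D^-w4,
    # using the coefficients fitted by Hoffmann et al. (2022).
    def loss(num_params: float, num_tokens: float) -> float:
        E, A, alpha, B, beta = 1.69, 406.4, 0.34, 410.7, 0.28
        return E + A * num_params ** -alpha + B * num_tokens ** -beta

    # Hold data fixed at 1.4T tokens; each 10x in parameters buys less:
    for n in (1e9, 1e10, 1e11, 1e12):
        print(f"{n:.0e} params -> loss {loss(n, 1.4e12):.3f}")
    # The num_params term decays toward zero; the constant and data terms remain.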

[–] liongalahad@alien.top 1 points 11 months ago

I'm not an expert by any means, just someone who is interested and reads AI news, but lately it seems like optimisation and efficiency are doing more to improve LLM performance than increasing parameter counts. Research is also clearly pointing at different architectures, other than transformers, to improve performance. I'd be surprised if GPT-5, which is 2-3 years away, were just a mere development of GPT-4, i.e. an LLM with many more parameters. These statements from Bill seem a little short-sighted and contradictory to the general consensus.

I am also aware of the Dunning-Kruger effect and how it may be tricking me into thinking I somewhat understand things I have no idea about lol

[–] xbimba@alien.top 1 points 11 months ago

Did he forget that GPT is a computerized AI, not a human? It doesn't age; it keeps getting smarter and smarter.

[–] learn-deeply@alien.top 1 points 11 months ago (3 children)

Who cares what Bill Gates thinks? He doesn't do research or programming anymore, nor does he interact closely with people who do.

[–] sa7ouri@alien.top 1 points 11 months ago (1 children)
[–] learn-deeply@alien.top 1 points 11 months ago

It's not hard to figure out what Bill Gates has been up to, since he's a public figure.

[–] AGM_GM@alien.top 1 points 11 months ago (1 children)

I have no problem with a plateau for a while. GPT-4 is already very powerful, and its use cases are far from fully explored in all kinds of fields. A plateau that gives people, businesses, and institutions some time to get their heads properly around the implications of this tech before the next breakthrough would likely be for the best.

[–] jugalator@alien.top 1 points 11 months ago (3 children)

Research papers have also reported diminishing returns as models grow.

Hell, maybe even GPT-4 was hit by this, and that's why it is reportedly not a single giant language model but a mixture-of-experts design: eight 220B-parameter models trained for subtasks.

But I think even this architecture will run into issues; it's more of a crutch. Eventually you'll grow each of these subtask models too large and need to split them as well, and at some point the field each model covers becomes too small and niche. That sounds like the end of that road to me.

[–] AdoptedImmortal@alien.top 1 points 11 months ago

I mean, that is literally how any form of AGI will work. No one in the field has ever thought one model will be capable of reaching AGI. All these models are highly specialized for the tasks they are trained on. Any move towards AGI will mean getting many of these highly specialized AIs to work in conjunction with one another, much like how our own brains work.

[–] therealnvp@alien.top 1 points 11 months ago

You should double check what mixture of experts actually means 🙂
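
To spell it out: in a transformer MoE, a learned gate routes each individual token to a few expert feed-forward blocks inside a layer; the experts are not separately trained subtask models. A toy sketch in plain PyTorch (the sizes and top-2 routing are just illustrative):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoELayer(nn.Module):
        """Toy mixture-of-experts feed-forward layer with top-2 token routing."""
        def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
            super().__init__()
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                              nn.Linear(d_ff, d_model))
                for _ in range(n_experts))
            self.gate = nn.Linear(d_model, n_experts)  # learned router
            self.k = k

        def forward(self, x):  # x: (n_tokens, d_model)
            scores = F.softmax(self.gate(x), dim=-1)   # per-token expert scores
            topw, topi = scores.topk(self.k, dim=-1)   # pick k experts per token
            topw = topw / topw.sum(-1, keepdim=True)   # renormalize the k weights
            out = torch.zeros_like(x)
            for slot in range(self.k):
                for e, expert in enumerate(self.experts):
                    mask = topi[:, slot] == e          # tokens whose slot chose expert e
                    if mask.any():
                        out[mask] += topw[mask, slot, None] * expert(x[mask])
            return out

    print(MoELayer()(torch.randn(10, 64)).shape)  # torch.Size([10, 64])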

[–] interesting-_o_-@alien.top 1 points 11 months ago (1 children)

Could you please share a citation for the mentioned research papers?

Last I looked into this, the hypothesis was that increasing parameter count results in a predictable increase in capability, as long as training is correctly adapted.

https://arxiv.org/pdf/2206.07682.pdf

Very interested to see how these larger models that have plateaued are being trained!

[–] COAGULOPATH@alien.top 1 points 11 months ago

Could you please share a citation for the mentioned research papers?

I'm interested in seeing this as well.

He probably means that, although scaling might still deliver better loss reduction, this won't necessarily cash out to better performance "on the ground".

Subjectively, GPT-4 does feel like a smaller step than GPT-3 and GPT-2 were. Those had crazy novel abilities that their predecessors lacked, like GPT-3's in-context learning. GPT-4 displays no new abilities.* Yes, it's smarter, but everything it does was possible, to some limited degree, with GPT-3. Maybe this just reflects test saturation: GPT-4 performs so well that there's nowhere trivial left to go. But returns do seem to be diminishing.

(*You might think of multimodality, but they had to hack that into GPT-4. It didn't naturally emerge with scale, like, say, math ability.)

[–] svada123@alien.top 1 points 11 months ago

If it's half the improvement from 3.5 to 4, that's good enough for me.

[–] Purefact0r@alien.top 1 points 11 months ago (2 children)

We haven't seen huge models with verifiers and/or vector databases yet. OpenAI's latest approach from Let's Verify Step by Step and Q-learning is looking rather promising, as verifiers are observed to scale well with increased data.
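
The core trick there is basically best-of-n sampling with a trained verifier doing the ranking. A minimal sketch, where generate and verifier_score are hypothetical stand-ins for the generator LLM and the (process or outcome) reward model:

    import random

    def best_of_n(problem, generate, verifier_score, n=16):
        # Sample n candidate solutions, score each with the verifier,
        # and return the highest-scoring one. A process reward model
        # would instead score every intermediate step and aggregate.
        candidates = [generate(problem) for _ in range(n)]
        return max(candidates, key=lambda c: verifier_score(problem, c))

    # Dummy stand-ins so the sketch runs end to end:
    gen = lambda p: f"solution {random.randint(0, 9)}"
    score = lambda p, c: random.random()
    print(best_of_n("2 + 2 = ?", gen, score))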

[–] tgwhite@alien.top 1 points 11 months ago (1 children)

Any readings you like on Verifiers?

[–] pickle_milf@alien.top 1 points 11 months ago

The next step is to understand the learning paradigm of LLMs well enough that similar performance can be attained with a much smaller network.

[–] DigThatData@alien.top 1 points 11 months ago (1 children)

lol, we just unlocked a new paradigm; guaranteed we don't hit a plateau for at least another two years. Considering we're probably already on the verge of one or two more paradigm shifts on top of that, there's no real reason to anticipate a plateau in the immediate future regardless.

[–] visarga@alien.top 1 points 11 months ago

For now we might be able to 10x our language data, but the top-quality content has already been used.

Beyond that, I think synthetic data will rule. It needs to be validated or filtered somehow; I think we need to use agents and RL to make it high quality.

[–] gebregl@alien.top 1 points 11 months ago

Increasing model size is only the most obvious way of improving on LLMs. There are many ways of changing LLM architecture and combining them with models from other fields.

I for one am excited to hear of lots of new LLM discoveries and applications, even if they're not guaranteed.

[–] SicilyMalta@alien.top 1 points 11 months ago (3 children)

I remember when Bill Gates thought the Internet had reached its plateau...

[–] NotElonMuzk@alien.top 1 points 11 months ago (2 children)

So he was wrong once in a while. Doesn't mean he doesn't know what he's talking about.

[–] WalrusImpressive1115@alien.top 1 points 11 months ago

What about GPT23?

[–] ILikeCutePuppies@alien.top 1 points 11 months ago (8 children)

I think we'll get better models by having LLMs filter lower-quality data out of the training set, and also by using more machine-generated data, particularly in areas like code, where an AI can run billions of experiments and use the successes to better train the LLM. All of this is going to cost a lot more compute.

I.e., for coding, the LLM proposes an experiment, it is run, and it keeps trying until it's successful; good results are fed back into the LLM's training and it is penalized for bad results. Learning how to code has actually seemed to help LLMs reason better in other ways, so I would expect improving that to help significantly. At some point, if the coding is good enough, it might be able to write its own better LLM system.
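
Something like this loop, I imagine. llm_propose and the per-task tests here are hypothetical stand-ins; the point is that only samples which actually execute and pass get kept as training data:

    import os, subprocess, tempfile

    def passes(code, test, timeout=5.0):
        """Run a candidate program plus its test in a subprocess;
        exit code 0 marks the sample as good training data."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code + "\n" + test)
            path = f.name
        try:
            return subprocess.run(["python", path], capture_output=True,
                                  timeout=timeout).returncode == 0
        except subprocess.TimeoutExpired:
            return False
        finally:
            os.unlink(path)

    def collect_pairs(tasks, llm_propose, attempts=8):
        # Keep (prompt, solution) pairs whose tests pass; failures could
        # be kept too, as negative examples for the penalty signal.
        good = []
        for task in tasks:
            for _ in range(attempts):
                code = llm_propose(task["prompt"])
                if passes(code, task["test"]):
                    good.append((task["prompt"], code))
                    break
        return good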

[–] liongalahad@alien.top 1 points 11 months ago (1 children)

The plateau may be close, and GPT-5 may not be the huge step forward everyone expects, but that assumes GPT-5 won't change architecture, which is highly unlikely. GPT-5 is at least 2-3 years away, and the recent rumors about Q* show that AI research is actively looking elsewhere to boost capabilities. I will be utterly surprised if GPT-5 uses the same architecture as GPT-4 and is actually a minor step forward; I give Bill's prediction a 5% probability of being accurate.

[–] flintsmith@alien.top 1 points 11 months ago (2 children)

I liked this Q* review/speculation.

https://youtu.be/ARf0WyFau0A?si=9Y19DzMI2puKHWRA

According to this, you can get a 30x improvement by (what seems to my ignorant self) careful prompting: ask for stepwise logic, discard sus answers, and combine the best answers to get a glimpse of a much better-trained model.
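
As far as I can tell, that's essentially chain-of-thought plus self-consistency: sample many stepwise answers, drop the unparseable ones, and majority-vote. A rough sketch, with sample_cot and extract_answer as hypothetical stand-ins for the sampling LLM call and the final-answer parser:

    from collections import Counter

    def self_consistency(question, sample_cot, extract_answer, n=20):
        answers = []
        for _ in range(n):
            chain = sample_cot(question + "\nLet's think step by step.")
            ans = extract_answer(chain)   # None for "sus"/unparseable chains
            if ans is not None:
                answers.append(ans)
        # Majority vote over the surviving final answers:
        return Counter(answers).most_common(1)[0][0] if answers else None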

[–] Deep-Armadillo-865@alien.top 1 points 11 months ago

I can't read German and that website is impossible to navigate. Can someone explain the reasons he believes this? Does Bill Gates have any particular insight that the rest of the AI community would not?

[–] DevAnalyzeOperate@alien.top 1 points 11 months ago

We can get a lot of mileage just from better exploiting what we have now and making it cheaper. There's no reason every voice assistant can't be better than Google Assistant is now.

[–] Nouseriously@alien.top 1 points 11 months ago

"640k of RAM is more than anyone needs. "

[–] tonejac@alien.top 1 points 11 months ago

I think the big shift will come when we have full multimodality, plus the ability to spin up autonomous, hierarchical stacks of LLMs that work on our missions, creating projects and tasks to accomplish those missions on our behalf. It won't be a single bigger, more powerful LLM.

[–] jms4607@alien.top 1 points 11 months ago (1 children)

OpenAI is almost definitely not at a plateau, considering a technical breakthrough likely caused this Altman drama.

[–] imagine-grace@alien.top 1 points 11 months ago

This is the same Bill Gates who got blindsided by the internet, Netscape Navigator, search, social media, apps, app stores, mobile, and IoT, and who still struggles to make Windows safe from cyber threats and malware.

But I'm sure he's got his bearings on AI this time around. Let's pay close attention.

[–] TopTunaMan@alien.top 1 points 11 months ago (1 children)

I'm not buying it, and here's why.

First off, let's look at how AI has been moving. It's like, every time we think we've seen it all, something new pops up and blows our minds. Saying we've peaked already just doesn't sit right with how things have gone so far.

And then, tech's always full of surprises, right? We're playing with what we've got now, but who knows what crazy new stuff is around the corner? I'm talking about things like quantum computing or some wild new algorithms we haven't even thought of yet.

Also, let's be real: GPT-4 is cool, but it's not perfect. It gets stuff wrong, misses the point sometimes, and could definitely be better. So there's room for GPT-5 to step up and fix some of this stuff.

Plus, we're not running out of data or computing power anytime soon. These are only getting bigger and better, so it's kind of a no-brainer that AI will keep getting smarter.

And don't forget all the other fields feeding into AI. Stuff from brain science, language, you name it: all this can give AI a serious boost.

So, yeah, I get where Gates is coming from, but I think it's way too early to say we've hit the top. AI's still got a lot of room to grow and surprise us. Just my two cents!

[–] NotElonMuzk@alien.top 1 points 11 months ago

We need to build world models into these tools, because text is barely scratching the surface.
