According to the scaling laws, the loss/error is approximated as:

    w0 + w1 * pow(num_params, -w2) + w3 * pow(num_tokens, -w4)

where num_params is the model's parameter count and num_tokens is the number of training tokens.
Bill wrote before that he'd been meeting with the OpenAI team since 2016, so he's probably pretty knowledgeable about these things. He might be referring to the fact that, after a while, you see sharply diminishing returns from increasing num_params alone. In the limit, the corresponding term disappears, but the others do not.
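To make that concrete, here's a minimal sketch that evaluates the formula with num_tokens held fixed; the coefficient values are made up for illustration, not fitted from any real model.

    # Sketch of the scaling-law formula above. The w0..w4 values are
    # invented for illustration, not real fits.
    w0, w1, w2, w3, w4 = 1.8, 500.0, 0.35, 450.0, 0.30

    def loss(num_params, num_tokens):
        """Approximate loss as a function of model size and data size."""
        return w0 + w1 * pow(num_params, -w2) + w3 * pow(num_tokens, -w4)

    # Hold num_tokens fixed and scale num_params: the params term shrinks
    # toward zero, but the irreducible w0 and the data term remain.
    num_tokens = 1e12
    for num_params in (1e8, 1e9, 1e10, 1e11, 1e12):
        print(f"{num_params:.0e} params -> loss ~ {loss(num_params, num_tokens):.3f}")

With these toy numbers the loss flattens out near w0 + w3 * pow(num_tokens, -w4) no matter how large num_params gets, which is exactly the diminishing-returns behavior described above.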