this post was submitted on 30 Oct 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


Wondering what everyone thinks in case this is true. It seems they're already beating all open source models including Llama-2 70B. Is this all due to data quality? Will Mistral be able to beat it next year?

Edit: Link to the paper -> https://arxiv.org/abs/2310.17680

https://preview.redd.it/kdk6fwr7vbxb1.png?width=605&format=png&auto=webp&s=21ac9936581d1376815d53e07e5b0adb739c3b06

artelligence_consult@alien.top 1 points 10 months ago

It is, given its age. If you were building it today, with what research has since shown, yes. But GPT-3.5 predates that research, so it would indicate a brutal knowledge advantage for OpenAI compared to the published state of the art.

ironic_cat555@alien.top 1 points 10 months ago

GPT-3.5 Turbo was released on March 1, 2023, for what it's worth, which makes it not a very old model.

artelligence_consult@alien.top 1 points 10 months ago

Only if you assume that 3.5 TURBO is not a TURBO version of GPT 3.5. That would put the original release well before March 2023, likely with 6 months or more of training and tuning before that. So you're saying that when they made the Turbo version, they started fresh with new training data and an approach based on the MS Orca paper, which was released in June, and still didn't change the version number?

Let me just say your assumption holds together by a bare thread of logic.

ironic_cat555@alien.top 1 points 10 months ago

Oh it's a TURBO version you say? Is that a technical term? I never said whatever you seem to think I said.