this post was submitted on 30 Oct 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
founded 1 year ago
No fucking way. GPT-3 has 175B params. There is no way they discovered some "secret sauce" to make an ultra-smart 20B model. The TruthfulQA paper suggests that bigger models tend to score worse, and ChatGPT's TruthfulQA score is strikingly bad. The papers behind today's impressive open-source models are at most 12-20 months old. The Turbo version is probably just quantized, that's all.
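For context on the "probably quantized" point: quantization shrinks a model's memory footprint by storing weights in fewer bits, at the cost of a small rounding error. Here's a minimal toy sketch of symmetric 8-bit post-training quantization in NumPy (illustrative only; real schemes like GPTQ or AWQ use per-group scales and are much more sophisticated):

```python
import numpy as np

# Toy post-training quantization: store fp32 weights as int8 plus a
# per-tensor scale, cutting weight memory roughly 4x.
rng = np.random.default_rng(0)
weights = rng.normal(size=(1024, 1024)).astype(np.float32)

scale = np.abs(weights).max() / 127.0          # symmetric per-tensor scale
q = np.round(weights / scale).astype(np.int8)  # quantized weights
deq = q.astype(np.float32) * scale             # dequantized at inference time

print("fp32 bytes:", weights.nbytes)  # 4x the int8 size
print("int8 bytes:", q.nbytes)
print("max abs error:", float(np.abs(weights - deq).max()))
```

The rounding error is bounded by half the scale, which is why quantized models keep most of their quality while fitting in far less memory.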
I think it's plausible. GPT-3.5 isn't ultra smart. It's very good most of the time, but it has clear limitations.
Seeing what Mistral achieved with 7B, I'm sure we can get something similar to GPT-3.5 at 20B given state-of-the-art training and data. I'm sure OpenAI is also using some tricks that haven't been released to the public.