this post was submitted on 17 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
What is their differentiating factor, or are they planning on being another one of the maybe hundred or so companies copy-pasting the same basic architecture and the same basic training data?
I think the proliferation of smaller LLMs is wonderful, but none have really made a dent in the capabilities of the best closed-source models (mostly OpenAI's), which is largely due to model size. Beyond model size, there seems to be no real innovation happening in architecture, design, or UX between Falcon, Mistral, Llama, Yi, etc., etc.
LLMs seem like a black hole in the VC space, gambling at the level of billionaires. It reminds me of the talk Warren Buffett gave years back on how difficult it is to predict winners even when you know a technology is inevitable:
And also with respect to airline companies:
Taken from The Snowball by Alice Schroeder
This was an insightful comment. The winnowing effect of market conditions should not be underestimated.
I love the Wild West that is the local LLM scene right now, but I wonder how long the party will last. I predict that the groups with the capacity to produce novel, state-of-the-art LLMs will be seduced by profit to keep those models closed, and as those models that could run on consumer hardware become increasingly capable, the safety concerns (legitimate or not) will eventually smother their open nature. We may continue to get weights for toy versions of those new flagship models, but I suspect their creators will reserve the top-shelf stuff for their subscription customers, and they can easily cite safety as a reason for it. I can't really blame them, either. Why give it away for free when you can become rich off your invention?
Hopefully I'll be proven wrong. 🤞 We'll see...
I disagree. I would argue GPT-4 is useful because of the sea of augmentations built up around it.
If it were just raw responses from a single base model (like most local LLMs), with no preprocessing, I believe GPT-4 would be much less impressive.