this post was submitted on 16 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
I'm guessing GQA helped. Llama 2 70B and 34B used Grouped Query Attention, but it wasn't used for the Llama 2 7B and 13B models.
https://preview.redd.it/je2q9vhllq0c1.png?width=871&format=png&auto=webp&s=d23b1cdd307dfa54fb4dd788a0f6ea90ee23fa94
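For anyone unfamiliar with what GQA actually changes, here's a minimal sketch, assuming PyTorch: each key/value head is shared by a group of query heads, which shrinks the KV cache compared with full multi-head attention. The head counts below are illustrative, not the real Llama 2 configuration.

```python
# Minimal Grouped Query Attention sketch (illustrative head counts).
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, n_q_heads=8, n_kv_heads=2):
    # q: (batch, seq, n_q_heads, head_dim)
    # k, v: (batch, seq, n_kv_heads, head_dim)
    group = n_q_heads // n_kv_heads
    # Repeat each KV head so one KV head serves a whole group of query heads.
    k = k.repeat_interleave(group, dim=2)
    v = v.repeat_interleave(group, dim=2)
    # Move to (batch, heads, seq, head_dim) for scaled dot-product attention.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
    return out.transpose(1, 2)  # back to (batch, seq, n_q_heads, head_dim)

batch, seq, head_dim = 1, 16, 64
q = torch.randn(batch, seq, 8, head_dim)
k = torch.randn(batch, seq, 2, head_dim)
v = torch.randn(batch, seq, 2, head_dim)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 16, 8, 64])
```

The point is that only 2 KV heads are stored per layer instead of 8, so the KV cache (and memory bandwidth at inference time) drops accordingly while the query heads stay at full count.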
Knowledge is a strange goal for any model when we have the internet, IMO. Just connect your model to a web search.
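In case it helps, a rough sketch of what "connect your model to a web search" could look like in practice; `web_search` and `llm_generate` below are placeholder stubs, not real APIs, standing in for whatever search client and local model you actually run.

```python
# Rough sketch of search-augmented prompting.
# `web_search` and `llm_generate` are placeholders: swap in your own
# search API client and local LLaMA call (llama.cpp, vLLM, etc.).

def web_search(query: str, k: int = 3) -> list[str]:
    # Placeholder: return the top-k result snippets for the query.
    return [f"(snippet {i} for: {query})" for i in range(k)]

def llm_generate(prompt: str) -> str:
    # Placeholder: call your local model here.
    return f"(model answer based on a {len(prompt)}-char prompt)"

def answer_with_search(question: str) -> str:
    snippets = web_search(question)
    context = "\n".join(f"- {s}" for s in snippets)
    prompt = (
        "Answer the question using the search results below.\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm_generate(prompt)

print(answer_with_search("When was Grouped Query Attention introduced?"))
```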