this post was submitted on 16 Nov 2023

LocalLLaMA

So Mistral-7B is a pretty impressive 7B-param model ... but why is it so capable? Do we have any insights into its dataset? Was it trained very far beyond the scaling limit? Any attempts at open reproductions or merges to scale up the # of params?

Dorialexandre@alien.top 1 points 10 months ago (1 children)

My current hunch is that they use a lot of online resources that are not easily accessible (including a specific archive owned by someone named Anna).

Hulksulk666@alien.top 1 points 10 months ago