LocalLLaMA

A community for discussing Llama, the family of large language models created by Meta AI.

Why is Mistral-7b so capable? Any ideas re: dataset? (alien.top)

submitted 10 months ago by Fun_Tangerine_1086@alien.top to c/localllama@poweruser.forum

24 comments

So Mistral-7b is a pretty impressive 7B param model ... but why is it so capable? Do we have any insights into its dataset? Was it trained very far beyond the scaling limit? Any attempts at open reproductions or merges to scale up # of params?

[–] meetrais@alien.top 1 points 10 months ago (3 children)

I second this. Mistral-7B gave me good results, and after fine-tuning its results are even better.

[–] PwanaZana@alien.top 1 points 10 months ago (1 children)

Are there any notable finetunes that you know of? I started using LLMs today, beginning with OpenOrca Mistral 7B, and it seems pretty good.

[–] meetrais@alien.top 1 points 10 months ago

On HuggingFace you can find many fine-tuned/quantized models. Look for models from TheBloke on HuggingFace.
