this post was submitted on 28 Nov 2023
1 points (100.0% liked)

LocalLLaMA

3 readers
1 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 1 year ago
MODERATORS
 

https://huggingface.co/deepnight-research

I'm not affiliated with this group at all, I was just randomly looking for any new big merges and found these.

100B model: https://huggingface.co/deepnight-research/saily_100B

220B model: https://huggingface.co/deepnight-research/Saily_220B

600B model: https://huggingface.co/deepnight-research/ai1

โ€‹

They have some big claims about the capabilities of their models, but the two best ones are unavailable to download. Maybe we can help convince them to release them publicly?

you are viewing a single comment's thread
view the rest of the comments
[โ€“] wind_dude@alien.top 1 points 1 year ago

so it sounds like for the 600b they just finetuned llama2 again with the same stuff Llama2 was trained with, just more of it...

RefinedWeb

Opensource code from GitHub

Common Crawl we fine-tuned the model on a huge dataset (generated manually and with automation) for logical understanding and reasoning. We also trained the model for function calling capabilities.