this post was submitted on 25 Nov 2023
LocalLLaMA
Fully open source?
The source is actually available (which is good), but sadly the dataset is not (which is bad, and means it's not truly open, since you can't reliably reproduce it).
Um, the dataset is open source; it's all public HF datasets:
"World = Some_Pile + Some_SlimPajama + Some_StarCoder + Some_OSCAR + All_Wikipedia + All_ChatGPT_Data_I_can_find"
"Some" as in a customized subset.