this post was submitted on 12 Nov 2023
LocalLLaMA
Llama 2 was pre-trained on older data (from before ChatGPT-generated text had significantly poisoned the web):
https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md
"Data Freshness: The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023."
"Model Dates: Llama 2 was trained between January 2023 and July 2023."
StableLM 3B was trained on more recent datasets (cutoff of March 2023), yet it doesn't show this much ChatGPT poisoning:
https://huggingface.co/stabilityai/stablelm-base-alpha-3b-v2
https://preview.redd.it/gl46fo50n10c1.png?width=518&format=png&auto=webp&s=c7cae52b292dcba45dee735a4ca7efac5630a4ae