this post was submitted on 15 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
founded 10 months ago
I’m not sure who told whom that Mistral models are only 8k or 4k. The sliding window is not the context size; the context size is set by the position embeddings, which go up to 32k.
The official Mistral product information.
Does Mistral themselves actually mention 32k anywhere?
It has 32k; they state it in their config: "max_position_embeddings": 32768. This is the sequence length.
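To illustrate the distinction, here is a minimal sketch that reads both values from an excerpt of Mistral-7B-v0.1's config.json (the `sliding_window` value of 4096 is from the published config, not from this thread):

```python
import json

# Excerpt of Mistral-7B-v0.1's config.json as published on the Hugging Face Hub.
config_json = """
{
  "max_position_embeddings": 32768,
  "sliding_window": 4096
}
"""

config = json.loads(config_json)

# The sliding window (4096) only bounds how far each attention layer looks back;
# positions are embedded up to 32768, which is the actual context size.
print(config["max_position_embeddings"])  # 32768
print(config["sliding_window"])           # 4096
```

Stacking layers lets information propagate beyond a single layer's window, which is why the effective context is the 32k position limit rather than the 4k window.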
https://preview.redd.it/5r2c9592vr0c1.png?width=256&format=png&auto=webp&s=be88f25168e3cec16cbe7f9aad15f678edf97e99