There is no "true" context length to MistralLite. You are essentially removing the sliding window, which is the same thing Amazon or YaRN is doing.
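To illustrate what "removing the sliding window" means, here is a toy sketch (not MistralLite's or anyone's actual code): a sliding-window causal mask versus the plain causal mask you get once the window is dropped.

```python
# Toy illustration: sliding-window attention vs. full causal attention.
def causal_mask(seq_len, window=None):
    """mask[i][j] is True where position i may attend to position j."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            visible = j <= i  # causal: no attending to the future
            if window is not None:
                visible = visible and (i - j < window)  # window limit
            row.append(visible)
        mask.append(row)
    return mask

windowed = causal_mask(6, window=3)
full = causal_mask(6)  # "window removed" -> every past token is visible
print(windowed[5])  # [False, False, False, True, True, True]
print(full[5])      # [True, True, True, True, True, True]
```

Even with the window in place, information still flows beyond it across stacked layers; removing it just lets every layer attend to the full past directly.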
FYI, AutoAWQ released 0.1.7, which fixes multi-GPU support. It should alleviate the multi-GPU OOM issues that appeared with newer versions of the Hugging Face libraries.
https://github.com/casper-hansen/AutoAWQ/releases/tag/v0.1.7
I’m not sure who told who that Mistral models are only 8k or 4k. The sliding window is not the context size; it is the position embeddings that determine the context size, which is 32k.
It has 32k; they state it in their config: "max_position_embeddings": 32768. That is the sequence length.
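A minimal sketch of reading those two fields, using the values as published in Mistral-7B's config.json (parsed here from a literal rather than fetched from the Hub):

```python
import json

# The two context-related fields from Mistral-7B's config.json.
config = json.loads("""
{
  "max_position_embeddings": 32768,
  "sliding_window": 4096
}
""")

# max_position_embeddings is the model's sequence length (32k);
# sliding_window only bounds how far each attention layer looks back.
print(config["max_position_embeddings"])  # 32768
print(config["sliding_window"])           # 4096
```

The same fields are exposed as attributes if you load the config with `transformers.AutoConfig.from_pretrained`.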
https://preview.redd.it/5r2c9592vr0c1.png?width=256&format=png&auto=webp&s=be88f25168e3cec16cbe7f9aad15f678edf97e99