permalip

joined 10 months ago
[–] permalip@alien.top 1 points 10 months ago

It has 32k; they mention it in their config: "max_position_embeddings": 32768. This is the sequence length.

https://preview.redd.it/5r2c9592vr0c1.png?width=256&format=png&auto=webp&s=be88f25168e3cec16cbe7f9aad15f678edf97e99
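
To make this concrete, here is a minimal sketch of reading that value straight from the config (assuming the transformers library and the public mistralai/Mistral-7B-v0.1 checkpoint):

```python
# Minimal sketch: read the sequence length from the model config.
# Assumes transformers is installed and the public Mistral 7B checkpoint is used.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print(config.max_position_embeddings)  # 32768 -> the 32k sequence length
```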

[–] permalip@alien.top 1 points 10 months ago

There is nothing "true context length" about MistralLite. You are essentially removing the sliding window, which is what Amazon and YaRN are doing.

https://preview.redd.it/rqe1hwc1vr0c1.png?width=256&format=png&auto=webp&s=79f14a98c097d2e8fb5718ffa4d524353b059a10
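
For illustration, a minimal sketch of what "removing the sliding window" amounts to on the config side, assuming the transformers Mistral implementation treats a null sliding_window as full causal attention. MistralLite and YaRN reportedly also adjust RoPE settings and fine-tune on long inputs, which this snippet does not cover:

```python
# Minimal sketch: disable sliding-window attention via the config.
# Assumption: transformers' Mistral code falls back to a plain causal mask
# when sliding_window is None. Long-context fine-tuning is NOT done here.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
config.sliding_window = None  # attend over all positions instead of a 4k window

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    config=config,
)
```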

[–] permalip@alien.top 1 points 10 months ago (1 children)

FYI, AutoAWQ 0.1.7 has been released with a fix for multi-GPU. It should alleviate the OOM issues on multi-GPU setups that appeared with newer versions of the Hugging Face libraries.

https://github.com/casper-hansen/AutoAWQ/releases/tag/v0.1.7
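
For reference, a minimal sketch of the standard AutoAWQ loading path (the checkpoint name is just an example AWQ-quantized repo); with 0.1.7, per the release above, this should no longer OOM when the weights are spread across multiple GPUs:

```python
# Minimal sketch: load an AWQ-quantized model with AutoAWQ >= 0.1.7.
# The checkpoint name is illustrative; swap in any AWQ-quantized model.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

quant_path = "TheBloke/Mistral-7B-v0.1-AWQ"  # example AWQ checkpoint

model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path)
```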

[–] permalip@alien.top 0 points 10 months ago (4 children)

I’m not sure who told whom that Mistral models are only 8k or 4k. The sliding window is not the context size; the max position embeddings determine the context size, which is 32k.
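
A quick way to see that these are two separate config fields (a minimal sketch, again assuming transformers and the public Mistral 7B checkpoint):

```python
# Minimal sketch: the sliding window and the context size are different fields.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print(config.sliding_window)           # 4096  -> per-layer attention window
print(config.max_position_embeddings)  # 32768 -> the actual 32k context size
```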