permalip

joined 11 months ago
[–] permalip@alien.top 1 points 11 months ago

It has 32k; they mention it in their config as "max_position_embeddings": 32768. This is the sequence length (see the snippet below for a quick check).

https://preview.redd.it/5r2c9592vr0c1.png?width=256&format=png&auto=webp&s=be88f25168e3cec16cbe7f9aad15f678edf97e99
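
A quick way to verify that figure from the config itself — a minimal sketch, assuming the transformers `AutoConfig` API and the `mistralai/Mistral-7B-v0.1` checkpoint:

```python
from transformers import AutoConfig

# Pull the model config from the Hub and read the declared sequence length
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")
print(config.max_position_embeddings)  # 32768 – the actual context size
print(config.sliding_window)           # 4096  – the attention window, not the context size
```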

[–] permalip@alien.top 1 points 11 months ago

There is nothing "true" about MistralLite's context length. What Amazon (MistralLite) or YaRN does is essentially remove the sliding window (see the sketch below).

https://preview.redd.it/rqe1hwc1vr0c1.png?width=256&format=png&auto=webp&s=79f14a98c097d2e8fb5718ffa4d524353b059a10
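
To make "removing the sliding window" concrete, here is a minimal sketch (illustrative only, not MistralLite's or YaRN's exact recipe) using the transformers Mistral config, where the window is just the `sliding_window` field:

```python
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")

# Widen the attention window to the full 32768 positions the model already has.
# This effectively disables windowed attention, but it adds no context beyond
# max_position_embeddings – the 32k was always there.
config.sliding_window = config.max_position_embeddings

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    config=config,
)
```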

[–] permalip@alien.top 1 points 11 months ago (1 children)

FYI, AutoAWQ released 0.1.7, which fixes multi-GPU support (it had broken with newer versions of the Hugging Face libraries) and should alleviate OOM issues on multi-GPU setups.

https://github.com/casper-hansen/AutoAWQ/releases/tag/v0.1.7
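
For anyone hitting this, a rough sketch of the upgrade plus a multi-GPU load — illustrative only; the exact kwargs depend on your AutoAWQ version, and the `TheBloke/Mistral-7B-v0.1-AWQ` checkpoint is just an example:

```python
# pip install -U "autoawq>=0.1.7"
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "TheBloke/Mistral-7B-v0.1-AWQ"  # example AWQ-quantized checkpoint

# device_map="auto" lets accelerate shard the quantized weights across GPUs
model = AutoAWQForCausalLM.from_quantized(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)
```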

[–] permalip@alien.top 0 points 11 months ago (4 children)

I’m not sure who told whom that Mistral models are only 8k or 4k. The sliding window is not the context size; the position embeddings determine the context size, which is 32k.
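
Rough arithmetic for why the window and the context size are different things — a sketch based on the Mistral paper's description of sliding-window attention, where information can propagate one window further per layer:

```python
window = 4096          # sliding_window in the config
num_layers = 32        # Mistral-7B transformer layers
max_positions = 32768  # max_position_embeddings, i.e. the actual context size

# Theoretical attention span after stacking all layers
theoretical_span = window * num_layers
print(theoretical_span)                      # 131072
print(min(theoretical_span, max_positions))  # 32768 – still bounded by the positions
```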