this post was submitted on 01 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


https://huggingface.co/TheBloke/MistralLite-7B-GGUF

This is supposed to be a 32k-context finetune of Mistral. I've tried the recommended Q5 version in both GPT4All and LM Studio, and it works for normal short prompts, but it hangs and produces no output when I crank the context length up to 8k+ for data cleaning. I tried it CPU-only (the machine has 32 GB of RAM, so that should be plenty) and hybrid, with the same bad outcome. Curious if there are some undocumented RoPE settings that need to be adjusted.
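For reference, here's a minimal sketch of how I'd try forcing the RoPE settings explicitly with llama-cpp-python instead of going through the GUI frontends, in case they're picking up bad defaults from the GGUF metadata. The rope_freq_base value is a guess based on MistralLite's reported raised rope theta, and the filename is just a placeholder for the local file:

```python
# Minimal sketch (not a verified fix) for loading the GGUF with explicit
# RoPE overrides via llama-cpp-python. Values marked "assumed" are guesses,
# not confirmed settings for this model.
from llama_cpp import Llama

llm = Llama(
    model_path="mistrallite.Q5_K_M.gguf",  # hypothetical local filename
    n_ctx=32768,               # request the full advertised 32k window
    rope_freq_base=1000000.0,  # assumed: MistralLite's raised rope theta
    rope_freq_scale=1.0,       # no linear scaling; finetune targets 32k natively
    n_gpu_layers=0,            # CPU-only, matching my setup above
)

out = llm("<long data-cleaning prompt here>", max_tokens=512)
print(out["choices"][0]["text"])
```

If it answers at 8k+ loaded this way, the frontends' defaults were the problem; if it still hangs, the GGUF itself is more likely at fault.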

Anyone get this to work with long prompts? Otherwise, what do y’all recommend for 32k+ context with good performance on data augmentation/cleaning, with <20B params for speed?

top 2 comments
[–] Chromix_@alien.top 1 points 10 months ago

You wrote that it works for short prompts. Did you also try slightly longer prompts (up to 4k tokens)? That range doesn't hit the sliding window yet, but it still produced very little useful output for me and some others.
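If you want to pin down exactly where it falls apart, a quick length sweep works. A rough sketch with llama-cpp-python (the filler text and target lengths are arbitrary, and the filename is a placeholder):

```python
# Rough sketch: sweep prompt lengths and eyeball where completions degrade.
# Assumes llama-cpp-python; model filename is hypothetical.
from llama_cpp import Llama

llm = Llama(model_path="mistrallite.Q5_K_M.gguf", n_ctx=32768)

filler = "The quick brown fox jumps over the lazy dog. "
per_filler = len(llm.tokenize(filler.encode("utf-8")))  # tokens per filler chunk

for target in (1024, 2048, 4096, 8192):
    # Build a prompt of roughly `target` tokens, then ask for a summary.
    prompt = filler * (target // per_filler)
    prompt += "\n\nSummarize the text above in one sentence:"
    out = llm(prompt, max_tokens=64)
    print(target, repr(out["choices"][0]["text"]))
```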

[–] Ok_Neck_@alien.top 1 points 9 months ago

You can try our hosted version and see if you get better results out of it.