this post was submitted on 24 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
I encounter this a lot with the Yi 34B models, to the point where I've basically stopped using them for chat. I've tried a huge variety of settings, presets, and quants; I've used koboldcpp and text-generation-webui; I've used EXL2, GGML, and GPTQ. The issue appears consistently once the context grows past a certain size: partial or entire messages start repeating. The model also gets stuck, where regenerating always produces the same response unless I make drastic changes to the settings, and even then it usually just changes which message it's stuck on. Smaller changes to the settings only slightly reword the stuck message.
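For reference, the kind of sampler knobs I've been sweeping look roughly like this with llama-cpp-python (the model path, context size, and exact values below are just illustrative placeholders, not a recommended fix):

```python
from llama_cpp import Llama

# Placeholder model file and context window, just to show the shape of the call.
llm = Llama(model_path="yi-34b-chat.Q4_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Continue the conversation..."}],
    temperature=0.8,      # randomness of sampling
    top_p=0.9,            # nucleus sampling cutoff
    top_k=40,             # consider only the top-k candidate tokens
    repeat_penalty=1.15,  # penalize tokens already present in the context
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```

Tweaking repeat_penalty, temperature, and top-p/top-k like this is what I mean by "smaller changes"; none of it has fixed the looping once the context gets long.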