this post was submitted on 28 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
This model seems to be very broken. I attempted to quantize it as well, and it devolves into nonsense or repeats words endlessly no matter the settings. :/
All yi models are extremely picky when it comes to things like prompt format, end string, and rope parameters. You'll get gibberish from any of them unless you get everything set up just right, at which point they perform very well.
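To make "prompt format" and "end string" concrete: the Yi chat finetunes are trained on a ChatML-style template with `<|im_start|>`/`<|im_end|>` markers, and generation should stop on `<|im_end|>`. Here's a minimal sketch of building such a prompt; the helper function and role names are illustrative, not from any particular library:

```python
# Build a ChatML-style prompt, the template Yi chat models are trained on.
# Getting the markers and the stop string exactly right is what "set up
# just right" means here -- a missing <|im_end|> or a stray newline is
# often enough to produce gibberish or endless repetition.

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"
STOP_STRINGS = [IM_END]  # pass to your backend's stop/antiprompt setting

def build_chatml_prompt(messages):
    """messages: list of (role, content) pairs, e.g. ("user", "Hi")."""
    parts = [f"{IM_START}{role}\n{content}{IM_END}\n"
             for role, content in messages]
    # Leave the assistant turn open so the model continues from here.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Why is the sky blue?"),
])
print(prompt)
```

Whatever frontend you use, make sure its stop string matches `<|im_end|>` exactly, or the model will run past the end of its turn.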
Thanks for confirming this. I've seen so much praise for these models, yet I've experienced no end of problems trying to get decent, consistent output. A couple of Yi finetunes seem better than others, but there are still too many problems for me to prefer them over alternatives (for RP/chat purposes).
I'm still hopeful it's just a matter of time (and a fair amount of trial and error) before I, app developers, and model mixers work out how to get fantastic, consistent out-of-the-box results.
It's a new foundation model, so some teething pains are to be expected. Yi is heavily based on llama2 (directly copied, for the most part), but there are just enough differences in the training parameters that default llama2 settings don't give good results. KCPP has already addressed the rope scaling, and I'm sure it's only a matter of time before the other issues are hashed out.
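One concrete difference: Yi was trained with a rope theta of 5,000,000, versus llama2's 10,000, so a loader that assumes the llama2 default will mangle attention until you override it. A sketch of the overrides as keyword arguments in the style of llama-cpp-python's `Llama` constructor (parameter names are my assumption, so check them against your backend's version; newer GGUF conversions embed these values in the file, in which case no override is needed):

```python
# Settings that commonly need overriding for Yi models on backends that
# fall back to llama2 defaults. rope_freq_base is Yi's documented rope theta.
yi_overrides = {
    "rope_freq_base": 5_000_000.0,  # llama2 default is 10_000
    "rope_freq_scale": 1.0,         # no linear scaling at native context
    "n_ctx": 4096,                  # base Yi-34B context window is 4k
}

# Hypothetical usage (assumes llama-cpp-python and a local GGUF file):
#   from llama_cpp import Llama
#   llm = Llama(model_path="yi-34b-chat.Q4_K_M.gguf", **yi_overrides)
print(yi_overrides["rope_freq_base"])
```

If your backend exposes these as command-line flags instead, the same values apply; the point is just not to trust the llama2 defaults.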