
I use q4_K_M in both cases.

[–] out_of_touch@alien.top 1 points 9 months ago (3 children)

I'm curious what results you're seeing from the Yi models. I've been playing around with LoneStriker_Nous-Capybara-34B-5.0bpw-h6-exl2 and more recently LoneStriker_Capybara-Tess-Yi-34B-200K-DARE-Ties-5.0bpw-h6-exl2, and I'm finding them fairly good with the right settings. I found the Yi 34B models almost unusable due to repetition issues until I tried the settings recommended in this discussion:

https://www.reddit.com/r/LocalLLaMA/comments/182iuj4/yi34b_models_repetition_issues/

I've found it much better since.
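For reference, here's roughly what that looks like as a minimal exllamav2 sketch. The sampler values below are illustrative, not the exact numbers from that thread, and the model_dir path is just wherever the quant lives locally:

```python
# Minimal exllamav2 generation sketch for a Yi-34B exl2 quant.
# Sampler values are illustrative; tune them per the linked discussion.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "models/LoneStriker_Nous-Capybara-34B-5.0bpw-h6-exl2"  # local path to the quant
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8               # keep this modest; heat seems to make the looping worse
settings.min_p = 0.05                    # prune the low-probability tail that feeds repetition
settings.top_p = 1.0                     # let min_p do the filtering
settings.token_repetition_penalty = 1.1  # mild; cranking this degrades Yi's output

print(generator.generate_simple("USER: Hi there. ASSISTANT:", settings, num_tokens=200))
```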

I tried out one of the neural models and found it couldn't keep track of details at all. I wonder if my settings weren't very good or something. I would have been using an EXL2 or GPTQ version, though.

[–] USM-Valor@alien.top 1 points 9 months ago

I've had the same experience with the Yi finetunes. I tried them on single-turn generations and they were very promising. However, when starting a conversation from scratch, I ran into a ton of repetition and looping. Some models need a very tight set of parameters to perform well, whereas others will function well under almost any sane set of guidelines. I'm thinking Yi leans more toward the former, which will have users believing these models are inferior to simpler but more flexible ones.

[–] TeamPupNSudz@alien.top 1 points 9 months ago

I found the Yi 34B models almost unusable due to repetition issues until I tried the settings recommended in this discussion:

I have the same issue with LoneStriker_Nous-Capybara-34B-5.0bpw-h6-exl2. Whole previous messages will often get shoved into the response. I basically gave up and went back to Mistral-OpenHermes.

[–] bacocololo@alien.top 1 points 9 months ago (1 children)

To stop the repetition, you could try adding a stop token to the model, such as '### Human'. It works well for me.
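If it helps, here's a minimal sketch of what that looks like with llama-cpp-python and a GGUF quant. In practice it's a stop string passed at generation time rather than a token baked into the model, and the marker has to match the prompt format you're actually using:

```python
# Minimal sketch: stop generation when the model tries to start a new user turn.
# Assumes an Alpaca-style prompt where turns are marked with "### Human".
from llama_cpp import Llama

llm = Llama(model_path="model-q4_K_M.gguf", n_ctx=4096)

out = llm(
    "### Human: Write a short greeting.\n### Assistant:",
    max_tokens=256,
    stop=["### Human"],  # cut the completion off before a fake user turn starts
)
print(out["choices"][0]["text"])
```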

[–] TeamPupNSudz@alien.top 1 points 9 months ago

Capybara doesn't use the Alpaca format, so that wouldn't do anything. Regardless, it's not that type of repetition. It's not speaking for the user, it's literally just copy/pasting part of the conversation into the answer.
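For anyone following along, here's the difference between the two formats. The Capybara format shown is the USER:/ASSISTANT: style from the model card; treat the exact strings as an assumption and check the card for your particular quant:

```python
# Alpaca-style prompt: "### " turn markers appear in the text, so a
# stop string like "### Human" has something to match against.
alpaca_prompt = "### Instruction:\nSummarize this article.\n\n### Response:\n"

# Nous-Capybara-style prompt (per the model card; verify for your version):
# plain USER:/ASSISTANT: turns, so "### Human" never occurs and the
# suggested stop string is inert.
capybara_prompt = "USER: Summarize this article. ASSISTANT:"
```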