LocalLLaMA

14 readers

1 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 2 years ago

MODERATORS

communick@poweruser.forum

dolphin-2.2-yi-34b released (alien.top)

submitted 2 years ago by Amgadoz@alien.top to c/localllama@poweruser.forum

34 comments fedilink hide all child comments

Eric Hartford, the author of dolphin models, released dolphin-2.2-yi-34b.

This is one of the earliest community finetunes of the yi-34B.

yi-34B was developed by a Chinese company and they claim sota performance that are on par with gpt-3.5

HF: https://huggingface.co/ehartford/dolphin-2_2-yi-34b

Announcement: https://x.com/erhartford/status/1723940171991663088?s=20

you are viewing a single comment's thread
view the rest of the comments

[–] FullOf_Bad_Ideas@alien.top 1 points 2 years ago

I am getting nice results in webui using exllama 2 loader and llama 2 prompt. Problem is that webui gives me 21 t/s while when using chat.py from exllama directly I get 28.5 t/s. The difference is too big to make me use webui. I tried matching sampler settings, bos, system prompt and repetition penalty but it still has issues there - it either mixes up the prompt, for example outputting <>, prints out a whole-ass comment section to a story, outputs 30 links to YT out of nowhere and generally still acts a bit like a base model. I can't really blame exllama v2, because my lora works more predictably. I also can't blame spicyboros, because it works great in webui. It looks the same with raw, llama and chatml prompt formats. It's not a big deal since it's still usable, but it bugs me a bit.