LocalLLaMA

11 readers

4 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 2 years ago

MODERATORS

communick@poweruser.forum

dolphin-2.2-yi-34b released (alien.top)

submitted 2 years ago by Amgadoz@alien.top to c/localllama@poweruser.forum

34 comments fedilink hide all child comments

Eric Hartford, the author of dolphin models, released dolphin-2.2-yi-34b.

This is one of the earliest community finetunes of the yi-34B.

yi-34B was developed by a Chinese company and they claim sota performance that are on par with gpt-3.5

HF: https://huggingface.co/ehartford/dolphin-2_2-yi-34b

Announcement: https://x.com/erhartford/status/1723940171991663088?s=20

you are viewing a single comment's thread
view the rest of the comments

[–] a_beautiful_rhind@alien.top 1 points 2 years ago (2 children)

Will be interesting to compare it to spicyboros and 70b dolphin. Spicy already "fixed" yi for me. I think we finally got the middle model meta didn't release.

[–] Wooden-Potential2226@alien.top 1 points 2 years ago

Yea, the non-tuned (base) or lightly tuned Yi versions that I’ve recently have been in need of fixes…

[–] FullOf_Bad_Ideas@alien.top 1 points 2 years ago (1 children)

What prompt format do you use? I was trying to figure out it's inherent prompt format but it didn't go well. I reasoned that if I enter "<>" , it will reveal it's most likely system message, but it generates some bash-like code most of the time. It was trained for 1 epoch (should be about 80k samples) with constant 0.0001 learning rate but the prompt format isn't as burned-in as my qlora (2 epochs on 5k samples) with constant 0.00015 lr, I don't get why.

[–] a_beautiful_rhind@alien.top 1 points 2 years ago (1 children)

For spicy I use the same one as airoboros 3.1, which I think is llama 2 chat. Have alpaca set in the telegram bot and nothing bad happened.

On larger better models the prompt format isn't really that serious. If you see it giving you code or extra stuff, you try another one till it does what it's supposed to.

[–] FullOf_Bad_Ideas@alien.top 1 points 2 years ago

I am getting nice results in webui using exllama 2 loader and llama 2 prompt. Problem is that webui gives me 21 t/s while when using chat.py from exllama directly I get 28.5 t/s. The difference is too big to make me use webui. I tried matching sampler settings, bos, system prompt and repetition penalty but it still has issues there - it either mixes up the prompt, for example outputting <>, prints out a whole-ass comment section to a story, outputs 30 links to YT out of nowhere and generally still acts a bit like a base model. I can't really blame exllama v2, because my lora works more predictably. I also can't blame spicyboros, because it works great in webui. It looks the same with raw, llama and chatml prompt formats. It's not a big deal since it's still usable, but it bugs me a bit.