this post was submitted on 13 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

founded 10 months ago

Eric Hartford, the author of dolphin models, released dolphin-2.2-yi-34b.

This is one of the earliest community finetunes of Yi-34B.

Yi-34B was developed by a Chinese company, which claims SOTA performance on par with GPT-3.5.

HF: https://huggingface.co/ehartford/dolphin-2_2-yi-34b

Announcement: https://x.com/erhartford/status/1723940171991663088?s=20

top 34 comments
[–] tortistic_turtle@alien.top 1 points 10 months ago (1 children)

Interesting. Has anybody tested it? Is it good?

[–] a_beautiful_rhind@alien.top 1 points 10 months ago (2 children)

Will be interesting to compare it to spicyboros and 70b dolphin. Spicy already "fixed" yi for me. I think we finally got the middle model meta didn't release.

[–] FullOf_Bad_Ideas@alien.top 1 points 10 months ago (1 children)

What prompt format do you use? I was trying to figure out its inherent prompt format, but it didn't go well. I reasoned that if I enter "<>", it would reveal its most likely system message, but it generates some bash-like code most of the time. It was trained for 1 epoch (should be about 80k samples) with a constant 0.0001 learning rate, yet the prompt format isn't as burned-in as my QLoRA (2 epochs on 5k samples) with a constant 0.00015 LR. I don't get why.

[–] a_beautiful_rhind@alien.top 1 points 10 months ago (1 children)

For spicy I use the same one as airoboros 3.1, which I think is Llama 2 chat. I have Alpaca set in the Telegram bot and nothing bad has happened.

On larger, better models the prompt format isn't really that serious. If you see it giving you code or extra stuff, try another one till it does what it's supposed to.
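For reference, the two formats mentioned above look roughly like this (a minimal sketch in Python; the system/instruction strings here are placeholders, not anyone's actual prompts):

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    """Llama 2 chat format: system message wrapped in <<SYS>> tags inside [INST]."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

def alpaca_prompt(instruction: str) -> str:
    """Alpaca format: instruction/response section headers."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )
```

Most frontends build these templates for you, but knowing what the model was trained on helps when a finetune starts spitting out code or tags it shouldn't.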

[–] FullOf_Bad_Ideas@alien.top 1 points 10 months ago

I am getting nice results in webui using the ExLlamaV2 loader and the Llama 2 prompt. The problem is that webui gives me 21 t/s, while using chat.py from exllama directly I get 28.5 t/s. The difference is too big for me to keep using webui. I tried matching sampler settings, BOS, system prompt and repetition penalty, but it still has issues there: it either mixes up the prompt (for example outputting <>), prints out a whole-ass comment section to a story, outputs 30 YouTube links out of nowhere, and generally still acts a bit like a base model. I can't really blame ExLlamaV2, because my LoRA works more predictably. I also can't blame spicyboros, because it works great in webui. It looks the same with raw, llama and ChatML prompt formats. It's not a big deal since it's still usable, but it bugs me a bit.

[–] Wooden-Potential2226@alien.top 1 points 10 months ago

Yeah, the non-tuned (base) or lightly tuned Yi versions that I've tried recently have been in need of fixes…

[–] ambient_temp_xeno@alien.top 1 points 10 months ago

I've been trying out the ggufs I found today and it seems close enough to dolphin 70b at half the size.

It pointed out that the 'each brother' part of the Sally test could be taken to imply that they're different sisters for each brother, and when you change the question to say 'the brothers share the same 2 sisters', it gets it right. Which is whatever, but it was interesting that it picked up that the test is ambiguous.

[–] WolframRavenwolf@alien.top 1 points 10 months ago (5 children)

I took a short break from my 70B tests (still working on that!) and tried TheBloke/dolphin-2_2-yi-34b-GGUF Q4_0. It instantly claimed 4th place on my list.

A 34B taking 4th place among the 13 best 70Bs! A 34B model that beats 9 70Bs (including dolphin-2.2-70B, Samantha-1.11-70B, StellarBright, Airoboros-L2-70B-3.1.2 and many others). A 34B with 16K native context!

Yeah, I'm just a little excited. I see a lot of potential with the Yi series of models and proper finetunes like Eric's.

Haven't done the RP tests yet, so back to testing. Will report back once I'm done with the current batch (70Bs take so damn long, and 120B even more so).

[–] denru01@alien.top 1 points 10 months ago (1 children)

Which is the best 70B on your list?

[–] WolframRavenwolf@alien.top 1 points 10 months ago (1 children)

I'm still working on the updated 70B comparisons/tests, but right now, the top three models are still the same as in the first part of my Huge LLM Comparison/Test: 39 models tested (7B-70B + ChatGPT/GPT-4): lzlv_70B, SynthIA-70B-v1.5, chronos007-70B. Followed by dolphin-2_2-yi-34b.

[–] Healthy_Cry_4861@alien.top 1 points 10 months ago (1 children)

SynthIA-70B-v1.5 seems to have the same context length of 2k as SynthIA-70B-v1.2, not the same 4k context length as SynthIA-70B-v1.2b.

[–] WolframRavenwolf@alien.top 1 points 10 months ago

You're right about that observation: when I load the GGUF, KoboldCpp says "n_ctx_train: 2048". Could that be an erroneous display? Because I've always used v1.5 with 4K context, did all my tests with that, and it's done so well. If it's true, it might even be better with native context! Still, 2K just doesn't cut it anymore.

[–] hoteluniformgolfs@alien.top 1 points 10 months ago

What kind of prompt formats are you using for it? I'm downloading it now.

[–] satireplusplus@alien.top 1 points 10 months ago (1 children)

How good are the Yi models with coding?

[–] KOTNcrow@alien.top 1 points 10 months ago

My first test with Yi delivered a non-perfect but working Tetris clone in ~3 prompts. I was very impressed; can't wait to try the Dolphin variant.

[–] iChrist@alien.top 1 points 10 months ago

Wow, I gotta try it, thanks for the hype! Do the GPTQ/AWQ versions differ from GGUF in terms of context? It's listed that the context is only 4096.

[–] TobyWonKenobi@alien.top 1 points 10 months ago

Agreed - This is the best conversational model I have tried yet.

34B is the largest model size that I prefer running on my GPU, and this along with Nous-Capybara are fantastic.

[–] DimaDimon228@alien.top 1 points 10 months ago (1 children)

How do I run it on my own computer? With a web interface, for example.

[–] WolframRavenwolf@alien.top 1 points 10 months ago (1 children)

Easiest way on Windows: koboldcpp with TheBloke/dolphin-2_2-yi-34b-GGUF. Download both, then drag and drop the GGUF on top of koboldcpp.exe.
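If you'd rather launch from a terminal instead of drag-and-drop, something like this works (flag names taken from koboldcpp's CLI; check `--help` against your build, since they change between releases, and the file names here are just example paths):

```shell
# Assumes koboldcpp.exe and the GGUF are already downloaded into the current folder.
koboldcpp.exe --model dolphin-2_2-yi-34b.Q4_0.gguf --contextsize 16384 --usecublas
```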

[–] YearZero@alien.top 1 points 10 months ago (1 children)

> I've been trying out the ggufs I found today and it seems close enough to dolphin 70b at half the size.

I'm getting broken replies in koboldcpp, although it runs perfectly in llama.cpp for me. Not sure why; koboldcpp is my go-to.

[–] llama_in_sunglasses@alien.top 1 points 10 months ago

You need to rebuild or download a newer koboldcpp.

[–] 1dayHappy_1daySad@alien.top 1 points 10 months ago

I've played with it for a bit and I agree with most people here. It seems to be as smart as a 70b, which is a big deal IMO.

[–] Sabin_Stargem@alien.top 1 points 10 months ago (1 children)

Having tried out Yi-34b-200k with Nous Capybara, I think the Yi-34b-16k Dolphin v2.2 has a better flavor to it. Nous also wants more rep penalty; I am guessing the 200k foundation is causing that. 1.1 is what I used to get a better response. Haven't tried a higher penalty yet.
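For anyone wondering what that 1.1 actually does: repetition penalty, as commonly implemented in local backends (following the CTRL paper), rescales the logits of tokens that already appeared. A minimal sketch with made-up logits (treat it as an illustration, not any backend's exact code):

```python
def apply_rep_penalty(logits, seen_tokens, penalty=1.1):
    """Penalize already-seen tokens: divide positive logits by the penalty,
    multiply negative ones, so seen tokens always become less likely."""
    out = list(logits)
    for t in set(seen_tokens):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = [2.0, -1.0, 0.5]
penalized = apply_rep_penalty(logits, seen_tokens=[0, 1], penalty=1.1)
# token 0: 2.0 -> ~1.818, token 1: -1.0 -> -1.1, token 2 untouched
```

This is also why a higher penalty can degrade output: it punishes legitimately repeated tokens (articles, punctuation, names) just as hard as real loops.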

[–] mcmoose1900@alien.top 1 points 10 months ago

The Yi 200K base model was really finicky about sampling. Mirostat was a disaster, and so were some other presets, but it liked TFS.
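For anyone unfamiliar, TFS (tail-free sampling) cuts off the flat "tail" of the token distribution by looking at the curvature of the sorted probabilities. A rough sketch of the cutoff logic (this mirrors the usual description of the algorithm, but treat it as an illustration, not any backend's exact code):

```python
import math

def tail_free_keep(logits, z=0.95):
    """Return the set of token indices TFS would keep: sort by probability,
    take the absolute second derivative of the sorted probs, normalize it,
    and cut where its cumulative sum first exceeds z."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    exps = [math.exp(logits[i]) for i in order]
    total = sum(exps)
    probs = [e / total for e in exps]
    # discrete first and second derivatives of the sorted probabilities
    d1 = [probs[i] - probs[i + 1] for i in range(len(probs) - 1)]
    d2 = [abs(d1[i] - d1[i + 1]) for i in range(len(d1) - 1)]
    norm = sum(d2)
    if norm == 0:  # flat distribution: keep everything
        return set(order)
    weights = [d / norm for d in d2]
    keep = len(order)
    cum = 0.0
    for i, w in enumerate(weights):
        cum += w
        if cum > z:
            keep = max(i + 1, 1)  # always keep at least the top token
            break
    return set(order[:keep])
```

The intuition is that where the sorted probability curve bends sharply, the "real" candidates end and the noise tail begins; lowering z trims more of the tail.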

[–] OrdinaryAdditional91@alien.top 1 points 10 months ago (1 children)
[–] ambient_temp_xeno@alien.top 1 points 10 months ago (1 children)

It seems it's in the 'base' model. Since I came up with the '(a girl)' part, I've had something immortalized in a Chinese AI model, I suppose.

Without the prompt dolphin uses, I get this (using rep pen 1.1, not sure why that affects it):

Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?

The answer is Sally has 1 sister.

Here's the reasoning:

  1. Sally has 3 brothers.
  2. Each of her brothers has 2 sisters.
  3. So, each of her brothers has the same 2 sisters.
  4. Therefore, there are only 2 sisters in total (not 4).
  5. Since Sally is one of those sisters, she has just 1 sister left.

That's it! The answer is that Sally has 1 sister.
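The arithmetic behind that answer, under the disambiguated reading (all brothers share the same sisters), is just:

```python
# Sally puzzle with the ambiguity removed: the 3 brothers share the same 2 sisters.
brothers = 3
sisters_per_brother = 2                # girls in the family, shared by every brother
girls_in_family = sisters_per_brother  # NOT brothers * 2 -- the sisters are shared
sallys_sisters = girls_in_family - 1   # Sally doesn't count herself
print(sallys_sisters)  # -> 1
```

The trap is multiplying 3 brothers by 2 sisters each; the whole point of the test is noticing the sisters are the same two girls.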

[–] OrdinaryAdditional91@alien.top 1 points 10 months ago (1 children)
[–] ambient_temp_xeno@alien.top 1 points 10 months ago

I had nous capy34 get it right a couple of days ago with these settings

https://www.reddit.com/r/LocalLLaMA/comments/17uskx7/comment/k9c1wcc/?utm_source=share&utm_medium=web2x&context=3

Which is interesting. I want to test this new yi chat because apparently it can do decent ASCII?! Need a gguf though.

[–] Perimeter666@alien.top 1 points 10 months ago (1 children)

16k context is awesome. Now we need Goliath 120b with 16k context and I'm done with OpenAI.

[–] Amgadoz@alien.top 1 points 10 months ago

Is Goliath that good? Is it so much better than all of the Llama2-70B tunes that it's worth the hardware investment needed to run it?

[–] ViennaFox@alien.top 1 points 10 months ago

Anyone have a good SillyTavern preset for this model? I haven't been able to nail one down.

[–] drifter_VR@alien.top 1 points 10 months ago

Is it at last the missing link for us 24GB GPU owners?

[–] Slimxshadyx@alien.top 1 points 10 months ago

How does this compare with dolphin 2.2 mistral 7b?