lemon07r

What are everyone's experiences so far with DPO-trained versions of their favorite models? I've been messing around with different models, and my two new favorites are actually just the DPO versions of my previous favorites (CausalLM 14B and OpenHermes 2.5 7B). Links below for the models in question.

CausalLM 14B-DPO-alpha - GGUF: https://huggingface.co/tastypear/CausalLM-14B-DPO-alpha-GGUF

NeuralHermes 2.5 Mistral 7B - GGUF: https://huggingface.co/TheBloke/NeuralHermes-2.5-Mistral-7B-GGUF

The former runs at 30 t/s for me with koboldcpp-rocm on a 6900 XT, and the latter at 15 t/s, both at Q6_K. I don't have a favorite between these two; they seem to be better at different things and trade blows in all the logic and creative-writing tasks I've tested them on, despite CausalLM being the larger model. I'm looking forward to seeing what NousResearch/Teknium and CausalLM bring next.

[–] lemon07r@alien.top 1 points 10 months ago

That's fair. I haven't tried Qwen, but CausalLM has been decent for me. It would be nice if we had better models for 16 GB of VRAM, something above 7B. Those 34B models look nice, but I'd have to go down to Q2/Q3 to fit them, and that's pretty much unusable (rough math in the sketch below).
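For anyone curious about the arithmetic behind that, here's a rough sketch of why a 34B model only fits in 16 GB at very low quant levels. The bits-per-weight figures are approximate values for common GGUF quant types, and the flat 2 GB allowance for context/KV cache and runtime overhead is my own guess, so treat the numbers as ballpark only:

```python
# Back-of-envelope VRAM estimate for a fully offloaded GGUF model.
# Bits-per-weight values are approximations; overhead_gb is a guessed
# flat allowance for KV cache and runtime buffers, not a measured figure.

BITS_PER_WEIGHT = {
    "Q2_K": 2.6,
    "Q3_K_S": 3.5,
    "Q4_K_M": 4.8,
    "Q6_K": 6.6,
}

def est_vram_gb(params_b: float, quant: str, overhead_gb: float = 2.0) -> float:
    """Estimate VRAM (GiB) to fully offload a model with params_b billion weights."""
    weights_gb = params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1024**3
    return weights_gb + overhead_gb

for quant in BITS_PER_WEIGHT:
    total = est_vram_gb(34, quant)
    verdict = "fits" if total <= 16 else "does not fit"
    print(f"34B @ {quant}: ~{total:.1f} GB ({verdict} in 16 GB)")
```

With these assumptions, only the Q2/low-Q3 quants of a 34B model come in under 16 GB, while Q4 and above spill past the card and have to be partially offloaded to system RAM.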

[–] lemon07r@alien.top 0 points 10 months ago (2 children)

Did you ever end up trying any 14B models, or were Qwen/CausalLM just no good in your initial testing?