I confirm that 34B models don't appreciate the standard roleplay preset; they require the USER:/ASSISTANT: format.
...and that Nous-Capybara-34B-GGUF is excessively verbose for roleplay. It suffers from verbal diarrhea: the outputs become longer and longer over time, despite using Author's Notes, etc., to instruct it to be more concise.
drifter_VR
I can get two 3090s for €1200 here on the second-hand market
I mostly use 34b models now but I must admit those models are already a bit chaotic by nature haha
Thanks, I remember your tests, it's great you are still on it. So according to your tests, 34b models compete with GPT-3.5. I'm not too surprised. And Mistral-7b is not far behind, what a beast!
Will you benchmark 70b models too?
I use the settings given by OP with temp=1 and min-P=0.1
u/WolframRavenwolf
Yet another potential benchmark :)
Mirostat vs Min-P
Well, I tried the settings given by OP with temp=1.0, I'll try higher temps, thanks.
Nice, did you manage to tell the difference between Dolphin and Nous-Capybara? Both seem pretty close to me
Koboldcpp is the easiest way.
Get nous-capybara-34b.Q4_K_M.gguf (it just fits into 24GB VRAM with 8K context).
Here are my Koboldcpp settings (not sure if they are optimal but they work)
Just tried Min-P with the latest versions of SillyTavern and koboldcpp and... the outputs were pretty chaotic... not sure if Koboldcpp supports Min-P yet
SillyTavern has Min-P support, but I'm not sure if it works with all backends yet. In 1.10.9's changelog, Min-P was hidden behind a feature flag for KoboldCPP 1.48 or Horde.
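For anyone unsure what Min-P actually does: it keeps only the tokens whose probability is at least min_p times the top token's probability, then renormalizes what's left before sampling. A minimal sketch with toy numbers (not any backend's actual code):

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens whose probability is at least min_p * max(probs)."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]  # renormalize the survivors

# Toy next-token distribution: one dominant token plus a tail.
probs = [0.5, 0.2, 0.15, 0.1, 0.04, 0.01]
filtered = min_p_filter(probs, min_p=0.1)
# Anything below 0.1 * 0.5 = 0.05 is cut before sampling.
```

The nice property is that the cutoff scales with the model's confidence: when the top token dominates, the tail is pruned hard; when the distribution is flat, more candidates survive, which is why it pairs well with higher temperatures.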
Looks great. Your method would also have the advantage of not hurting the syntax: how many models forget the last * or " because of RepPen?