I confirm that 34B models don't appreciate the standard roleplay preset; they require the USER:/ASSISTANT: format.
...and that Nous-Capybara-34B-GGUF is excessively verbose for roleplay. It suffers from verbal diarrhea: the outputs become longer and longer over time, despite using Author's Notes, etc., to instruct it to be more concise.
drifter_VR
I can get two 3090s for €1200 here on the second-hand market
I mostly use 34b models now but I must admit those models are already a bit chaotic by nature haha
Thanks, I remember your tests, it's great you are still on it. So according to your tests, 34b models compete with GPT-3.5. I'm not too surprised. And Mistral-7b is not far behind, what a beast!
Will you benchmark 70b models too?
I use the settings given by OP with temp=1 and min-P=0.1
u/WolframRavenwolf
Yet another potential benchmark :)
Mirostat vs Min-P
Well, I tried the settings given by OP with temp=1.0, I'll try higher temps, thanks.
Nice, did you manage to tell the difference between Dolphin and Nous-Capybara? Both seem pretty close to me
Koboldcpp is the easiest way.
Get nous-capybara-34b.Q4_K_M.gguf (it just fits into 24GB VRAM with 8K context).
Here are my Koboldcpp settings (not sure if they are optimal but they work)
Just tried Min-P with the latest versions of SillyTavern and koboldcpp and... the outputs were pretty chaotic... not sure if Koboldcpp supports Min-P yet
SillyTavern has Min-P support, but I'm not sure if it works with all backends yet. In 1.10.9's changelog, Min-P was hidden behind a feature flag for KoboldCPP 1.48 or Horde.
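For anyone unsure what Min-P actually does: it keeps only the tokens whose probability is at least min_p times the top token's probability, then renormalizes what's left before sampling. A minimal sketch with toy numbers (not any backend's actual code):

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens whose probability is at least min_p * max(probs)."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]  # renormalize the survivors

# Toy next-token distribution: one dominant token plus a tail.
probs = [0.5, 0.2, 0.15, 0.1, 0.04, 0.01]
filtered = min_p_filter(probs, min_p=0.1)
# Anything below 0.1 * 0.5 = 0.05 is cut before sampling.
```

The nice property is that the cutoff scales with the model's confidence: when the top token dominates, the tail is pruned hard; when the distribution is flat, more candidates survive, which is why it pairs well with higher temperatures.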
Looks great. Your method would also have the advantage of not hurting the syntax: how many models forget the last * or " because of RepPen?