LocalLLaMA

11 readers

4 users here now

Community to discuss about Llama, the family of large language models created by Meta AI.

founded 2 years ago

MODERATORS

communick@poweruser.forum

What is considered the best uncensored LLM right now? (alien.top)

submitted 2 years ago by Hyddro26@alien.top to c/localllama@poweruser.forum

38 comments fedilink hide all child comments

Looking for any model that can run with 20 GB VRAM. Thanks!

you are viewing a single comment's thread
view the rest of the comments

[–] drifter_VR@alien.top 1 points 2 years ago (2 children)

A 34B model is the best fit for a 24GB GPU right now. Good speed and huge context window.
nous-capybara-34b is a good start

[–] GoofAckYoorsElf@alien.top 1 points 2 years ago (2 children)

I've been going with WizardLM-33B-V1.0-Uncensored-GPTQ for a while and it's okay. Is Nous-Capybara-34b better?

[–] TeamPupNSudz@alien.top 1 points 2 years ago (1 children)

WizardLM is really old by now. Have you tried any of the Mistral finetunes? Don't discount it just because of the low parameter count. I was also running WizardLM-33b-4bit for the longest time, but Mistral-Hermes-2.5-7b-8bit is just so much more capable for what I need.

[–] GoofAckYoorsElf@alien.top 1 points 2 years ago

Mistral-Hermes-2.5-7b-8bit

I've tried that one. It is... strange.

[–] drifter_VR@alien.top 1 points 2 years ago

Well yes, WizarldLM-33b is 5 months old, a lot of things happened since then.

[–] GoofAckYoorsElf@alien.top 1 points 2 years ago (1 children)

nous-capybara-34b

I haven't been able to use that with my 3090Ti yet. I tried TheBloke's GPTQ and GGUF (4bit) versions. The first runs into memory issues, the second, loaded with llama.cpp (which it seems to be configured on) loads, but is excruciatingly slow (like 0.07t/sec).

I must admit that I am a complete noob regarding all the different variants and model loaders.

[–] drifter_VR@alien.top 1 points 2 years ago

Koboldcpp is the easiest way.
Get nous-capybara-34b.Q4_K_M.gguf (it just fits into 24GB VRAM with 8K context).
Here are my Koboldcpp settings (not sure if they are optimal but they work)

https://preview.redd.it/dco0bokvic1c1.jpeg?width=540&format=pjpg&auto=webp&s=bf188ea61481a9464593db79d690b26eb7989883