GoofAckYoorsElf@alien.top 1 point 2 years ago

> that 64gb of RAM is cutting it pretty close

Holy crap...

GoofAckYoorsElf@alien.top 1 point 2 years ago

> Mistral-Hermes-2.5-7b-8bit

I've tried that one. It is... strange.

GoofAckYoorsElf@alien.top 1 point 2 years ago

> nous-capybara-34b

I haven't been able to use that with my 3090Ti yet. I tried TheBloke's GPTQ and GGUF (4-bit) versions. The first runs into memory issues; the second loads fine with llama.cpp (which it seems to be built for), but inference is excruciatingly slow (around 0.07 t/s).

I must admit that I am a complete noob regarding all the different variants and model loaders.
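For what it's worth, ~0.07 t/s with a 4-bit GGUF on a 3090 Ti usually means no layers were offloaded to the GPU: llama.cpp runs entirely on the CPU unless you pass `-ngl`/`--n-gpu-layers`, even in a CUDA-enabled build. A minimal sketch of the invocation (the model filename and the layer count are illustrative; tune `-ngl` to what fits in the card's 24 GB of VRAM):

```shell
# Offload transformer layers to the GPU with -ngl / --n-gpu-layers;
# without this flag llama.cpp does all inference on the CPU.
# Lower the -ngl value if you hit an out-of-memory error.
./main -m nous-capybara-34b.Q4_K_M.gguf -ngl 40 -c 4096 -p "Hello"
```

With most or all layers offloaded, a 34B 4-bit model on a 3090 Ti should reach usable speeds rather than fractions of a token per second.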

GoofAckYoorsElf@alien.top 1 point 2 years ago

I've been going with WizardLM-33B-V1.0-Uncensored-GPTQ for a while and it's okay. Is Nous-Capybara-34b better?

GoofAckYoorsElf@alien.top 1 point 2 years ago

Quite a lot of stuff that commercial/corporate models won't let me do, and that I wouldn't run through them even if they did. Private stuff. Yes, NSFW can of course be a part of it.

Furthermore, things where I think the commercial/corporate models are too expensive (no, I haven't checked my power bill yet...).