this post was submitted on 18 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


Looking for any model that can run with 20 GB VRAM. Thanks!

[–] drifter_VR@alien.top 1 points 10 months ago

A 34B model is the best fit for a 24GB GPU right now. Good speed and huge context window.
nous-capybara-34b is a good start
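
For anyone wanting to try this, here is a minimal sketch of loading a 4-bit GPTQ quant of that model with Hugging Face transformers. The TheBloke/Nous-Capybara-34B-GPTQ repo id and the VRAM arithmetic in the comments are assumptions, not from the thread, and you would need the auto-gptq/optimum backends installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Nous-Capybara-34B-GPTQ"  # assumed repo id, not named in the thread

# Rough budget: 34B params * ~0.5 bytes/param at 4-bit is ~17 GB of weights,
# leaving headroom on a 24 GB card for the KV cache and activations.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place the quantized layers on the GPU automatically
)

prompt = "USER: What can I run in 20 GB of VRAM?\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```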

[–] GoofAckYoorsElf@alien.top 1 points 10 months ago

I've been going with WizardLM-33B-V1.0-Uncensored-GPTQ for a while and it's okay. Is Nous-Capybara-34b better?

[–] TeamPupNSudz@alien.top 1 points 10 months ago

WizardLM is really old by now. Have you tried any of the Mistral finetunes? Don't discount them just because of the low parameter count. I was also running WizardLM-33b-4bit for the longest time, but Mistral-Hermes-2.5-7b-8bit is just so much more capable for what I need.
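
A minimal sketch of loading a Mistral 7B finetune in 8-bit via transformers and bitsandbytes. The teknium/OpenHermes-2.5-Mistral-7B repo id is an assumption; the comment's "Mistral-Hermes-2.5-7b-8bit" doesn't name an exact Hub repo, so swap in whichever finetune you actually use:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "teknium/OpenHermes-2.5-Mistral-7B"  # assumed repo id

# 8-bit weights for a 7B model come to roughly 7 GB, well within a 20 GB card.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)
```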

[–] GoofAckYoorsElf@alien.top 1 points 10 months ago

> Mistral-Hermes-2.5-7b-8bit

I've tried that one. It is... strange.
