this post was submitted on 18 Nov 2023

LocalLLaMA


Looking for any model that can run with 20 GB VRAM. Thanks!

drifter_VR@alien.top, 11 months ago

KoboldCpp is the easiest way.
Get nous-capybara-34b.Q4_K_M.gguf (it just fits into 24 GB of VRAM with 8K context).
Here are my KoboldCpp settings (not sure if they're optimal, but they work):

https://preview.redd.it/dco0bokvic1c1.jpeg?width=540&format=pjpg&auto=webp&s=bf188ea61481a9464593db79d690b26eb7989883
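
For anyone wanting to sanity-check the "just fits into 24 GB" claim, here is a rough back-of-the-envelope sketch in Python. Every number in it is an assumption on my part, not something from the settings above: about 34.4B parameters, roughly 4.85 bits per weight as an average for Q4_K_M, a Yi-34B-style layout (60 layers, 8 KV heads, head dimension 128), and an fp16 KV cache. It also ignores compute buffers and whatever else is already using the GPU, so treat the result as a lower bound.

```python
# Rough VRAM estimate for a GGUF model fully offloaded to the GPU.
# All constants passed in below are assumptions, not measured values.

GIB = 1024 ** 3


def weights_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2) -> int:
    """KV cache size: one K and one V tensor per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def estimate_vram_gib(n_params: float, bits_per_weight: float,
                      n_layers: int, n_kv_heads: int, head_dim: int,
                      context_len: int) -> float:
    """Weights plus KV cache, in GiB (no compute buffers or overhead)."""
    total = weights_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len
    )
    return total / GIB


if __name__ == "__main__":
    # Assumed figures for a Yi-34B-based model at Q4_K_M with 8K context.
    estimate = estimate_vram_gib(
        n_params=34.4e9,       # assumed parameter count
        bits_per_weight=4.85,  # assumed average for Q4_K_M
        n_layers=60,           # assumed architecture
        n_kv_heads=8,          # assumed (grouped-query attention)
        head_dim=128,          # assumed
        context_len=8192,      # the 8K context mentioned above
    )
    print(f"Estimated weights + KV cache: {estimate:.1f} GiB")
```

With these assumed numbers the total lands above 20 GiB and below 24 GiB before any runtime overhead, which matches the "just fits" description for a 24 GB card. On a 20 GB card you would probably need a smaller quant, a shorter context, or to leave some layers on the CPU.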