this post was submitted on 18 Nov 2023

LocalLLaMA

Community for discussing Llama, the family of large language models created by Meta AI.

Looking for any model that can run with 20 GB VRAM. Thanks!

davidmezzetti@alien.top · 10 months ago

I haven't found one that is universally best, regardless of the benchmarks. It's the same story as with vector embedding models: you'll need to test a few against your own use case.
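
One low-effort way to do that comparison is to run the same prompts through each candidate and inspect the outputs side by side. Here's a minimal sketch using Hugging Face transformers; the candidate list and prompts are placeholders, not recommendations:

```python
# Sketch: run the same prompts through a few candidate models and compare outputs.
# Model names and prompts below are placeholders; substitute your own.
import gc

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

candidates = [
    "Open-Orca/Mistral-7B-OpenOrca",
    "mistralai/Mistral-7B-Instruct-v0.1",  # placeholder second candidate
]
prompts = ["Summarize the following paragraph: ..."]  # your own test cases

for name in candidates:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float16, device_map="auto"
    )
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        output = model.generate(**inputs, max_new_tokens=128)
        print(f"--- {name} ---")
        print(tokenizer.decode(output[0], skip_special_tokens=True))
    # Release the current model before loading the next one so VRAM isn't exhausted.
    del model
    gc.collect()
    torch.cuda.empty_cache()
```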

The best one I've found for my projects, though, is Open-Orca/Mistral-7B-OpenOrca (https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca) along with its AWQ quantization (https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-AWQ).
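
For reference, a minimal sketch of loading the AWQ build with transformers. This assumes a transformers version with built-in AWQ support plus the autoawq package installed, and that the tokenizer ships a ChatML chat template (the model card says OpenOrca was trained on the ChatML format):

```python
# Sketch: load the AWQ build of Mistral-7B-OpenOrca and run one generation.
# Assumes transformers with AWQ support and the autoawq package are installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mistral-7B-OpenOrca-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# OpenOrca uses the ChatML prompt format; if the tokenizer ships a chat
# template, apply_chat_template produces the correctly formatted prompt.
messages = [{"role": "user", "content": "Which 7B models fit in 20 GB of VRAM?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The AWQ quantization is what makes this comfortable on a 20 GB card: the 4-bit weights take roughly 4 GB, leaving plenty of headroom for activations and KV cache, whereas the fp16 weights alone need about 14 GB.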