I'm not sure what "installing llama" means here; there are several different ways of running Llama. But if the program you installed is supposed to use the GPU, it could be a CUDA issue.
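One quick way to rule out the GPU side is to check whether the NVIDIA driver is visible at all. A minimal sketch, assuming an NVIDIA card and that `nvidia-smi` ships with the driver (the helper name is my own, not from any particular tool):

```python
import shutil
import subprocess

def cuda_driver_present() -> bool:
    """Rough check: is the NVIDIA driver's nvidia-smi CLI on PATH and working?"""
    if shutil.which("nvidia-smi") is None:
        return False  # no driver CLI found at all
    try:
        # nvidia-smi exits non-zero if it can't talk to the driver
        subprocess.run(["nvidia-smi"], check=True, capture_output=True)
        return True
    except subprocess.CalledProcessError:
        return False

print(cuda_driver_present())
```

If this returns False, no application on the machine will be able to use CUDA, so fix the driver install before debugging the model software.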
this post was submitted on 22 Nov 2023
LocalLLaMA
If you have installed or use Oobabooga's text-generation-webui, download a model that has been quantized for NVIDIA GPUs. Those are the models with GPTQ or the newer AWQ suffixes.
On Hugging Face, the user "TheBloke" has aggregated dozens and dozens, maybe hundreds, of quantized models.
The YouTube channel Aitrepreneur has a couple of good videos on installing Oobabooga and running the GPU-quantized models.
What software are you using to run LLaMA and Stable Diffusion?
What version of the LLaMA model are you trying to run? How many parameters? What quantization?
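The parameter count and quantization matter because together they roughly determine how much VRAM the weights need: each weight takes `bits / 8` bytes, so a 7B model at 4 bits is about 3.5 GB of weights before any overhead. A hypothetical back-of-the-envelope helper (the 20% overhead factor for KV cache and activations is just an assumption, not a measured value):

```python
def approx_vram_gb(n_params_billion: float, bits_per_weight: int,
                   overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight bytes plus ~20% for cache/activations."""
    bytes_per_weight = bits_per_weight / 8
    return n_params_billion * bytes_per_weight * overhead

# 7B model quantized to 4 bits: 7 * 0.5 GB * 1.2 overhead
print(round(approx_vram_gb(7, 4), 2))  # → 4.2
```

So a 7B GPTQ/AWQ model fits comfortably on an 8 GB card, while an unquantized (16-bit) 7B model needs roughly 14 GB for the weights alone.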