I have tried everything at this point; I think I am either doing something wrong or I have discovered a very strange bug. I was considering posting on their GitHub, but I am not sure I am not simply making a very stupid error.
In a fresh conda environment set up with Python 3.12, I set

```
export LLAMA_CUBLAS=1
```

and then ran:

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```

It runs without complaint and produces a working llama-cpp-python install, but without CUDA support. I know that CUDA works in WSL because nvidia-smi reports CUDA version 12.
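As far as I understand, the "CUDA Version" that nvidia-smi prints only reflects what the driver supports, not whether the CUDA toolkit itself is installed inside WSL; the build needs nvcc. A quick check along these lines (my own sketch, not something I have confirmed fixes it):

```
# If this prints nothing, the CUDA toolkit is missing inside WSL and the
# build will silently fall back to a CPU-only configuration.
which nvcc
nvcc --version
```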
I have tried setting up multiple environments, and I have tried removing and reinstalling; I also tried backends other than CUDA, which don't work either, so something seems to be off with the backend part, but I don't know what. My best guess is that I am doing something very basic wrong, like not setting the environment variable correctly.
When I reinstalled, I used this option:

```
pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
```
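As far as I can tell, if CMAKE_ARGS is not set on that same invocation, pip just rebuilds without CUDA. So presumably the reinstall needs to carry the flag too, something like this (FORCE_CMAKE=1 is llama-cpp-python's documented switch for forcing a from-source build; I'm not certain this is the missing piece):

```
# Pass the CMake flag and force a source build in the same command, so the
# reinstall cannot fall back to a cached CPU-only wheel.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
  pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
```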
Also, it simply does not create the llama_cpp_cuda folder, so "llama-cpp-python not using NVIDIA GPU CUDA - Stack Overflow" does not seem to be the problem.
Hardware:
- Ryzen 5800H
- RTX 3060
- 16 GB of DDR4 RAM
- WSL2 Ubuntu
To test it, I run the following code and watch the GPU memory usage, which stays at about 0:
```
from llama_cpp import Llama

# n_gpu_layers=20 should offload 20 layers to the GPU on a CUDA build.
llm = Llama(model_path="/mnt/d/Maschine learning/llm models/llama_2_7b/llama27bchat.Q4_K_M.gguf",
            n_gpu_layers=20, n_threads=6, n_ctx=3584, n_batch=521, verbose=True)
output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
```
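With verbose=True, the load log should also include a system-info line; on a CUDA build it contains BLAS = 1. A quick way to check without loading a model (assuming the installed version exposes the low-level llama_print_system_info binding, which recent releases do):

```
# Prints llama.cpp's system-info string; "BLAS = 1" means the wheel was
# compiled with cuBLAS/GPU support, "BLAS = 0" means CPU-only.
python -c 'import llama_cpp; print(llama_cpp.llama_print_system_info().decode())'
```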
So any help or ideas about what could be going on here would be greatly appreciated, because I am out of ideas. Thank you very much :)
I think you don't have CUDA properly set up. Use

```
pip install --verbose
```

to see the compilation messages when it tries to build llama.cpp with CUDA. You might need to set the CUDA_HOME environment variable manually.
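To make that concrete, something along these lines (a sketch; /usr/local/cuda is just the default toolkit location on WSL Ubuntu, so adjust the path to wherever yours is installed):

```
# Point the build at the CUDA toolkit explicitly, then rebuild from source
# with verbose output so CMake's CUDA-detection messages are visible.
export CUDA_HOME=/usr/local/cuda
export PATH="$CUDA_HOME/bin:$PATH"
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
  pip install llama-cpp-python --force-reinstall --no-cache-dir --verbose
```

In the build output, look for CMake reporting that it found the CUDA toolkit (older llama.cpp versions print a line like "cuBLAS found"; the exact wording varies by version); if it instead warns that CUDA was not found, the build has fallen back to CPU.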