this post was submitted on 23 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
I'm finding I prefer llama.cpp now as well (the last few days), though for work I usually use Oob + gptq.
If you have it handy, could you post the compile command you used to get llama.cpp built for the A770?
It's just the normal OpenCL and Vulkan compile flags: "make LLAMA_CLBLAST=1" for OpenCL and "make LLAMA_VULKAN=1" for Vulkan. For Vulkan you will have to download the Vulkan PR first. But as I said, it's painfully slow, slower than the CPU, so it's not worth it. Both backends are equally slow, so there seems to be something in common that is not A770 friendly. I haven't tried Vulkan in a couple of weeks, though, so that might be better now. I even tried the Intel-specific OpenCL option for buffers larger than 4GB, but that didn't make any difference at all.
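For anyone who wants to flip between the two builds, here is a rough sketch of the commands above wrapped in a small shell helper. The function name `build_cmd` and the backend labels are made up for illustration and are not part of llama.cpp; the only real pieces are the two make flags quoted above, which reflect the Makefile as of late 2023 and may have changed since.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): print the make command
# for a given GPU backend, using the flags quoted in this thread.
build_cmd() {
    case "$1" in
        opencl) echo "make LLAMA_CLBLAST=1" ;;  # OpenCL via CLBlast
        vulkan) echo "make LLAMA_VULKAN=1" ;;   # assumes the Vulkan PR is checked out
        cpu)    echo "make" ;;                  # plain CPU build for comparison
        *)      echo "unknown backend: $1" >&2; return 1 ;;
    esac
}

# Print the command for one backend; run the printed command from the
# llama.cpp source directory to actually build.
build_cmd opencl
```

It is probably a good idea to run `make clean` before switching backends so that objects from the previous build are not reused, and to build the `cpu` variant as a baseline when checking whether the GPU path is actually faster.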
Right. Kind of feels like Intel are leaving money on the table by not writing software for this lol
They did. That's why software that uses PyTorch, like FastChat and SD, works very well with Intel Arc. But llama.cpp doesn't use PyTorch.
Here's the base of their software: an API that they are pushing as a standard, since it also supports Nvidia and AMD.
https://www.oneapi.io/
Also, Intel has their own package of LLM software.
https://github.com/intel-analytics/BigDL