this post was submitted on 23 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


Amazon has the Acer A770 on sale for $250. That's a lot of compute with 16GB of VRAM for $250. There is no better value. It does have its challenges. Some things, like MLC Chat, run with no fuss just like on any other card. Other things need some effort, like Oob, FastChat and BigDL. But support for it is getting better and better every day. At this price, I'm tempted to get another. I have seen some reports of people running multi-GPU setups with the A770.

It also comes with Assassin's Creed Mirage for those people who still use their GPUs to game.

https://www.amazon.com/dp/B0BHKNK84Y

[–] CasimirsBlake@alien.top 1 points 11 months ago (1 children)

But how easily does it work with ooba nowadays? How about running two?

[–] fallingdowndizzyvr@alien.top 1 points 11 months ago (1 children)

Intel GPUs are an option in the 1-click installer. So, ideally, it's the same as installing an Nvidia or AMD GPU. Ideally. We aren't there yet. You can monitor the issue here. But the fact that they have a pinned Intel discussion to go with the pinned Mac and AMD discussions speaks, I think, to their commitment.

https://github.com/oobabooga/text-generation-webui/issues/1575
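
For reference, the one-click flow is roughly this (the installer's prompt wording changes between releases, so treat it as a sketch rather than the exact steps):

    git clone https://github.com/oobabooga/text-generation-webui
    cd text-generation-webui
    ./start_linux.sh   # the script asks which GPU vendor to set up; pick the Intel Arc (IPEX) option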

FastChat is supposed to support multiple Arcs. But since I only have one, I can't confirm that.

"The most notable options are to adjust the max gpu memory (for A750 --max-gpu-memory 7Gib) and the number of GPUs (for multiple GPUs --num-gpus 2). "

https://github.com/itlackey/ipex-arc-fastchat/blob/main/README.md
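
I haven't run that config myself, but going by the FastChat CLI and the options quoted above, a two-Arc launch would look something like this (the model path and the 14Gib figure for a 16GB A770 are just placeholders):

    python -m fastchat.serve.cli \
        --model-path lmsys/vicuna-7b-v1.5 \
        --device xpu \
        --num-gpus 2 \
        --max-gpu-memory 14Gib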

[–] CheatCodesOfLife@alien.top 1 points 11 months ago (1 children)

Have you tried it though? I've been trying for a few months / updates and it doesn't work.

[–] fallingdowndizzyvr@alien.top 1 points 11 months ago (1 children)

With Oob, I've tried, but haven't been successful. Then again, I wasn't successful getting it to work with my 2070 months ago either. I gave up on it, switched to llama.cpp and didn't look back. Until now. So yes, I have tried getting it to work with the A770, but as pointed out in the discussion, there's that issue. I haven't tried the workaround posted a couple of days ago, though.

[–] CheatCodesOfLife@alien.top 1 points 11 months ago (1 children)

I'm finding I prefer llama.cpp now as well (the last few days), though for work I usually use Oob + gptq.

If you have it handy, could you post the compile command you used to get llama.cpp built for the A770?

[–] fallingdowndizzyvr@alien.top 1 points 11 months ago (1 children)

It's just the normal OpenCL and Vulkan compile flags, so "make LLAMA_CLBLAST=1" and "make LLAMA_VULKAN=1". You will have to check out the Vulkan PR for the Vulkan build. But as I said, it's painfully slow. Like slower than the CPU. So not worth it. Both are equally slow, so there seems to be something in common that is not A770 friendly. Although I haven't tried Vulkan in a couple of weeks, so that might be better now. I even tried giving it the Intel-specific OpenCL option to allocate more than 4GB, but that didn't make any difference at all.
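
Spelled out (the model path and layer count in the run command are just examples):

    # OpenCL build, needs CLBlast installed
    make LLAMA_CLBLAST=1
    # Vulkan build, after checking out the Vulkan PR branch
    make LLAMA_VULKAN=1
    # then offload layers to the GPU at run time, e.g.
    ./main -m models/model.gguf -ngl 35 -p "Hello"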

[–] CheatCodesOfLife@alien.top 1 points 11 months ago (1 children)

Right. Kind of feels like Intel is leaving money on the table by not writing software for this lol

[–] fallingdowndizzyvr@alien.top 1 points 11 months ago

They did. That's why software that uses PyTorch, like FastChat and SD, works very well with Intel Arc. But llama.cpp doesn't use PyTorch.

Here's the base of their software. It's an API that they are pushing as a standard, since it supports Nvidia and AMD as well.

https://www.oneapi.io/
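
A quick way to sanity-check that side of the stack (package names are from Intel's docs; the xpu wheels usually need Intel's extra wheel index, which I'm leaving out here):

    pip install torch intel-extension-for-pytorch
    python -c "import torch, intel_extension_for_pytorch; print(torch.xpu.is_available())"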

Also, Intel has their own package of LLM software.

https://github.com/intel-analytics/BigDL