This post was submitted on 21 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


I am looking to get an MI60 for LLMs and other high-compute tasks, as some are going for $350 on eBay. With its 32 GB of VRAM it looks like a really good deal for my applications, but I was wondering what others have experienced with it for LLMs. How is compatibility with OpenCL and ROCm? I mainly use Windows, so I'm wondering whether I can still get most of its speed there, and what kind of speeds people are seeing with various models.

Thank you!
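
(For anyone landing here with the same question: a quick way to confirm ROCm actually sees the card is a minimal PyTorch check. This is just a sketch and assumes a ROCm build of PyTorch on Linux; ROCm builds expose AMD GPUs through the torch.cuda API.)

```python
# Minimal sketch: verify a ROCm build of PyTorch can see the MI60.
# Assumes Linux with ROCm installed and a ROCm wheel of PyTorch.
import torch

print(torch.cuda.is_available())           # True if ROCm sees a GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # should report the gfx906 (MI60) device
    x = torch.randn(4096, 4096, device="cuda")
    y = x @ x                              # quick matmul smoke test on HBM2
    torch.cuda.synchronize()
    print("matmul OK:", y.shape)
```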

top 4 comments
[–] Super-Strategy893@alien.top 1 points 10 months ago (1 children)

I use an MI50 for AI tasks, including testing LLMs. Performance is about 34 tokens/s on 13B models; with 33B models the speed drops to around 8 tokens/s. Note that the MI50 only has 16GB of VRAM.

ROCm compatibility has improved a lot this year, and OpenCL support is very good. Even OpenMP's offload support is solid; I'm using it in some personal projects, and the HBM2 memory gives a good boost in certain compute-intensive tasks.
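
(A minimal sketch of checking OpenCL visibility, assuming the pyopencl package is installed; platform and device names will vary by driver stack:)

```python
# Minimal sketch: enumerate OpenCL platforms/devices to confirm the
# card is visible. Assumes `pip install pyopencl` and a working OpenCL ICD.
import pyopencl as cl

for platform in cl.get_platforms():
    print("Platform:", platform.name)
    for device in platform.get_devices():
        print("  Device:", device.name,
              "| global mem:", device.global_mem_size // (1024**3), "GiB")
```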

However, this does not apply to Windows: ROCm there is still very unstable, and the MI50/60 are not officially supported. The second option is DirectML, but every solution built on it seems to be a house of cards where the slightest thing makes the system stop working.
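
(If you want to experiment with DirectML on Windows anyway, a minimal sketch using the torch-directml package might look like this; whether it behaves on an MI50/MI60 is exactly the open question above:)

```python
# Minimal sketch: run a tensor op through DirectML on Windows.
# Assumes `pip install torch-directml`; untested on MI50/MI60.
import torch
import torch_directml

dml = torch_directml.device()        # first DirectML-capable adapter
x = torch.randn(1024, 1024, device=dml)
y = x @ x                            # simple matmul on the GPU
print("DirectML matmul OK:", y.shape)
```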

An important point is the BIOS used on these boards. The ones I have come with two BIOSes installed; one of them is a modified mining version that causes abnormal heating. After flipping the BIOS switch, everything returns to normal.

[–] fallingdowndizzyvr@alien.top 1 points 10 months ago

> The ones I have come with two BIOSes installed; one of them is a modified mining version that causes abnormal heating. After flipping the BIOS switch, everything returns to normal.

That's what I love about these "Vega" generation AMD cards. They have dual BIOSes, so it's practically impossible to brick one by flashing the BIOS. Just make sure you keep one working BIOS at all times and switch back to it if a flash goes wrong.

I don't know about the MI50/60, but the MI25 can be flashed into a 16GB Vega 64 or a WX9100. The WX9100 is the way to go, since that BIOS enables the caged mini DisplayPort, so the card can be used as a real GPU with display output. The WX9100 supports 6 video outputs and the caged mini DP sits in the 6th slot; the WX9100 BIOS is the one that enables all six outputs, while the Vega 64 BIOS does not.

[–] Amgadoz@alien.top 1 points 10 months ago

u/ehartford

[–] tu9jn@alien.top 1 points 10 months ago

I run 3x MI25s. A 70B q4_k_m model starts at 7 t/s and slows to ~3 t/s at full context; a 7B f16 model runs at about 18 t/s. As far as I know, the MI series only has Linux drivers.
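
(For reference, a minimal sketch of what a three-card split like this looks like with llama-cpp-python; the model path and split ratios below are illustrative placeholders, and a ROCm/HIP build of llama.cpp is assumed underneath:)

```python
# Minimal sketch: split a quantized 70B model across 3 GPUs with
# llama-cpp-python. Model path and ratios are hypothetical placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-70b.Q4_K_M.gguf",  # hypothetical local file
    n_gpu_layers=-1,                 # offload all layers to the GPUs
    tensor_split=[1.0, 1.0, 1.0],    # spread weights evenly across 3 cards
    n_ctx=4096,
)
out = llm("Q: What is an MI25? A:", max_tokens=64)
print(out["choices"][0]["text"])
```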