this post was submitted on 30 Nov 2023

Running multiple GPUs takes a lot of PCIe lanes. Consumer platforms have too few to run even two GPUs at full width (2 x16).

Threadrippers are prohibitively expensive for many.

AMD announced EPYC 8004 (Siena) in September. These low-power server CPUs start at 8 cores for ~$400 and offer 96 PCIe lanes. The catch is that the clock speeds are pretty low.

So, the question is: How bottlenecked are LLMs by CPU clock?

I.e., would it make much of a difference if you ran 4x 3090s on a $2000+ Threadripper vs. a $400 EPYC 8004?
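To put rough numbers on the lane question, here is a minimal sketch that splits a CPU's lanes evenly across GPUs and rounds down to a standard PCIe link width. The 96-lane figure is the one quoted above; the 24-lane consumer figure is only an illustrative assumption, and the helper `lanes_per_gpu` is hypothetical. It ignores chipset lanes, NVMe drives, and whatever a given motherboard actually wires up.

```python
def lanes_per_gpu(total_lanes: int, num_gpus: int, max_link_width: int = 16) -> int:
    """Widest power-of-two PCIe link each GPU gets if lanes are split evenly.

    Simplification: assumes every CPU lane is available to the GPUs.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    share = total_lanes // num_gpus
    if share < 1:
        return 0
    width = 1
    # PCIe links come in power-of-two widths: x1, x2, x4, x8, x16.
    while width * 2 <= min(share, max_link_width):
        width *= 2
    return width


if __name__ == "__main__":
    # 96 lanes is the EPYC 8004 figure from the post; 24 is an assumed
    # consumer-platform figure used only for comparison.
    for name, lanes in [("EPYC 8004 (96 lanes)", 96), ("consumer (assumed 24 lanes)", 24)]:
        for gpus in (2, 4):
            print(f"{name}, {gpus} GPUs: x{lanes_per_gpu(lanes, gpus)} each")
```

Under those assumptions, 96 lanes leave room for four x16 links, while a 24-lane platform drops to x8 with two cards and x4 with four.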

ThisGonBHard@alien.top 1 point 11 months ago

Pretty much not at all. The main bottleneck is memory bandwidth.

I barely see a difference between 4 and 12 cores on a 5900X when running on CPU.

When running multi-GPU, the PCIe lanes are the biggest bottleneck.

On a single GPU, the CPU does not matter.
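The memory-bound claim can be sketched with a back-of-the-envelope estimate. It assumes a dense model at batch size 1, where every generated token has to read all the weights from memory once, so bandwidth divided by model size gives a ceiling on tokens per second. The helper name and the example numbers are hypothetical, not measurements.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on generation speed when memory bandwidth is the limit.

    Assumes each token requires one full pass over the weights and that
    nothing else (compute, PCIe transfers, cache effects) gets in the way.
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Illustrative numbers only: a 20 GB set of weights on 50 GB/s of memory.
    print(f"{max_tokens_per_second(50.0, 20.0):.1f} tokens/s at most")
```

Core count does not appear anywhere in that estimate, which fits with seeing little difference between 4 and 12 cores once the memory bus is saturated.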

_Erilaz@alien.top 1 point 11 months ago

The 8004 has six DDR5 channels, AFAIK. That takes care of the memory bandwidth. The only issue would be the SP6 motherboard.
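For a sense of scale, theoretical peak bandwidth is roughly channels x transfer rate x 8 bytes per transfer (a 64-bit channel). The sketch below assumes DDR5-4800 on the six-channel part and compares it with dual-channel DDR4-3200 as a typical desktop setup. Both memory speeds are assumptions, not something stated in the thread, and real-world throughput lands below the theoretical peak.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_bytes=8 corresponds to a 64-bit data path per channel.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


if __name__ == "__main__":
    # Assumed configurations, for comparison only.
    six_channel_ddr5 = peak_bandwidth_gb_s(6, 4800)  # six channels, DDR5-4800
    dual_channel_ddr4 = peak_bandwidth_gb_s(2, 3200)  # two channels, DDR4-3200
    print(f"6x DDR5-4800: {six_channel_ddr5:.1f} GB/s")
    print(f"2x DDR4-3200: {dual_channel_ddr4:.1f} GB/s")
```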