this post was submitted on 27 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

So I'm looking into Threadripper Pro systems, which offer pretty good memory bandwidth since they're 8-channel, and can take a huge amount of RAM. (I can put a 3090 or two in there too.)

I'm wondering how much the core count will affect performance. For example, the 5955WX has 16 cores while the 5995WX has 64, but they both use the same memory. There's little point spending extra if the limiting factor is somewhere else.
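For context, a rough back-of-envelope sketch of the bandwidth ceiling, assuming DDR4-3200 on all eight channels and that token generation is purely memory-bandwidth-bound (real-world numbers will come in lower):

```python
# Back-of-envelope: bandwidth ceiling for token generation on 8-channel DDR4-3200.
channels = 8
transfers_per_s = 3200e6      # DDR4-3200: 3200 MT/s per channel
bytes_per_transfer = 8        # 64-bit channel width

peak_bw_gb_s = channels * transfers_per_s * bytes_per_transfer / 1e9  # ~204.8 GB/s theoretical

# A ~70B model at 8-bit is roughly 70 GB of weights, all read once per generated token.
model_gb = 70

print(f"Peak bandwidth: {peak_bw_gb_s:.1f} GB/s")
print(f"Generation ceiling: ~{peak_bw_gb_s / model_gb:.1f} tokens/s")  # ~2.9 tok/s upper bound
```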

top 2 comments
jeffwadsworth@alien.top 1 points 9 months ago

I use a 12-core Ryzen and can run llama.cpp with the 70B 8-bit model fine. Don't bother with hyper-threads, though.
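A minimal sketch of that advice with llama-cpp-python (the model path is a placeholder, and the core count assumes SMT is enabled):

```python
# Sketch: give llama.cpp the physical core count, not the SMT thread count.
import os
from llama_cpp import Llama

physical_cores = os.cpu_count() // 2   # assumes SMT/hyper-threading is on; adjust if not

llm = Llama(
    model_path="llama-2-70b.Q8_0.gguf",  # placeholder path
    n_threads=physical_cores,
)

out = llm("Q: How many memory channels does a Threadripper Pro have? A:", max_tokens=32)
print(out["choices"][0]["text"])
```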

tu9jn@alien.top 1 points 9 months ago

I have 64 cores with 8-channel RAM; if I use more than 24-32 cores, the speed slows down somewhat.

That's for token generation; prompt processing benefits from all the threads.

But it's much better to spend your money on GPUs than CPU cores. I have 3x Radeon MI25 cards in an i9-9900K box, and that's more than twice as fast as the 64-core Epyc build.
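Both points map onto llama.cpp's thread and offload knobs. A sketch with llama-cpp-python, assuming a recent build that exposes n_threads_batch; the model path, thread counts, and layer count are illustrative, not tuned values:

```python
# Sketch: split threads between generation and prompt processing, and offload layers to GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="llama-2-70b.Q8_0.gguf",  # placeholder path
    n_threads=24,         # token generation: scaling stalls past ~24-32 cores on 8-channel RAM
    n_threads_batch=64,   # prompt processing: compute-bound, so let it use every core
    n_gpu_layers=40,      # offload as many layers as VRAM allows; GPU layers dominate throughput
)
```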