I have 64 cores with 8-channel RAM; if I use more than roughly 24-32 cores, token generation actually slows down somewhat. That only applies to token generation, though: prompt processing benefits from all the threads.
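Here's a minimal sketch of how you can set the two thread pools separately, assuming you're running llama.cpp through the llama-cpp-python bindings; the model path and the exact thread counts are illustrative, not anything from my setup:

```python
# Sketch only: separate thread counts for generation vs. prompt processing
# in llama-cpp-python. Model path and numbers are hypothetical examples.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model.Q4_K_M.gguf",  # hypothetical path
    n_threads=24,        # token generation: cap near the point where memory
                         # bandwidth saturates; more cores can make it slower
    n_threads_batch=64,  # prompt processing: compute-bound, so use all cores
)

out = llm("Q: Why cap generation threads? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

The split exists because generation is memory-bandwidth-bound (one token at a time), while prompt processing is batched and compute-bound, so it keeps scaling with extra threads.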
But it is much better to spend your money on GPUs than on CPU cores: I have 3x Radeon MI25 cards in an i9-9900K box, and that setup is more than twice as fast as the 64-core EPYC build.