this post was submitted on 08 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
The real issue is that consumer CPUs / motherboards have very few memory channels. DDR5 itself is plenty fast, but with two sticks on a dual-channel platform you are probably already maxing out the motherboard's memory bandwidth.
It would not surprise me at all if server CPU inference is somewhere between 3x and 5x faster.
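To put rough numbers on that, here is a back-of-envelope sketch. CPU inference on a dense model is memory-bandwidth bound, so tokens/sec is roughly peak bandwidth divided by the bytes read per token. The channel counts, DDR5 speeds, and model size below are illustrative assumptions, not measurements:

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_width_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (channels * transfer rate * 8-byte bus)."""
    return channels * mt_per_s * bus_width_bytes / 1000


def tokens_per_second(bandwidth_gbs: float, model_size_gb: float) -> float:
    """Upper bound on tokens/sec, assuming every token reads all weights once."""
    return bandwidth_gbs / model_size_gb


model_gb = 35  # assumption: ~70B parameters at 4-bit quantization

consumer = peak_bandwidth_gbs(channels=2, mt_per_s=5600)   # dual-channel DDR5-5600
server   = peak_bandwidth_gbs(channels=12, mt_per_s=4800)  # 12-channel DDR5-4800 server platform

print(f"consumer: {consumer:.0f} GB/s -> ~{tokens_per_second(consumer, model_gb):.1f} tok/s")
print(f"server:   {server:.0f} GB/s -> ~{tokens_per_second(server, model_gb):.1f} tok/s")
```

With those assumed numbers the server platform ends up around 5x the bandwidth of the consumer one (~460 GB/s vs ~90 GB/s), which lines up with the 3x-5x guess above.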