this post was submitted on 08 Nov 2023
LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
There's really no comparison. The 4060s, even the Ti, have crap for memory bandwidth: 288 GB/s in the case of the Ti, versus 936 GB/s on a 3090. DDR5 is also not fast enough to make much difference, so that combo is not going to be speedy. It in no way compares to a 3090.
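To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes token generation is purely memory-bandwidth bound, i.e. every generated token has to read all the weights once, so it gives an upper bound rather than a benchmark. The 13 GB model size is a hypothetical picked only for illustration, and the 936 GB/s figure for the 3090 is its published spec.

```python
# Rough upper bound: each generated token reads all model weights once,
# so tokens/s <= memory bandwidth / model size.

def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound ceiling on generation speed (ignores compute and overhead)."""
    return bandwidth_gb_s / model_size_gb

MODEL_SIZE_GB = 13.0  # hypothetical: weights that take about 13 GB

GPUS = [
    ("RTX 4060 Ti", 288.0),  # GB/s
    ("RTX 3090", 936.0),     # GB/s, published spec
]

for name, bandwidth in GPUS:
    ceiling = max_tokens_per_second(bandwidth, MODEL_SIZE_GB)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s")
```

Real-world numbers will come in lower, but the ratio between the two cards should hold up reasonably well as long as the whole model fits in VRAM.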
The real issue is that consumer CPUs and motherboards have very few memory channels, usually just two. DDR5 is plenty fast, but you are probably maxing out the platform's memory bandwidth with two sticks.
Would not surprise me at all if server CPU inference is somewhere between 3x and 5x faster.
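A quick sanity check on that guess, as a sketch: theoretical peak DRAM bandwidth is channels times transfer rate times 8 bytes per 64-bit channel. The specific configurations below (dual-channel DDR5-5600 on a desktop, 8-channel DDR5-4800 on a server) are my assumptions for illustration, not anything measured, and sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (64-bit channel = 8 bytes per transfer)."""
    return channels * mega_transfers_per_s * bus_bytes / 1000.0

# Assumed example platforms, not measurements.
desktop = peak_bandwidth_gb_s(channels=2, mega_transfers_per_s=5600)
server = peak_bandwidth_gb_s(channels=8, mega_transfers_per_s=4800)

print(f"desktop: {desktop:.1f} GB/s")
print(f"server:  {server:.1f} GB/s")
print(f"ratio:   {server / desktop:.2f}x")
```

With those assumed configs the server lands a bit above 3x the desktop, and a 12-channel platform would push it higher, so 3x to 5x looks plausible if inference is bandwidth bound.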