this post was submitted on 08 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

I am going to build an LLM server very soon, targeting 34B models (specifically phind-codellama-34b-v2, in Q4 GGUF/GPTQ/AWQ form).

I am stuck between these two setups:

  1. 12400 + DDR5-6000 CL30 + 4060 Ti 16GB (GGUF; split the workload between CPU and GPU)
  2. 3090 (GPTQ/AWQ model fully loaded on the GPU)

Not sure if the speed bump of the 3090 is worth the hefty price increase. Does anyone have benchmarks/data comparing these two setups?
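
If you want to gather your own numbers, here is a minimal benchmark sketch using llama-cpp-python. The file name, layer counts, and prompt are placeholder assumptions; `n_gpu_layers` controls how many layers are offloaded to the GPU (setup 1), and `-1` offloads everything, which roughly stands in for the fully-GPU setup 2 (a real 3090 run would more likely use an ExLlama/GPTQ backend):

```python
# Sketch comparing partial vs. full GPU offload with llama-cpp-python.
# Assumptions: MODEL_PATH, the layer count, and the prompt are placeholders.
import time

from llama_cpp import Llama

MODEL_PATH = "phind-codellama-34b-v2.Q4_K_M.gguf"  # hypothetical local path

def bench(n_gpu_layers: int, prompt: str = "Write a binary search in Python.") -> float:
    """Load the model with the given GPU offload and return tokens/second."""
    llm = Llama(model_path=MODEL_PATH, n_gpu_layers=n_gpu_layers,
                n_ctx=2048, verbose=False)
    start = time.time()
    out = llm(prompt, max_tokens=256)
    return out["usage"]["completion_tokens"] / (time.time() - start)

# Setup 1: partial offload, e.g. however many layers fit in 16GB on a 4060 Ti
print(f"partial offload: {bench(n_gpu_layers=30):.1f} tok/s")
# Setup 2 analogue: full offload (-1 = all layers), as on a 24GB 3090
print(f"full offload:    {bench(n_gpu_layers=-1):.1f} tok/s")
```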

BTW: Alder Lake CPUs run DDR5 in gear 2 (while AM5 runs DDR5 in gear 1, i.e. with the memory controller at the full memory clock). AFAIK gear 1 offers lower latency. Would this give AM5 a big advantage when it comes to LLMs?
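
For context on how much the gear mode can matter: gear 1 vs. gear 2 changes the memory-controller clock (and hence latency), not the peak transfer rate, and token generation is mostly bandwidth-bound. A quick back-of-envelope for dual-channel DDR5-6000 peak bandwidth:

```python
# Peak theoretical bandwidth of dual-channel DDR5-6000.
# Gear mode affects memory-controller clock (latency), not this number.
transfers_per_sec = 6000e6  # DDR5-6000 = 6000 MT/s
bytes_per_transfer = 8      # one 64-bit channel moves 8 bytes per transfer
channels = 2                # dual-channel

peak_gb_s = transfers_per_sec * bytes_per_transfer * channels / 1e9
print(f"peak: ~{peak_gb_s:.0f} GB/s")  # ~96 GB/s
```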

FutureIsMine@alien.top · 1 point · 1 year ago

You gotta look at the internal memory clocks and data-transfer rates on the GPUs. What you'll see is that only the xx80 and xx90 cards have enough memory bandwidth to actually stream all that VRAM, so the 4060 Ti, for all its 16GB, can't move that much data around.
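
To put rough numbers on that bandwidth argument: for single-stream generation, each token has to read roughly the whole model from memory, so tokens/sec is capped near bandwidth divided by model size. A sketch using published peak-bandwidth specs and an assumed ~20 GB for a Q4-quantized 34B model:

```python
# Rough tokens/sec ceiling for single-stream generation:
# every token reads ~the whole model, so ceiling ≈ bandwidth / model size.
MODEL_GB = 20.0  # assumption: a Q4-quantized 34B model is roughly 20 GB

bandwidth_gb_s = {  # published peak memory-bandwidth specs
    "dual-channel DDR5-6000 (CPU)": 96.0,
    "RTX 4060 Ti 16GB (128-bit GDDR6)": 288.0,
    "RTX 3090 (384-bit GDDR6X)": 936.0,
}

for device, bw in bandwidth_gb_s.items():
    print(f"{device}: ~{bw / MODEL_GB:.0f} tok/s ceiling")
```

Real-world throughput lands below these ceilings, but the ratios track what people report: the 3090's ~3x bandwidth edge over the 4060 Ti is the whole ballgame here.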