LocalLLaMA

I'm trying to decide whether it's worth running a second GPU in my second full-length slot. My motherboard manual says the fastest that slot can run is PCIe 2.0 at x4 lanes. That's a paltry 2 GB/s, correct?

Can anyone comment from personal experience?
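For reference, a quick back-of-the-envelope check of that 2 GB/s figure. The per-lane numbers below are the usual approximate per-direction throughputs after encoding overhead, not exact values:

```python
# Approximate per-lane, per-direction PCIe throughput in GB/s
# (after 8b/10b or 128b/130b encoding overhead).
PCIE_GBPS_PER_LANE = {
    1.0: 0.25,
    2.0: 0.50,
    3.0: 0.985,
    4.0: 1.97,
}

def pcie_bandwidth(gen: float, lanes: int) -> float:
    """Approximate one-direction link bandwidth in GB/s."""
    return PCIE_GBPS_PER_LANE[gen] * lanes

# PCIe 2.0 at x4, as reported by the motherboard manual:
print(pcie_bandwidth(2.0, 4))  # -> 2.0, so yes, roughly 2 GB/s each way
```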

[–] a_beautiful_rhind@alien.top 1 points 10 months ago

For exllama, not much; for other backends, a bit. With llama.cpp I lose about 10% when the bandwidth is halved.
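If you do add the second card, a minimal sketch of weighting the layer split toward the faster slot, assuming the llama-cpp-python bindings (parameter names can vary by version, and the model path and split ratios here are purely illustrative):

```python
# Sketch only: bias the layer split so the card in the slow PCIe 2.0 x4
# slot carries a smaller share of the model.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-13b.Q5_K_M.gguf",  # hypothetical model file
    n_gpu_layers=-1,            # offload all layers to the GPUs
    tensor_split=[0.65, 0.35],  # ~65% on GPU0 (full-speed slot), ~35% on GPU1 (x4 slot)
)

out = llm("Q: Does a PCIe 2.0 x4 slot bottleneck inference? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

The equivalent knob on the llama.cpp command line is --tensor-split; since weights mostly stay resident on each card, the slow link matters more for loading and prompt processing than for steady-state token generation.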