LocalLLaMA

I'm trying to decide whether it's worth running a second GPU in my second full-length slot. My motherboard manual says the fastest that slot can run is PCIe 2.0 at x4 lanes. That's a paltry 2 GB/s, correct?

Can anyone comment from personal experience?
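For reference, a quick back-of-the-envelope check of that 2 GB/s figure. The per-lane numbers below are the usual approximate per-direction throughputs after encoding overhead, not exact values:

```python
# Approximate per-lane, per-direction PCIe throughput in GB/s
# (after 8b/10b or 128b/130b encoding overhead).
PCIE_GBPS_PER_LANE = {
    1.0: 0.25,
    2.0: 0.50,
    3.0: 0.985,
    4.0: 1.97,
}

def pcie_bandwidth(gen: float, lanes: int) -> float:
    """Approximate one-direction link bandwidth in GB/s."""
    return PCIE_GBPS_PER_LANE[gen] * lanes

# PCIe 2.0 at x4, as reported by the motherboard manual:
print(pcie_bandwidth(2.0, 4))  # -> 2.0, so yes, roughly 2 GB/s each way
```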

[–] a_beautiful_rhind@alien.top 1 points 10 months ago

For exllama, not much; for other backends, a bit. With llama.cpp I lose about 10% when the bandwidth is halved.
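If you do add the second card, a minimal sketch of weighting the layer split toward the faster slot, assuming the llama-cpp-python bindings (parameter names can vary by version, and the model path and split ratios here are purely illustrative):

```python
# Sketch only: bias the layer split so the card in the slow PCIe 2.0 x4
# slot carries a smaller share of the model.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-13b.Q5_K_M.gguf",  # hypothetical model file
    n_gpu_layers=-1,            # offload all layers to the GPUs
    tensor_split=[0.65, 0.35],  # ~65% on GPU0 (full-speed slot), ~35% on GPU1 (x4 slot)
)

out = llm("Q: Does a PCIe 2.0 x4 slot bottleneck inference? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

The equivalent knob on the llama.cpp command line is --tensor-split; since weights mostly stay resident on each card, the slow link matters more for loading and prompt processing than for steady-state token generation.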