this post was submitted on 22 Nov 2023

LocalLLaMA


Community for discussing Llama, the family of large language models created by Meta AI.

I'm trying to assess whether it's worth running a second GPU in my second full-length slot. My mobo manual says the fastest that slot can run is PCIe 2.0 at x4 lanes. A paltry 2 GB/s, correct?

Can anyone comment from personal experience?
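For what it's worth, the 2 GB/s figure checks out: PCIe 2.0 signals at 5 GT/s per lane with 8b/10b encoding, which works out to 500 MB/s per lane, so x4 gives 2 GB/s each direction. A quick sketch of the math (the per-lane rates are standard PCIe spec numbers, not something from this thread):

```python
# Theoretical one-direction PCIe bandwidth per generation and lane count.
# Per-lane rates account for encoding overhead:
# Gen 1/2 use 8b/10b encoding, Gen 3+ use 128b/130b.

PER_LANE_GB_S = {
    1: 0.25,   # 2.5 GT/s * 8/10
    2: 0.5,    # 5.0 GT/s * 8/10
    3: 0.985,  # 8.0 GT/s * 128/130
    4: 1.969,  # 16  GT/s * 128/130
}

def pcie_bandwidth(gen: int, lanes: int) -> float:
    """Theoretical unidirectional bandwidth in GB/s."""
    return PER_LANE_GB_S[gen] * lanes

print(pcie_bandwidth(2, 4))   # 2.0 GB/s -- the slot in question
print(pcie_bandwidth(3, 16))  # ~15.8 GB/s -- a typical primary slot
```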

a_beautiful_rhind@alien.top · 11 months ago

For exllama, not much; for other backends, a bit more. On llama.cpp I lose about 10% when the bandwidth is halved.
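Before benchmarking, it's worth confirming what link each card actually negotiated, since slots sometimes train below their rated width. A minimal sketch on NVIDIA hardware, assuming nvidia-smi is installed and on PATH (the query fields are standard nvidia-smi ones):

```python
import subprocess

# Ask nvidia-smi for each GPU's current PCIe generation and lane width.
result = subprocess.run(
    [
        "nvidia-smi",
        "--query-gpu=name,pcie.link.gen.current,pcie.link.width.current",
        "--format=csv,noheader",
    ],
    capture_output=True,
    text=True,
    check=True,
)
for line in result.stdout.strip().splitlines():
    print(line)  # e.g. "NVIDIA GeForce RTX 3090, 2, 4" for a Gen2 x4 link
```

From there, llama.cpp's --tensor-split option lets you control how layers are divided between the two cards, so you can weight more of the model toward the GPU on the faster slot and measure the difference yourself.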