this post was submitted on 22 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


I'm trying to decide whether to run a second GPU in my second full-length slot. My mobo manual says the fastest that slot can run is PCIe 2.0 at x4 lanes. A paltry 2 GB/s, correct?
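
(For reference, the back-of-the-envelope math behind that 2 GB/s figure; a minimal Python sketch using the per-lane transfer rates and encoding overheads from the PCIe specs. Real-world throughput runs somewhat lower because of protocol overhead.)

```python
# Theoretical one-direction PCIe bandwidth per generation and lane count.
# Gens 1-2 use 8b/10b encoding; gens 3+ use 128b/130b.
GT_PER_LANE = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0}  # gigatransfers/s
ENCODING = {1: 8 / 10, 2: 8 / 10, 3: 128 / 130, 4: 128 / 130, 5: 128 / 130}

def pcie_gbps(gen: int, lanes: int) -> float:
    """Usable one-direction bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return GT_PER_LANE[gen] * ENCODING[gen] * lanes / 8  # Gbit/s -> GB/s

print(pcie_gbps(2, 4))  # PCIe 2.0 x4 -> 2.0 GB/s, the slot in question
print(pcie_gbps(4, 8))  # PCIe 4.0 x8 -> ~15.75 GB/s, for comparison
```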

Can anyone comment from personal experience?

platinums99@alien.top, 10 months ago

Depends on the board; check the manual, rtfm :D

And if turboderp is right, it largely doesn't matter.

a_beautiful_rhind@alien.top, 10 months ago

For exllama it doesn't matter much; for other backends a bit more. On llama.cpp I lose about 10% when halving the bandwidth.
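
(A quick way to see what link each card has actually negotiated; a minimal sketch assuming the pynvml bindings for NVIDIA's NVML library are installed:)

```python
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    # The current link can downshift at idle; max is what the slot supports.
    cur_gen = pynvml.nvmlDeviceGetCurrPcieLinkGeneration(handle)
    cur_width = pynvml.nvmlDeviceGetCurrPcieLinkWidth(handle)
    max_gen = pynvml.nvmlDeviceGetMaxPcieLinkGeneration(handle)
    max_width = pynvml.nvmlDeviceGetMaxPcieLinkWidth(handle)
    print(f"GPU {i}: PCIe gen {cur_gen} x{cur_width} "
          f"(max gen {max_gen} x{max_width})")
pynvml.nvmlShutdown()
```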

NoWarrenty@alien.top, 10 months ago

It is very important if you care about performance. During inference, a lot of data has to go from one card to another. I was using x1 risers and it sucked. If you have two similar NVIDIA cards, you can get around it with an NVLink bridge.

Otherwise you should aim for at least PCIe 4.0 x8 when looking for a motherboard. I sniped an EPYC system from eBay for 1000€ that has six PCIe 4.0 x16 slots, and it handles all four of my 3090s.

https://preview.redd.it/u0bvy2kkzw1c1.jpeg?width=4032&format=pjpg&auto=webp&s=ecb164bbf59504e590c19403554e24df8f9236c8
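
(To put numbers on that card-to-card traffic, a rough timing sketch, assuming PyTorch and at least two CUDA GPUs; the 256 MiB buffer size is arbitrary:)

```python
import torch

assert torch.cuda.device_count() >= 2, "needs at least two GPUs"

size_mib = 256
x = torch.empty(size_mib * 1024 * 1024, dtype=torch.uint8, device="cuda:0")

_ = x.to("cuda:1")  # warm-up so lazy CUDA init doesn't skew the timing
torch.cuda.synchronize()

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
start.record()
y = x.to("cuda:1")  # the device-to-device copy we want to time
end.record()
torch.cuda.synchronize()

gb = x.numel() / 1e9            # bytes -> GB
ms = start.elapsed_time(end)    # milliseconds
print(f"{gb * 1000 / ms:.2f} GB/s over the link")
```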

Massive_Robot_Cactus@alien.top, 10 months ago

Was the CPU from eBay too? Any reliability issues? It seems a lot of the cheap ones on eBay are gray market / production candidates.