LocalLLaMA
Community to discuss Llama, the family of large language models created by Meta AI.
Depends on the board, check the manual, rtfm :D
And if turboderp is right, it largely doesn't matter.
For exllama, not much; for others, a bit. On llama.cpp I lose 10% by halving the bandwidth.
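If anyone wants to reproduce that number on their own box: force the slot to a lower PCIe gen in BIOS and run the same benchmark twice. Rough sketch, assuming llama-bench from llama.cpp is built locally and `model.gguf` is a placeholder for your model path:

```python
# Rough sketch: run llama.cpp's llama-bench before and after changing the
# PCIe link (e.g. forcing Gen3 in BIOS) and compare the tokens/s columns.
# Assumes ./llama-bench is built and model.gguf is a placeholder path.
import subprocess

result = subprocess.run(
    ["./llama-bench", "-m", "model.gguf", "-ngl", "99"],  # -ngl 99: offload all layers
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # compare the tg (token generation) t/s between runs
```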
It is very important if you care about performance. During inference, a lot of data has to move from one card to another. I was using x1 risers and it sucked. If you have two similar Nvidia cards, you can get around it by using an NVLink bridge.
Otherwise you should aim for at least PCIe 4.0 x8 when looking for a motherboard. I sniped an Epyc system from eBay for 1000€ that has 6 PCIe 4.0 x16 slots, and it handles everything with 4 3090s.
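For anyone wiring something like this up: a quick way to sanity-check what you actually got (NVLink detected, and whether a riser silently dropped a card to x1). Rough sketch, assumes nvidia-smi is on your PATH:

```python
# Rough sketch: dump the interconnect topology (NVLink vs. plain PCIe between
# card pairs) and the negotiated vs. maximum PCIe link per GPU, so an x1
# riser or a downgraded slot shows up immediately.
# Assumes nvidia-smi is installed and on PATH; the query fields are the
# standard --query-gpu ones. Note the "current" gen can read lower while a
# card is idle, because the driver downclocks the link.
import subprocess

def sh(args):
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout

# Topology matrix: NV# entries mean that pair of cards talks over NVLink.
print(sh(["nvidia-smi", "topo", "-m"]))

fields = ("index,name,pcie.link.gen.current,pcie.link.gen.max,"
          "pcie.link.width.current,pcie.link.width.max")
print(sh(["nvidia-smi", f"--query-gpu={fields}", "--format=csv,noheader"]))
# e.g. "0, NVIDIA GeForce RTX 3090, 4, 4, 16, 16"
```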
Was the CPU from eBay too? Any reliability issues? It seems a lot of the cheap ones on eBay are gray market / production candidates.