0xd00d

joined 11 months ago
0xd00d@alien.top 1 points 9 months ago

I would imagine that the new option you're talking about would make a good budget inference workhorse when paired with multiple cards such as 3090s. 96 lanes of PCIe Gen 5 will be a real enabler. That said, I think Zen 2 EPYCs, which provide Gen 4 lanes, are cheaper still, so there are good options available.
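For a rough sense of the lane math, here is a minimal Python sketch. It assumes every card gets a full x16 link and that all 96 lanes are free for GPUs, which real boards won't quite manage once NVMe and networking take their share. The function names are made up for illustration. The per-lane rates are the standard PCIe figures (16 GT/s for Gen 4, 32 GT/s for Gen 5, both with 128b/130b encoding), and the 3090 is a Gen 4 x16 card, so it links at Gen 4 speed even in a Gen 5 slot.

```python
# Raw PCIe transfer rate per lane in GT/s, by generation.
GT_PER_S = {4: 16.0, 5: 32.0}


def lane_gb_per_s(gen):
    """Approximate usable GB/s per lane, one direction (128b/130b encoding)."""
    return GT_PER_S[gen] * (128 / 130) / 8


def cards_at_full_width(total_lanes, lanes_per_card=16):
    """How many cards fit if each one gets a full-width link (assumption)."""
    return total_lanes // lanes_per_card


def card_link_gb_per_s(slot_gen, card_gen=4, lanes=16):
    """A link runs at the lower generation of slot and card."""
    return lane_gb_per_s(min(slot_gen, card_gen)) * lanes


if __name__ == "__main__":
    print("x16 cards on 96 lanes:", cards_at_full_width(96))
    print("3090 in a Gen 5 slot: %.1f GB/s" % card_link_gb_per_s(5))
    print("3090 in a Gen 4 slot: %.1f GB/s" % card_link_gb_per_s(4))
```

Under those assumptions the per-card bandwidth comes out the same on either platform, which is why the cheaper Gen 4 EPYCs still look fine for 3090s specifically.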

0xd00d@alien.top 1 points 10 months ago

Be sure to prioritize PCIe lanes for the 3090s.

0xd00d@alien.top 1 points 10 months ago

I suppose the really big factor in scalability isn't necessarily CUDA but TensorRT, which, yes, is built on top of CUDA... I haven't been keeping up with the actual hardware capabilities of AMD's stuff wrt tensor cores, but basically what we're seeing is that TensorRT is able to better utilize NVIDIA's tensor cores and extract much more out of the available memory bandwidth... If AMD can produce significantly beefier hardware that sells for less, and the software can actually come close (this is the crux of it, and it seems like we can only hope for them to get close), then we may have some real competition.