[–] CertainlyBright@alien.top 1 points 10 months ago

Could you elaborate on "disabling 16 bit floats" a little more?

I started out running a quantized 70B model on 6x P40 GPUs, but the performance is noticeably slow. Sure, maybe I'm not going to buy a few A100s to replace them. But what about an RTX 8000 or two?

These will be going into a 2U Gigabyte GPU server, so consumer cards don't fit.