CertainlyBright

joined 2 years ago

Buying a p40 for 70b-120b in c/localllama@poweruser.forum

CertainlyBright@alien.top 1 points 2 years ago

Could you elaborate on "disabling 16 bit floats" a little bit more?


Picking out the next GPUs to buy after using P40s for running Llama (alien.top)

submitted 2 years ago by CertainlyBright@alien.top to c/localllama@poweruser.forum

I started out running a quantized 70B model on 6x P40 GPUs, but the performance is noticeably slow. Sure, I'm probably not going to buy a few A100s to replace them, but what about an RTX 8000 or two?

These will be going into a 2U Gigabyte GPU server, so consumer cards don't fit.