0.000575
that is nearly $2.07 per hour. On https://runpod.io you could get an A40 for $0.79/hr. For a 34B model, 24 GB of VRAM is more than enough, so you could get an A5000 for around $0.44/hr.
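The arithmetic above can be sanity-checked with a quick sketch. The per-second rate and the GPU prices come from the thread; the 34B sizing assumes 4-bit quantized weights with ~20% overhead for KV cache and activations, which is a common rule of thumb and my assumption, not something stated above:

```python
# Convert the quoted per-second rate to an hourly price.
PER_SECOND = 0.000575              # $/s, quoted above
hourly = PER_SECOND * 3600
print(f"${hourly:.2f}/hr")         # -> $2.07/hr, i.e. "nearly $2.1 per hour"

# Rough VRAM estimate for a 34B-parameter model.
# Assumption: 4-bit weights (0.5 bytes/param) plus ~20% overhead.
params = 34e9
bytes_per_param = 0.5
vram_gb = params * bytes_per_param * 1.2 / 1e9
print(f"~{vram_gb:.0f} GB")        # -> ~20 GB, which fits in a 24 GB A5000
```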
1: What is your budget?
2: Do you have access to the data center and are you able to put in a GPU?
3: Does the server have SXM sockets or just PCIe?
The MI25 is also an option, but most programs are much better optimized for CUDA devices.
P100s are also an okay-ish choice for super-budget builds (SXM is only $50 but PCIe is ~$150), but they don't output video. They have higher memory bandwidth since they use HBM instead of GDDR, and they're a lot faster than the P40 at 19.05 TFLOPS FP16.
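The price/performance gap between the two P100 form factors can be made concrete with the prices and the 19.05 TFLOPS FP16 figure from above:

```python
# Dollars per FP16 TFLOP for the two P100 form factors mentioned above.
FP16_TFLOPS = 19.05                 # P100 FP16 throughput, from the thread
prices = {"P100 SXM": 50, "P100 PCIe": 150}

for name, usd in prices.items():
    print(f"{name}: ${usd / FP16_TFLOPS:.2f} per FP16 TFLOP")
# -> P100 SXM:  $2.62 per FP16 TFLOP
# -> P100 PCIe: $7.87 per FP16 TFLOP
```

The catch, of course, is that the cheap SXM card needs a server with SXM sockets (question 3 above), while the PCIe card works in ordinary machines.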
You could get a GPU like a 16 GB P100 for simple AI work, or a V100/A4 for slightly more heavy-duty work.
P100s only cost around $170, so it's cheap to upgrade the GPU.