Also, are there better alternatives to the P40 that don't cost a crazy amount?
A 3090 is ~$650, will be a lot faster, and will pair better with a 4090.
I keep seeing these low numbers. Where are people finding 3090s that cheap? I was lucky to get a pair at about $900 each, but most places I see them listed, even refurbished ones are $800-1K.
If you get lucky, you can get a 3090 for about $650 on ebay. But you have to be patient.
> even refurbished ones are $800-1K.
You can get open-box ones with a 2-year warranty for less than that. Zotac had open-box 3090s for like $729 (if I remember right) earlier this week. If you get really lucky, they rarely have the watercooling-ready ones new for $799, but you have to supply your own block.
Yea, I got a watercooled Zotac Trinity from eBay for around 800 bucks, which is good for the EU.
Yea, I got mine through an eBay auction using a sniper tool and patience. Make sure to vet the seller and their other auctions.
The P40 is about as cost-effective as it gets, but it's a bit of work to get it running. Also keep in mind that air-cooling a P40 is either ineffective or extremely loud.
The big issue is that once you combine it with a P40, you're going to have to disable 16-bit floats and do all the work in 32-bit floats (not for storing weights, but for the calculations themselves). You can still get alright performance out of the P40s that way (I'm using four of them), but you'll cripple the 4090's performance doing it. I don't know of any inference library that will handle the conversion and run different kernels on different cards to avoid that, since that's a completely different set of code.
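The underlying reason is that the P40 is a Pascal card (compute capability 6.1) whose FP16 math throughput is a tiny fraction of its FP32 rate, while the 3090/4090 run FP16 at full speed. A minimal sketch of picking a per-card compute dtype, assuming PyTorch with CUDA; the `compute_dtype` helper is hypothetical, not from any library:

```python
import torch

# Hypothetical helper: choose a compute dtype per GPU. Pascal cards like the
# P40 (compute capability 6.1) have severely crippled FP16 math, so they
# should compute in FP32 even if weights are stored in FP16; Volta (7.0) and
# newer run FP16 at full rate.
def compute_dtype(device_index: int) -> torch.dtype:
    major, minor = torch.cuda.get_device_capability(device_index)
    return torch.float16 if (major, minor) >= (7, 0) else torch.float32

for i in range(torch.cuda.device_count()):
    print(f"cuda:{i} ({torch.cuda.get_device_name(i)}): "
          f"compute in {compute_dtype(i)}")
```

In a mixed 4090 + P40 box this would report float16 for the 4090 and float32 for the P40s, which is exactly the split no mainstream library does per-card automatically.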
You'd really do much, much better adding a used 3090 from eBay (assuming it works).
Could you elaborate on "disabling 16-bit floats" a little more?
I run 2x P40s with 70B chat at 8k ctx and get 7-8 T/s, and I'm very happy with that. Anything above 5 is awesome for me.
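For reference, a setup like that might look something like the sketch below using the llama-cpp-python bindings; the model file name is a placeholder, and the even 1:1 tensor split across the two cards is an assumption:

```python
from llama_cpp import Llama

# A minimal sketch, assuming llama-cpp-python built with CUDA support and a
# quantized 70B GGUF on disk (the file path is a placeholder).
llm = Llama(
    model_path="./llama-2-70b-chat.Q4_K_M.gguf",
    n_ctx=8192,           # 8k context
    n_gpu_layers=-1,      # offload every layer to the GPUs
    tensor_split=[1, 1],  # split the weights evenly across the two P40s
)

out = llm("Q: How fast is a pair of P40s on a 70B model?\nA:", max_tokens=64)
print(out["choices"][0]["text"])
```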