this post was submitted on 15 Nov 2023
1 points (100.0% liked)

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


So in the last few weeks I have been experimenting with LLMs on my personal laptop (as I'm rarely at home), but I'm going to have my PC with me in a few days. When running models (MythoMax 13b, mostly Q6_K and Q5_K_M GGUF) I can definitely feel my laptop not liking it: slowdowns, crashes, service terminations, and timeouts.

Now, the situation is this: I have unexpectedly gotten some money which I want to invest in PC parts.
My PC currently has 16GB of DDR5 RAM and a GTX 1070 with 8GB of VRAM.
The idea now is to buy a 96GB RAM kit (2x48GB) and Frankenstein the whole PC together with an additional Nvidia Quadro P2200 (5GB of VRAM).

Would the whole "machine" suffice to run models like MythoMax 13b, Deepseek Coder 33b and CodeLlama 34b (all GGUF)?

Specs after the upgrade: 112GB DDR5 RAM, 8GB + 5GB of VRAM, and a Ryzen 5 7500F CPU.
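
For a rough sense of scale (the bits-per-weight figures below are approximations, not exact quant sizes), here is a back-of-envelope estimate of those GGUF files measured against 13GB of combined VRAM and 112GB of RAM:

    # Rough GGUF size estimate; bits-per-weight values are approximate.
    models = {
        "MythoMax 13B Q6_K":         (13e9, 6.6),
        "MythoMax 13B Q5_K_M":       (13e9, 5.7),
        "Deepseek Coder 33B Q5_K_M": (33e9, 5.7),
        "CodeLlama 34B Q5_K_M":      (34e9, 5.7),
    }
    for name, (params, bits_per_weight) in models.items():
        size_gb = params * bits_per_weight / 8 / 1e9
        print(f"{name}: ~{size_gb:.0f} GB to load, plus context/KV cache")

If those estimates are in the right ballpark, the 13b quants fit easily in 112GB of RAM and mostly in the 13GB of combined VRAM, while the 33b/34b quants at roughly 23-24GB would spill heavily into system RAM.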

And the question I should have asked first: can the GTX 1070 and P2200 setup even work together? Would text-generation-webui even detect both cards?
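
To be concrete about what I mean by "detect", a quick check (hypothetical snippet, assuming a CUDA-enabled PyTorch build is installed) would be something like:

    import torch  # assuming a CUDA-enabled PyTorch build

    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GiB")

If both the 1070 and the P2200 show up there, the loaders should at least be able to see them.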

Sorry if that's a dumb question.

top 4 comments
[–] Arkonias@alien.top 1 points 11 months ago

Save the $$$ for a few months and buy a used 3090 or two. It'll be worth it in the long run, and it saves the headaches of trying to Frankenstein a bunch of 8GB cards together.

[–] ccbadd@alien.top 0 points 11 months ago (1 children)

I would replace the DDR5 RAM rather than add to it, or your memory will run a lot slower (populating all four DDR5 DIMM slots typically forces lower memory clocks), and you just don't need that much RAM if you're going to use GPUs for inferencing. Also, a P40 is probably money better spent with this config than the P2200.

[–] Wortkraecker@alien.top 0 points 11 months ago (1 children)

Thing is, I have the P2200 sitting on my shelf right now from my dad's old workstation, so I wouldn't have to buy it.

[–] a_beautiful_rhind@alien.top 1 points 11 months ago

13GB does not make for much, especially when part of it is used for graphics and it's all old Pascal architecture.

By all means, just put the card in and see where it gets you on 13b.
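
For reference, a minimal sketch of what "put the card in and see" could look like with llama-cpp-python (assuming a CUDA-enabled build; the filename and the split ratio are just guesses for an 8GB + 5GB pair):

    from llama_cpp import Llama  # assuming llama-cpp-python built with CUDA

    llm = Llama(
        model_path="mythomax-l2-13b.Q5_K_M.gguf",  # hypothetical local file
        n_gpu_layers=-1,             # offload as many layers as will fit
        tensor_split=[0.62, 0.38],   # roughly proportional to 8GB vs 5GB
        n_ctx=4096,
    )
    out = llm("Write one sentence about GPUs.", max_tokens=32)
    print(out["choices"][0]["text"])

As far as I know, text-generation-webui's llama.cpp loader exposes equivalent GPU layers and tensor split settings, so the same idea applies there.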