this post was submitted on 15 Nov 2023

LocalLLaMA


So in the last few weeks I have been experimenting with LLMs on my personal laptop (as I'm rarely at home), but I'm gonna have my PC with me in a few days. When running models (MythoMax 13b, mostly Q6_K and Q5_K_M GGUF) I can definitely feel my laptop not liking it: slowdowns, crashes, service terminations and timeouts.

Now, the situation is this: I have unexpectedly gotten some money which I want to invest in PC parts.
My PC currently has 16GB of DDR5 RAM and a GTX 1070 with 8GB VRAM.
The idea now is to buy a 96GB RAM kit (2x48GB) and Frankenstein the whole PC together with an additional Nvidia Quadro P2200 (5GB VRAM).

Would the whole "machine" suffice to run models like MythoMax 13b, Deepseek Coder 33b and CodeLlama 34b (all GGUF)?

Specs after: 112GB DDR5, 8GB VRAM and 5GB VRAM, CPU is a Ryzen 5 7500F

And the question I should have asked first: can the GTX 1070 and P2200 setup even work? Like, would text gen webui even detect both cards?

Sorry if that's a dumb question.

[–] ccbadd@alien.top 0 points 11 months ago (1 children)

I would replace the DDR5 RAM rather than add to it, or your memory will run a lot slower, and you just don't need it if you're going to use GPUs for inferencing. Also, a P40 is probably money better spent with this config than the P2200.
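
To put rough numbers on the "memory will run a lot slower" point, here is a minimal back-of-the-envelope sketch. It assumes CPU inference is limited by memory bandwidth, so tokens per second can't exceed bandwidth divided by model size. The memory speeds (5600 MT/s for two DIMMs, 4000 MT/s for four) and the 9 GB model size are made-up illustration values, not measurements from this build.

```python
def dual_channel_bandwidth_gbs(mega_transfers_per_s: float,
                               channels: int = 2,
                               bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DDR bandwidth in GB/s (64-bit channel = 8 bytes per transfer)."""
    return mega_transfers_per_s * 1e6 * channels * bytes_per_transfer / 1e9


def tokens_per_s_upper_bound(bandwidth_gbs: float, model_size_gb: float) -> float:
    """Crude ceiling: each generated token reads all the weights from RAM once."""
    return bandwidth_gbs / model_size_gb


if __name__ == "__main__":
    model_size_gb = 9.0  # assumed size of a 13b quantized GGUF file
    # Assumed speeds: 2 DIMMs at rated speed vs. 4 mixed DIMMs clocked down.
    for label, speed in [("2 DIMMs @ 5600 MT/s", 5600), ("4 DIMMs @ 4000 MT/s", 4000)]:
        bw = dual_channel_bandwidth_gbs(speed)
        print(f"{label}: {bw:.1f} GB/s peak, "
              f"<= {tokens_per_s_upper_bound(bw, model_size_gb):.1f} tokens/s")
```

Real throughput will land below these ceilings, but the ratio between the two configurations is the part that matters here.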

[–] Wortkraecker@alien.top 0 points 11 months ago (1 children)

Thing is, I have the P2200 sitting on my shelf rn from my dad's old workstation, so I wouldn't have to buy it.

[–] a_beautiful_rhind@alien.top 1 points 11 months ago

13GB does not make for much, especially when part of it is used for graphics and it's all old Pascal architecture.

By all means, just put the card in and see where it gets you on 13b.
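
For a rough feel of what 13GB of combined VRAM buys, here is a small sketch that estimates the size of the quantized weights and checks it against the available VRAM. The bits-per-weight figures are approximate ballpark values for these GGUF quant types, and the 1.5 GB overhead for context and buffers is a placeholder assumption. It also ignores whatever the desktop itself uses on the GTX 1070 and any inefficiency from splitting layers across two cards.

```python
# Approximate bits per weight for some GGUF quant types (assumed ballpark values).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q5_K_M": 5.7, "Q6_K": 6.6}


def weights_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the quantized weights in GB."""
    return params_billion * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9


def fits_in_vram(params_billion: float, quant: str, vram_gb: float,
                 overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed overhead (KV cache, buffers) fit in VRAM."""
    return weights_gb(params_billion, quant) + overhead_gb <= vram_gb


if __name__ == "__main__":
    total_vram_gb = 8 + 5  # GTX 1070 + Quadro P2200
    for params, quant in [(13, "Q5_K_M"), (13, "Q6_K"), (33, "Q4_K_M"), (34, "Q4_K_M")]:
        size = weights_gb(params, quant)
        verdict = "fits" if fits_in_vram(params, quant, total_vram_gb) else "needs CPU offload"
        print(f"{params}b {quant}: ~{size:.1f} GB weights -> {verdict}")
```

Under these assumptions a 13b model fits on the two cards (Q6_K only barely), while the 33b and 34b models would have to keep a good share of their layers in system RAM.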