this post was submitted on 28 Nov 2023

LocalLLaMA

Community to discuss about Llama, the family of large language models created by Meta AI.

Sorry for the noob question. I’m building out a new server and, as I love playing with new tech, I thought I would throw in a GPU so I can try to learn to integrate AI with things like PrivateGPT, document generation, meeting transcription, maybe some integrations with Obsidian, or even Home Assistant for automation. I like the idea of it being able to crawl all my information and offer suggestions, rather than me having to copy and paste snippets as I do now with ChatGPT. I’m a solo IT consultant by trade, so I’m really hoping it will help me augment my work.

Budget isn’t super important; it’s more that it’s fit for purpose. But to stop people suggesting a £30,000 GPU, I’ll cap it at ~£1000!

Thanks!

top 10 comments
[–] flossraptor@alien.top 1 points 11 months ago (1 children)

Even if you get a 3090 with 24GB of VRAM, you're going to load the biggest model you can and realize it is useless for most tasks. With less than that, I don't even know what you would use it for.

[–] idarryl@alien.top 1 points 11 months ago

Ok, you make a really good point. I’m starting out, you know a thing or two. How would you start out, knowing what you know now?

[–] a_beautiful_rhind@alien.top 1 points 11 months ago (1 children)

P40, 3090: those are your "affordable" 24GB GPUs, unless you want to go AMD or have enough to run 3x16GB or something.

[–] idarryl@alien.top 1 points 11 months ago (1 children)

I have a soft spot for AMD and I’m considering the Ryzen 9 7900 or the (lesser known) Ryzen 9 PRO 7945 (it has AVX-512 support, which I understand is beneficial in the ML world) for the CPU, but I’ve heard AMD GPU support is lacking in some workloads and I really don’t need that learning curve.

I can only afford and justify the space for one GPU, so 3 x something is out of the question.

I looked at the P100, but read the 4060 was the better choice. I’m at a little bit of a loss. A 3U chassis looks like the way to go, and I have the option to water cool either the CPU or GPU.

[–] a_beautiful_rhind@alien.top 1 points 11 months ago

I just got a P100 for like $150. Going to test it out and see how its FP16 does vs the P40 for SD and exllama overflow.

The 4060 is faster but it's several times as expensive. For your sole GPU you really need 24GB+. AMD cards are becoming somewhat competitive but still come with some hassle and slowness.

CPU is going to give you 3 t/s; it's not really anywhere near a GPU, even with the best procs. Sure, get it for other things in the system, but don't expect it to help much with ML. I guess a newer one will get you faster RAM, but it's not enough.
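
For a rough feel of why faster RAM alone doesn't close the gap, here is a back-of-envelope sketch. It assumes token generation is limited by memory bandwidth, so the ceiling is roughly bandwidth divided by model size. The DDR5-5600 dual-channel setup, the 936 GB/s spec-sheet figure for a 3090, and the 20 GB model size are illustrative assumptions, not numbers from this thread, and real throughput will land below these ceilings.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float, bus_bytes: int, channels: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (10^9 bytes per second)."""
    return mega_transfers_per_s * 1e6 * bus_bytes * channels / 1e9


def max_tokens_per_s(bandwidth_gbs: float, model_gb: float) -> float:
    """Upper bound on tokens/s if every token requires reading all weights once."""
    return bandwidth_gbs / model_gb


# Assumed example hardware (hypothetical, for illustration only).
cpu_bw = peak_bandwidth_gbs(5600, 8, 2)  # dual-channel DDR5-5600, 8 bytes per channel
gpu_bw = 936.0                           # assumed RTX 3090 spec-sheet bandwidth, GB/s
model_gb = 20.0                          # assumed size of a quantized model in memory

print(f"CPU ceiling: {max_tokens_per_s(cpu_bw, model_gb):.1f} t/s")
print(f"GPU ceiling: {max_tokens_per_s(gpu_bw, model_gb):.1f} t/s")
```

Under these assumptions the CPU ceiling comes out to single digits of tokens per second, while the GPU ceiling is roughly ten times higher.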

[–] theyreplayingyou@alien.top 1 points 11 months ago (1 children)

That depends. When you say you are building out a new server, are we talking a proper 1U or 2U Dell, HPE, etc. type server? If so, you'll have to contend with the GPU footprint. For example, my 1U servers can only take up to 2 half-height, half-length GPUs, and they can only be powered through the PCIe slot, so I'm limited to 75W.

In my 2U servers I can get the "GPU enablement kit", which is essentially smaller form factor heatsinks for the CPUs and some long 8-pin power connectors to go from the mobo to the PCIe riser. That allows many more options, but there are still problems to address with heat, power draw (CPUs are limited to 130W TDP, I believe), the server firmware complaining about the GPU or forcing the system fans to run at an obnoxious level, etc.

If you are homebrewing a 3U or a tower, or using consumer parts, then things change quite a bit.

[–] idarryl@alien.top 1 points 11 months ago

Home brewing is exactly the way I’m going. A 3U is on the table, and I’m looking at the Sliger chassis at the moment, but cooling a space is a big consideration.

To be honest, I had only considered 1 GPU. I live in the UK and electricity prices are high (and houses are small), so multiple GPUs just don’t seem appropriate.

[–] Prudent-Artichoke-19@alien.top 1 points 11 months ago (1 children)

You can cluster 3 16GB Arc A770 GPUs. That's 48GB and modern.

[–] idarryl@alien.top 1 points 11 months ago (1 children)

3 x anything is not an option. I’m 1 and done. 👍🏻

[–] Prudent-Artichoke-19@alien.top 1 points 11 months ago

It just sucks because the sweet spot is 48GB, but a single card with that much is 3k USD at least.

At 1k you'll be stuck at 24GB for a single card.
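
To put rough numbers on the 24GB vs 48GB question, here is a minimal sketch that estimates the memory taken by model weights alone as parameters times bits per weight. The `overhead_gb` allowance for context cache and runtime buffers is a hypothetical placeholder, and real quantized formats often average a bit more than their nominal bits per weight, so treat the results as a lower bound rather than a guarantee.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit in vram_gb.

    overhead_gb is a hypothetical allowance, not a measured value.
    """
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


# Example model sizes in billions of parameters, at 4-bit quantization.
for size in (13, 34, 70):
    need = weight_memory_gb(size, 4)
    print(f"{size}B @ 4-bit: ~{need:.1f} GB weights, "
          f"fits 24GB: {fits(size, 4, 24)}, fits 48GB: {fits(size, 4, 48)}")
```

By this estimate a 70B model at 4 bits needs about 35 GB for weights, which is why it lands in 48GB territory and not on a single 24GB card.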