this post was submitted on 02 Apr 2025

Selfhosted

Not sure if this is the right place, if not please let me know.

GPU prices in the US have been a horrific bloodbath recently thanks to the scalpers. So for this discussion, let's stick to MSRP and the lucky people who could both afford those insane MSRPs and actually find the GPU they wanted.

Which GPU are you using, and which LLMs do you run on it? How is the performance of the LLMs you have selected? On average, what size of LLM are you able to run smoothly on your GPU (7B, 14B, 20-24B, etc.)?

What GPU do you recommend for a decent amount of VRAM vs. price (MSRP)? If you're using a top-of-the-line RX 7900 XTX/4090/5090 with 24+ GB of VRAM, comment below with some performance estimates too.

My use case: code assistants for Terraform + general shell and YAML, plain chat, some image generation. And to still be able to pay rent after spending all my savings on a GPU with a pathetic amount of VRAM (LOOKING AT BOTH OF YOU, BUT ESPECIALLY YOU NVIDIA YOU JERK). I would prefer a GPU under $600 if possible, but I also want to run models like Mistral Small, so I suppose I have no choice but to spend a huge sum of money.

Thanks


You can probably tell that I'm not very happy with the current PC consumer market but I decided to post in case we find any gems in the wild.

[–] Natanox@discuss.tchncs.de 4 points 1 day ago (1 children)

I'm currently looking into this as well. As far as my research has gone, I'll probably go for 2x AMD Instinct MI50. Each of them has equivalent to slightly higher performance than a P40, but usually only 16 GB of VRAM (if you're super lucky you might get one with 32 GB; those are usually not labeled as such, though, and are probably binned MI60s). With two of them you get 32 GB of VRAM and quite good performance for, right now, 200€ per card. Alternatively, you should be able to run quantized models on a single card as well.
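A rough way to sanity-check which model sizes fit in 16 GB versus 32 GB is to multiply the parameter count by the bytes per weight and add some headroom. The sketch below assumes a flat 20% overhead for KV cache and runtime buffers, which is a rule-of-thumb assumption and not a measured figure, and it treats two cards as one pooled 32 GB budget, which only holds if the runtime splits the model across both GPUs. The function names are made up for illustration.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    overhead=1.2 is an assumed 20% margin for KV cache and buffers,
    not a measured value; real usage depends on context length and runtime.
    """
    weights_gb = params_billion * bits_per_weight / 8  # billions of bytes
    return weights_gb * overhead


def fits(params_billion: float, bits_per_weight: float, vram_gb: float) -> bool:
    """True if the estimated footprint is within the given VRAM budget."""
    return estimate_vram_gb(params_billion, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    # Model sizes taken from the thread (7B, 14B, 24B); budgets are one or two 16 GB cards.
    for size in (7, 14, 24):
        for bits in (4, 8, 16):
            need = estimate_vram_gb(size, bits)
            print(f"{size}B @ {bits}-bit: ~{need:.1f} GB | "
                  f"16 GB: {fits(size, bits, 16)} | 32 GB: {fits(size, bits, 32)}")
```

Under these assumptions a 24B model at 4-bit lands within a single 16 GB card's budget, while the same model at 8-bit needs the pooled 32 GB of two cards.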

If you don't mind running ROCm instead of CUDA, this seems like good bang for the buck. Alternatively, you might look into AMD's new line of "AI" SoCs (for example, Framework's Desktop computer). They seem to be really good as well and, depending on your use case, might be more useful than an equally priced 4090.

[–] marauding_gibberish142@lemmy.dbzer0.com 1 points 1 day ago (1 children)

Do you have 2 PCIe x16 slots on your motherboard (speaking in terms of electrical connections)?

[–] Natanox@discuss.tchncs.de 1 points 1 day ago (1 children)

They would run at x8 speed each. That should not be too much of a bottleneck, though; I don't expect performance to suffer by noticeably more than 5% from this. Annoying, but getting a CPU + board with 32 lanes or more would throw off the price/performance ratio.
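For a sense of what x8 versus x16 means in raw link bandwidth, here is a small sketch using the published per-lane transfer rates for PCIe 3.0 and 4.0 (8 and 16 GT/s, both with 128b/130b encoding). Which PCIe generation the board actually runs at is not stated here, so both are shown as assumptions. This only gives the theoretical one-direction link rate; it says nothing about how much inference speed depends on it, so it does not confirm or refute the 5% guess.

```python
# Per-lane raw transfer rate in GT/s for the generations considered (assumed, see lead-in).
GT_PER_S = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding used by PCIe 3.0 and 4.0


def pcie_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction link bandwidth in GB/s (1 GB = 1e9 bytes)."""
    per_lane_gb_s = GT_PER_S[generation] * ENCODING_EFFICIENCY / 8  # bits -> bytes
    return per_lane_gb_s * lanes


if __name__ == "__main__":
    for gen in (3, 4):
        for lanes in (8, 16):
            print(f"PCIe {gen}.0 x{lanes}: ~{pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

Halving the lanes halves the link rate, but once a model's weights are loaded into VRAM the link mostly carries the data exchanged between cards and host, so the real-world hit is workload dependent.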

I have an alternative for you if your power bills are cheap: X99 motherboard + CPU combos from China