this post was submitted on 25 Feb 2025
501 points (98.6% liked)

Technology

[–] brucethemoose@lemmy.world 44 points 19 hours ago* (last edited 19 hours ago) (2 children)

The 5090 is basically useless for AI dev/testing because it only has 32GB. Might as well get an array of 3090s.

The AI Max is slower and finicky, but it will run things you'd normally need a car-priced A100 to run.

But that aside, there are tons of workstation apps gated by nothing but VRAM capacity that this will blow open.

[–] KingRandomGuy@lemmy.world 20 points 13 hours ago (2 children)

Useless is a strong term. I do a fair amount of research on a single 4090. Lots of problems can fit in <32 GB of VRAM. Even my 3060 is good enough to run small scale tests locally.

I'm in CV, and even with enterprise grade hardware, most folks I know are limited to 48GB (A40 and L40S, substantially cheaper and more accessible than A100/H100/H200). My advisor would always say that you should really try to set up a problem where you can iterate in a few days' worth of time on a single GPU, and lots of problems are still approachable that way. Of course you're not going to make the next SOTA VLM on a 5090, but not every problem is that big.

[–] brucethemoose@lemmy.world 2 points 4 hours ago* (last edited 4 hours ago) (1 children)

Fair. True.

If your workload/test fits in 24GB, that's already a "solved" problem. If it fits in 48GB, it's possibly solved with your institution's workstation or whatever.

But if it takes 80GB, as many projects seem to require these days since the A100 is such a common baseline, you are likely using very expensive cloud GPU time. I really love the idea of being able to tinker with a "full" 80GB+ workload (even having to deal with ROCm) without having to pay per hour.
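For a rough feel of which of those tiers (24, 48, or 80 GB) a model lands in, here is a minimal sketch that only counts the weights. The function names and the 1.2 overhead factor are assumptions made up for illustration. Activations, KV cache, and optimizer state are ignored, so real usage will be higher, often much higher when training.

```python
# Rough sketch: weights-only memory estimate. Ignores activations, KV cache,
# and optimizer state, so treat the result as a lower bound.

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Capacity tiers mentioned in the thread, in GB.
TIERS_GB = [24, 48, 80]


def weights_gb(params_billions: float, dtype: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * BYTES_PER_PARAM[dtype] / 1e9


def smallest_tier(params_billions: float, dtype: str, overhead: float = 1.2):
    """Smallest tier that holds the weights times an assumed overhead factor.

    The 1.2 default is a hypothetical placeholder, not a measured value.
    Returns None if nothing in TIERS_GB is big enough.
    """
    need = weights_gb(params_billions, dtype) * overhead
    for tier in TIERS_GB:
        if need <= tier:
            return tier
    return None


if __name__ == "__main__":
    for size, dtype in [(7, "fp16"), (30, "fp16"), (70, "int4"), (70, "fp16")]:
        print(f"{size}B {dtype}: {weights_gb(size, dtype):.0f} GB weights, "
              f"tier: {smallest_tier(size, dtype)}")
```

With these assumptions, a 70B model at fp16 does not fit in any single tier from the thread, which is the kind of workload the 80GB+ comment is about.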

[–] wise_pancake@lemmy.ca 2 points 1 hour ago (1 children)

This is my use case exactly.

I do a lot of analysis locally, and this is more than enough for my experiments and research. 64 to 96 GB of VRAM is exactly the window I need. There are analyses I've had to let run for 2 or 3 days, and dealing with that on the cloud is annoying.

Plus this will replace GH Copilot for me. It'll run voice models. I have diffusion model experiments I plan to run that are totally inaccessible to me locally (not just image models). I've got workloads that take 2 or 3 days at 100% CPU/GPU that are annoying to run in the cloud.

This basically frees me from paying for any cloud stuff in my personal life for the foreseeable future. I'm trying to localize as much as I can.

I've got tons of ideas I'm free to try out risk free on this machine, and it's the most affordable "entry level" solution I've seen.

[–] brucethemoose@lemmy.world 2 points 58 minutes ago* (last edited 54 minutes ago)

And even better, "testing" it. Maybe I'm sloppy, but I have failed runs, errors, hacks, hours of "tinkering," optimizing, or just trying to get something to launch that feels like an utter waste of an A100 mostly sitting idle... Hence I often don't do it at all.

One thing you should keep in mind is that the compute power of this thing is not like an A/H100, especially if you get a big slowdown with ROCm, so what could take you 2-3 days could take over a week. It'd be nice if Framework sold a cheap MI300A, but... shrug.
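To put numbers on the "2-3 days could take over a week" point, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and it assumes runtime scales linearly with a single slowdown factor, which real workloads may not do.

```python
# Back-of-the-envelope sketch: assumes runtime scales linearly with one
# slowdown factor relative to the faster GPU.

def scaled_days(baseline_days: float, slowdown: float) -> float:
    """Runtime on the slower machine, given the baseline runtime in days."""
    return baseline_days * slowdown


def slowdown_to_reach(baseline_days: float, target_days: float) -> float:
    """Slowdown factor at which a baseline run stretches to target_days."""
    return target_days / baseline_days


if __name__ == "__main__":
    # How slow does the machine have to be for a 2 or 3 day run to hit a week?
    for days in (2, 3):
        print(f"{days}-day run reaches 7 days at "
              f"{slowdown_to_reach(days, 7):.2f}x slowdown")
```

So under that assumption a 3-day job crosses the one-week mark at a bit over 2.3x slower, and a 2-day job at 3.5x slower.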

[–] KeenFlame@feddit.nu 1 points 7 hours ago

Exactly, 32 GB is plenty to develop on, and why would you need to upgrade RAM? It's been years since I did that in any computer, let alone a tensor workstation. I feel like they made pretty good choices for what it's for.

[–] felixwhynot@lemmy.world 2 points 12 hours ago (2 children)

… but only OpenCL workloads, right?

[–] brucethemoose@lemmy.world 1 points 2 hours ago

Not exactly. OpenCL as a compute framework is kinda dead.

[–] amon@lemmy.world 2 points 10 hours ago

No, it runs off integrated graphics, which is a good thing because you can have a large capacity of RAM dedicated to GPU loads.