this post was submitted on 01 Nov 2023
1 points (100.0% liked)
Machine Learning
1 readers
1 users here now
Community Rules:
- Be nice. No offensive behavior, insults or attacks: we encourage a diverse community in which members feel safe and have a voice.
- Make your post clear and comprehensive: posts that lack insight or effort will be removed. (ex: questions which are easily googled)
- Beginner or career related questions go elsewhere. This community is focused in discussion of research and new projects that advance the state-of-the-art.
- Limit self-promotion. Comments and posts should be first and foremost about topics of interest to ML observers and practitioners. Limited self-promotion is tolerated, but the sub is not here as merely a source for free advertisement. Such posts will be removed at the discretion of the mods.
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I'm a ML scientist with a modest 18 h-index and I have used both Mac and Nvidia GPUs. Currently I have a M1 Max and a workstation with 6000 Ada Lovelace (the professional version of 4090) + Ryzen 9 5950X. so the first year was really tough for apple silicon but now pytorch and tensorflow both work very well on apple silicon. My lab also has A100 servers. Most of my time nowadays I'm using pytorch with mps (metal performance shader) on Mac and have almost no issues with it. From my own testing with spatial-temporal series and vision, M1 Max was about half as fast as 6000 Ada (which is insane because M1 Max GPU consumes 35 Watt). We also have M1 Pro with 16GB RAM, which could be interesting for comparison for the model you are looking at. They run the same code about half as slow as M1 Max, it seems it scales with the GPU cores or GPU render capability. I'm also looking to upgrade to M3 Max and it seems the GPU metal performance sometimes double M1 Max or matches M2 Ultra, would be very interesting to see if it could match Ada in ML.
Also, we used apples' powermetrics tool to check the power usage, GPU is used for model training, the apple neural engine is at 0 watt during training. I think the ANE is only used for inference for production level apps like Adobe's or Da'vinci's AI functions.
RAM: For the same GPU accessible RAM, Mac is more cost-efficient than professional Nvidia cards, and Mac goes way higher than what Nvidia cards can touch. My 2 year old M1 Max has 64GB and at the time of purchase there was no PCIe cards come even close to it, the closest is A6000 Ampere with 48GB VRAM and the card alone costs more than the Macbook. Keep in mind the os and dataloader that runs on CPU might take 5-8 GB, if you don't have other RAM consuming program running.
Pytorch installation on Macbook is simply one line of code and you don't have to deal with cudnn / cuda versions, Apple took care of everything about mps. The bliss of it just works.
Sometimes I like to run the same code both on cuda and mps, I can recount some minor problems I've recently encountered compared to cuda:
My field is not LLM but for the basics like Bert according to this post checking how much memory it takes on cuda, a 36GB Mac shouldn't be a problem https://huggingface.co/docs/transformers/v4.18.0/en/performance
But I cannot say about more modern LLMs, like meta Llama the small version seems to take 13GB for the model alone.
I'm not trying to sell a mac. I'd say with the performance I hinted at, you can consider a PC system at your budget, and see which GPU it will get you, then look for performance scalers from that GPU to 6000 Ada. I'd consider the M3 Pro 16-core GPU's render performance relative to M1 Max - it seems we don't have a full picture yet but I've seen numbers in the range of 70% to 80%? Depending on the benchmark tasks. So if that PC system's GPU is more than 35% of 6000 Ada, it could outperform the M3 Pro you are looking at considering only the training time.
But a Macbook in other regards is like years ahead of other PCs (I am a PC gamer too and this is not a biased opinion). During model training with full GPU utilization on the 16 inch the computer is quieter than normal windows laptops on standby - I hear my own breath before I can hear the fan from an arm's distance. The display is gorgeous to look at. The battery lasts forever. It will not auto-update during your unattended overnight training. But again, you don't have many games on Mac (yet).
Hope this helps.
Thanks for sharing your perspectives! One thing which makes me listen to the "fearmongering" of ARM, is this specific issue which has been open for long: https://github.com/DLR-RM/stable-baselines3/issues/914
That is only an example, but that is the library (in addition to LLMs) I use now :)