LocalLLaMA

Community for discussing Llama, the family of large language models created by Meta AI.


Is it possible to fine tune a 33B model with 48GB vRAM? (alien.top)

submitted 2 years ago by tgredditfc@alien.top to c/localllama@poweruser.forum


Currently I have 12+24GB of VRAM and I get Out Of Memory errors all the time when I try to fine-tune 33B models. 13B is fine, but the outcome is not very good, so I would like to try 33B. I wonder if it's worth replacing my 12GB GPU with a 24GB one. Thanks!
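For a rough sense of scale, here is a back-of-the-envelope sketch of the memory involved. It assumes exactly 33e9 parameters (the real count for a "33B" model may differ slightly), mixed-precision Adam at the commonly quoted 16 bytes per parameter, and it ignores activations, KV buffers, and framework overhead, so real usage would be higher. The function names are made up for illustration.

```python
GIB = 1024 ** 3

def weights_gib(n_params: float, bytes_per_param: float) -> float:
    """Memory for the model weights alone, in GiB."""
    return n_params * bytes_per_param / GIB

def full_finetune_gib(n_params: float) -> float:
    """Weights + gradients + Adam state for mixed-precision training.

    Assumed breakdown per parameter: 2 bytes fp16 weights, 2 bytes fp16
    gradients, 12 bytes fp32 master weights + momentum + variance.
    """
    return n_params * 16 / GIB

N_PARAMS = 33e9  # assumption: nominal size of a "33B" model

estimates = {
    "fp16 weights only": weights_gib(N_PARAMS, 2),
    "8-bit weights only": weights_gib(N_PARAMS, 1),
    "4-bit weights only": weights_gib(N_PARAMS, 0.5),
    "full fine-tune (Adam, mixed precision)": full_finetune_gib(N_PARAMS),
}

for label, gib in estimates.items():
    print(f"{label}: {gib:.1f} GiB")
```

Under these assumptions the fp16 weights alone exceed 48GB, so a full fine-tune would not fit in GPU memory on either setup without sharding, offloading, or quantizing the base model.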


Updittyupup@alien.top, 2 years ago

I think you may need to shard the optimizer state and gradients. I've been using DeepSpeed and have had some good success. Here is a writeup that compares the different DeepSpeed stages: [RWKV-infctx] DeepSpeed 2 / 3 comparisons | RWKV-InfCtx-Validation – Weights & Biases (wandb.ai). Look at the bottom of the article for an accessible overview. I'm not the author, and I haven't validated the findings. I think distributed training tools are becoming more and more necessary. I suppose the other option is quantization, but that may risk quality loss. Here is a discussion on that: https://www.reddit.com/r/LocalLLaMA/comments/153lfc2/quantization_how_much_quality_is_lost/
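To make the sharding suggestion concrete, here is a minimal sketch of a DeepSpeed ZeRO config built in Python. Stage 2 shards optimizer state and gradients across GPUs; stage 3 also shards the parameters. The helper name `build_zero_config`, the batch settings, and the choice to offload to CPU are assumptions for illustration, not tuned values, and whether this fits a 33B model in 12+24GB would still need to be tested.

```python
import json

def build_zero_config(stage: int = 2, offload_to_cpu: bool = True) -> dict:
    """Build a DeepSpeed config dict using ZeRO sharding (hypothetical helper)."""
    if stage not in (2, 3):
        raise ValueError("this sketch only covers ZeRO stage 2 or 3")

    zero = {
        "stage": stage,
        "overlap_comm": True,
        "contiguous_gradients": True,
    }
    if offload_to_cpu:
        # Keep optimizer state in CPU RAM instead of VRAM.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
        if stage == 3:
            # Parameter offload is only available with stage 3.
            zero["offload_param"] = {"device": "cpu", "pin_memory": True}

    return {
        "train_micro_batch_size_per_gpu": 1,   # assumption: smallest batch
        "gradient_accumulation_steps": 16,     # assumption: illustrative value
        "gradient_clipping": 1.0,
        "fp16": {"enabled": True},
        "zero_optimization": zero,
    }

config = build_zero_config(stage=2, offload_to_cpu=True)
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON could be saved to a file and handed to whatever training script you use, provided it accepts a DeepSpeed config. CPU offload trades VRAM for system RAM and speed, so expect slower steps.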

tgredditfc@alien.top, 2 years ago

Thank you! It looks very deep to me, I will look into it.