this post was submitted on 18 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
you are viewing a single comment's thread
8-bit? 4-bit QLoRA? You can train 34B models on 24GB. You might need to set up DeepSpeed if you want to use both cards, or you can just train on a single 24GB card.

PSA if you are using axolotl: disabling sample packing is required to enable flash attention 2; otherwise flash attention will simply not be enabled. This can spare you some memory. I can train a Yi-34B QLoRA with rank 16, ctx 1100 (and maybe a bit more) on a 24GB Ampere card.
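For reference, a minimal axolotl config sketch along those lines. The lora_r, sequence_len, sample_packing, and flash_attention values reflect the setup above; the base model id, dataset entry, LoRA alpha/dropout, and batch settings are placeholder assumptions, not from the original comment.

```yaml
# Sketch of an axolotl QLoRA config for a single 24GB card (illustrative values).
base_model: 01-ai/Yi-34B          # assumed Hugging Face repo id for Yi-34B
load_in_4bit: true                # 4-bit quantization (QLoRA)
adapter: qlora
lora_r: 16                        # rank 16, as described above
lora_alpha: 32                    # assumed (common 2x-rank default)
lora_dropout: 0.05                # assumed
lora_target_linear: true          # assumed: apply LoRA to all linear layers

sequence_len: 1100                # ctx 1100, as described above
sample_packing: false             # must be off here, or flash attention 2 won't be enabled
flash_attention: true

micro_batch_size: 1               # assumed, to fit in 24GB
gradient_accumulation_steps: 8    # assumed

datasets:                         # placeholder dataset entry
  - path: my_dataset.jsonl
    type: alpaca
```

Launched the usual way, e.g. accelerate launch -m axolotl.cli.train config.yml.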