this post was submitted on 28 Nov 2023
LocalLLaMA
Community to discuss about Llama, the family of large language models created by Meta AI.
It took 3,311,616 GPU-hours of training for the Llama 2 model family (the 70B base model alone accounted for 1,720,320 of those). At $1/hour for an A100 you'd spend just over $3.3M, and on a single GPU it would take approximately 378 years to train.
Scale that across 10,000 GPUs and you’re looking at 2 weeks and a couple of million dollars.
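The arithmetic above can be sketched in a few lines. The GPU-hour total is from Meta's Llama 2 paper; the $1/GPU-hour rate is the hypothetical price assumed in this thread, not a real quote:

```python
# Back-of-envelope cost/time estimate for Llama 2 pretraining.
GPU_HOURS = 3_311_616       # total A100 GPU-hours for the Llama 2 family
RATE_PER_HOUR = 1.00        # assumed $/GPU-hour (hypothetical)
HOURS_PER_YEAR = 24 * 365

cost = GPU_HOURS * RATE_PER_HOUR                 # ~$3.3M
years_on_one_gpu = GPU_HOURS / HOURS_PER_YEAR    # ~378 years
days_on_10k_gpus = GPU_HOURS / 10_000 / 24       # ~14 days

print(f"cost: ${cost:,.0f}")
print(f"one GPU: {years_on_one_gpu:.0f} years")
print(f"10,000 GPUs: {days_on_10k_gpus:.1f} days")
```

Dividing the same fixed GPU-hour budget across 10,000 cards is what collapses centuries of wall-clock time into roughly two weeks.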
Fine-tuning is much, much faster and cheaper.
$1/hour for an A100? Where? I can barely get one on GCE, and it's almost $4/hr.
Yes, but you don't have Meta's purchasing power to rent 10,000 GPUs for a month. Economies of scale, my friend!