this post was submitted on 10 Nov 2023

Machine Learning

I have access to a single 80 GB A100 GPU and would like to train an LLM with a GPT-like architecture from scratch. Does anyone know how to calculate the maximum model size?

[–] Ok-Equipment9840@alien.top 1 points 10 months ago

Depends on how many tokens you have?
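
To make the two constraints concrete, here is a rough back-of-the-envelope sketch, not a definitive sizing method. It assumes mixed-precision training with Adam, which needs roughly 16 bytes per parameter (fp16 weights and gradients plus fp32 master weights and two optimizer moments), and it assumes a rule of thumb of about 20 training tokens per parameter. The function names, the 16 GB activation reserve, and the 10B-token dataset are all hypothetical placeholders; real activation memory depends heavily on batch size, sequence length, and checkpointing.

```python
GB = 10**9

def max_params_by_memory(gpu_bytes, reserved_bytes, bytes_per_param=16):
    """Largest parameter count whose weights, gradients and Adam state fit.

    bytes_per_param=16 assumes mixed-precision Adam:
    2 (fp16 weights) + 2 (fp16 grads) + 4 (fp32 master weights)
    + 4 (Adam first moment) + 4 (Adam second moment).
    reserved_bytes is an assumed budget for activations and overhead.
    """
    return (gpu_bytes - reserved_bytes) // bytes_per_param

def max_params_by_tokens(num_tokens, tokens_per_param=20):
    """Largest parameter count the dataset can reasonably support.

    tokens_per_param=20 is an assumed rule of thumb, not a hard limit.
    """
    return num_tokens // tokens_per_param

# Hypothetical numbers: 80 GB card, 16 GB held back for activations,
# and a 10-billion-token dataset.
memory_limit = max_params_by_memory(80 * GB, 16 * GB)
token_limit = max_params_by_tokens(10 * 10**9)

# The tighter of the two limits decides the model size.
print(f"memory-bound limit: {memory_limit / 1e9:.1f}B params")
print(f"token-bound limit:  {token_limit / 1e9:.1f}B params")
print(f"practical maximum:  {min(memory_limit, token_limit) / 1e9:.1f}B params")
```

Under these assumptions the dataset, not the GPU, ends up being the limiting factor, which is why the token count matters for the answer.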