[–] freaklemur@alien.top 1 points 10 months ago (2 children)

The compute limit is 10^26 operations. For reference, NVIDIA trained GPT-3 on ~3,500 H100s in just under 11 minutes which, assuming peak FP8 throughput (the highest op rate), comes out to ~10^22 total operations. With that same setup, they'd have to train continuously for over 81 days to reach the 10^26 limit, so it's unlikely to impact anyone except those training incredibly large models.
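Rough check of those numbers (the exact GPU count, run time, and per-GPU FP8 peak are my assumptions from the MLPerf submission and the H100 spec sheet, not from the post; real utilization is below peak, so these are upper bounds):

```python
# Back-of-envelope version of the estimate above. Assumed figures:
# 3,584 H100s and ~10.94 min from NVIDIA's MLPerf GPT-3 submission,
# and 3,958 TFLOPS as the H100 spec-sheet FP8 peak (with sparsity).
GPUS = 3584
PEAK_FP8_FLOPS = 3.958e15        # assumed per-GPU peak FP8 ops/sec
RUN_SECONDS = 10.94 * 60         # just-under-11-minute GPT-3 run
LIMIT = 1e26                     # the 10^26-operation threshold

cluster_flops = GPUS * PEAK_FP8_FLOPS          # whole-cluster ops/sec
run_ops = cluster_flops * RUN_SECONDS          # ops in the MLPerf run
days_to_limit = LIMIT / cluster_flops / 86400  # continuous training days

print(f"MLPerf run: ~{run_ops:.1e} ops")            # ~9.3e21, i.e. ~10^22
print(f"Days to reach 10^26: {days_to_limit:.0f}")  # ~82 days
```

So even at theoretical peak, an 11-minute GPT-3-scale run sits four orders of magnitude under the limit.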

Edit: MLPerf link