this post was submitted on 27 Oct 2023

Machine Learning

What are the benefits of using an H100 over an A100 (both at 80 GB and both using FP16) for LLM inference?

Looking at the datasheets for both GPUs, the H100 has twice the peak FLOPS, but the two have almost the same memory bandwidth (about 2,000 GB/s). Since memory bandwidth, not compute, dominates inference, I wonder what benefit the H100 offers. One benefit could, of course, be FP8 support (which is extremely useful), but in this question I'm interested in the differences in the hardware specs.
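
To make the memory-bound argument concrete, here is a rough roofline-style sketch. It is a back-of-the-envelope estimate, not a benchmark. Everything in it is an assumption for illustration: the 7B-parameter model, the 312 TFLOPS figure for the A100, the H100 figure taken as exactly twice that (following the "twice the peak FLOPS" claim above), the shared 2,000 GB/s bandwidth, and the simplification that each decode step reads all weights once and costs about 2 FLOPs per parameter per sequence. KV-cache traffic and activations are ignored.

```python
# Rough roofline estimate of per-step decode latency.
# All hardware numbers below are illustrative assumptions, not measured values.

def decode_step_time_s(param_count, bytes_per_param, bandwidth_bytes_per_s,
                       peak_flops, batch_size=1):
    """Estimate one decode step as max(memory time, compute time)."""
    # Every step streams all weights from HBM once, regardless of batch size.
    memory_time = param_count * bytes_per_param / bandwidth_bytes_per_s
    # Roughly 2 FLOPs (multiply + add) per parameter per sequence in the batch.
    compute_time = 2 * param_count * batch_size / peak_flops
    return max(memory_time, compute_time)


def crossover_batch_size(bytes_per_param, bandwidth_bytes_per_s, peak_flops):
    """Batch size at which compute time catches up with memory time."""
    return peak_flops * bytes_per_param / (2 * bandwidth_bytes_per_s)


PARAMS = 7e9            # hypothetical 7B-parameter model
FP16_BYTES = 2          # bytes per FP16 weight
BANDWIDTH = 2000e9      # ~2,000 GB/s, assumed equal for both GPUs
A100_FLOPS = 312e12     # assumed FP16 tensor peak for the A100
H100_FLOPS = 2 * A100_FLOPS  # "twice the FLOPS", per the question

for batch in (1, 512):
    a100 = decode_step_time_s(PARAMS, FP16_BYTES, BANDWIDTH, A100_FLOPS, batch)
    h100 = decode_step_time_s(PARAMS, FP16_BYTES, BANDWIDTH, H100_FLOPS, batch)
    print(f"batch={batch}: A100 {a100 * 1e3:.2f} ms, H100 {h100 * 1e3:.2f} ms")

print("A100 crossover batch:", crossover_batch_size(FP16_BYTES, BANDWIDTH, A100_FLOPS))
print("H100 crossover batch:", crossover_batch_size(FP16_BYTES, BANDWIDTH, H100_FLOPS))
```

Under these assumptions, both GPUs take the same time per step at batch size 1, because the step is limited by reading the weights. The extra FLOPS only start to matter once the batch is large enough for compute time to exceed memory time, which is also the regime where prompt prefill tends to sit.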

[–] I_will_delete_myself@alien.top 1 points 10 months ago (1 children)

They have around the same number of CUDA cores. Normally, the more CUDA cores, the higher the inference speed.

[–] RobbinDeBank@alien.top 1 points 10 months ago

More Tensor and CUDA cores mean higher inference and training speed, right? Do inference and training get the same benefit from those cores?