this post was submitted on 27 Oct 2023

Machine Learning

What are the benefits of using an H100 over an A100 (both at 80 GB and both using FP16) for LLM inference?

Looking at the datasheets for both GPUs, the H100 has twice the peak FLOPS, but they have almost the same memory bandwidth (~2000 GB/s). Since LLM inference is dominated by memory bandwidth, I wonder what benefit the H100 actually brings. One benefit could, of course, be FP8 support (which is extremely useful), but in this question I'm interested in the difference in the raw hardware specs.
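To make the bandwidth-bound argument concrete, here is a back-of-envelope sketch. It assumes single-stream autoregressive decoding where every generated token must stream all model weights from HBM once, and it ignores KV-cache traffic and batching; the model size and bandwidth figures are illustrative assumptions, not vendor benchmarks.

```python
def tokens_per_sec(model_params_b: float, bytes_per_param: float,
                   mem_bw_gb_s: float) -> float:
    """Upper-bound decode throughput for a memory-bandwidth-bound LLM.

    Each decoded token streams all weights from HBM once, so
    throughput <= bandwidth / model size in bytes.
    """
    model_bytes = model_params_b * 1e9 * bytes_per_param
    return mem_bw_gb_s * 1e9 / model_bytes

# Hypothetical 70B-parameter model at ~2000 GB/s (A100/H100 80 GB class):
fp16 = tokens_per_sec(70, 2, 2000)  # FP16: 2 bytes per parameter
fp8 = tokens_per_sec(70, 1, 2000)   # FP8: 1 byte per parameter (H100 only)

print(f"FP16: ~{fp16:.1f} tok/s, FP8: ~{fp8:.1f} tok/s")
```

Under these assumptions the extra FLOPS of the H100 barely move single-stream decode speed, while halving the bytes per parameter (FP8) roughly doubles it, which is why FP8 support matters more here than raw compute.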

[–] SnooHesitations8849@alien.top 1 points 10 months ago

Both the H100 and the A100 are aimed primarily at training. The H100 is optimized for lower precision (8- and 16-bit) and for transformer workloads; the A100 is still very good, but not to the same degree. The A100 is still a general-purpose GPU, while the H100 is closer to a transformer accelerator.

Using either of them for inference is not the most cost-effective choice, though.