this post was submitted on 28 Oct 2023
Machine Learning

I wanted to share some exciting news from the GPU world that could change the game for LLM inference. AMD has made significant strides here, thanks to a port of vLLM to ROCm 5.6. The code is available on GitHub.

The result? AMD's MI210 now almost matches Nvidia's A100 in LLM inference performance. This is a significant development: it could make AMD a viable alternative for inference workloads, a space Nvidia has traditionally dominated.

For those interested in the technical details, I recommend checking out the EmbeddedLLM blog post.

I'm curious to hear your thoughts on this. Anyone manage to run it on RX 7900 XTX?
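For anyone wanting to try it, here is a minimal sketch of offline inference using vLLM's standard Python API, which the ROCm port is expected to mirror (the model name and sampling settings below are illustrative assumptions, not from the post, and this requires a supported AMD GPU with the ROCm build of vLLM installed):

```python
# Sketch: offline inference with vLLM's standard API.
# Assumes the ROCm fork of vLLM is installed and a supported GPU is present.
from vllm import LLM, SamplingParams

# Model choice is an assumption for illustration; swap in whatever you run.
llm = LLM(model="meta-llama/Llama-2-13b-hf")

params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Why is the MI210 competitive for LLM inference?"], params)

# Each result is a RequestOutput; print the first generated completion.
print(outputs[0].outputs[0].text)
```

If the port is a straight fork, no code changes should be needed on top of a working ROCm install; the usual vLLM serving entry point (the OpenAI-compatible API server) should work the same way.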

https://preview.redd.it/rn7n29yxpuwb1.png?width=600&format=png&auto=webp&s=bdbac0d2b34d6f43a03503bbf72b446190248789

[–] diamond_jackie07@alien.top 1 points 10 months ago

Will report back on 13b performance ASAP