this post was submitted on 29 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


I'm using an A100 PCIe 80GB, CUDA 11.8 toolkit, driver 525.x.

But when I run inference on CodeLlama 13B with oobabooga (web UI), it only generates 5 tokens/s.

That is so slow.

Is there any config or something else needed for the A100?
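For what it's worth, here is a minimal way to measure raw generation speed with plain transformers on the same card, to rule out the web UI itself (a rough sketch; the model id, prompt, and token count are assumptions, not my exact setup):

```python
# Rough tokens/s baseline with plain transformers, outside the web UI.
# Assumes PyTorch with CUDA and ~26 GB free VRAM for fp16 13B weights.
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-13b-hf"  # assumed HF id; substitute your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
torch.cuda.synchronize()
start = time.time()
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
torch.cuda.synchronize()
new_tokens = output.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / (time.time() - start):.1f} tokens/s")
```

If this comes out much faster than 5 tokens/s, the problem would be in the loader settings rather than the card.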

[–] hudimudi@alien.top 1 points 9 months ago

Uhmm, where did you buy that A100? Was it a good deal? lol. Just kidding, you probably set something up wrong, or the drivers are messing up. Is the card working fine otherwise in benchmarks?
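Something like a bare fp16 matmul loop should finish in a fraction of a second on a healthy A100; if it doesn't, suspect the driver or the card itself (just a sketch, the sizes are arbitrary):

```python
# Quick fp16 throughput check: ~11 TFLOPs of matmuls, which a healthy A100
# (~312 TFLOPS fp16 peak) should chew through in well under a second.
import time
import torch

assert torch.cuda.is_available(), "PyTorch does not see the GPU"
print("Device:", torch.cuda.get_device_name(0))

x = torch.randn(8192, 8192, device="cuda", dtype=torch.float16)
torch.cuda.synchronize()
start = time.time()
for _ in range(10):
    y = x @ x  # result discarded; we only care about the timing
torch.cuda.synchronize()
print(f"10 matmuls of 8192x8192 in {time.time() - start:.3f}s")
```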