This post was submitted on 17 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


I updated to the latest commit because ooba said it pulls in the latest llama.cpp, which improved performance. What I suspect happened is that it now uses more FP16: tokens/s on my Tesla P40 got halved, along with the power consumption and memory-controller load.
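
For context, the P40 is a Pascal card whose FP16 throughput is only a tiny fraction of its FP32 rate (on the order of 1/64 of FP32), so a build that shifts more math to FP16 would explain the halving. If you want to confirm the symptom on your own card, something like this (standard nvidia-smi query fields) will print power draw and memory-controller load once a second while you generate:

# Poll power draw plus GPU and memory-controller utilization every second
nvidia-smi --query-gpu=power.draw,utilization.gpu,utilization.memory --format=csv -l 1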

You can fix this by doing:

git reset --hard 564d0cde8289a9c9602b4d6a2e970659492ad135

to go back to the last verified commit that didn't kill performance on the Tesla P40. I'm not sure how to prevent this on future updates, so maybe u/Oobabooga can chime in.
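
In the meantime, one workaround (just plain git, nothing ooba-specific) is to park a local branch on the known-good commit, so a later git pull on the main branch can't silently move you past it:

# Create and switch to a branch pinned at the last known-good commit
# (the branch name p40-stable is just an example)
git checkout -b p40-stable 564d0cde8289a9c9602b4d6a2e970659492ad135

Re-test newer commits before moving that branch forward.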

1 comment
opi098514@alien.top · 11 months ago

How many tokens/s do you get with the P40? I’ve been contemplating getting one and using it alongside my 3060 12 GB.