this post was submitted on 09 Nov 2023

LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.


I have a Ryzen 5 with 12 logical cores (6 physical cores with SMT). When I monitor the CPU during inference, some of them spike to 100%, but many do not. It looks like there is a lot more juice here than what is being squeezed out of the processor. Do any gurus have insight into how the models or underlying libraries decide how to allocate CPU resources? I'd like 10 cores pinned at 100% the whole time, with 2 left over to handle minimal system overhead. You know?
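One likely explanation, assuming a llama.cpp-based backend: the thread count is an explicit setting (`-t` / `n_threads`), and many backends default to roughly the number of physical cores rather than logical threads, since SMT siblings share execution units and extra threads often hurt throughput on matmul-heavy work. A minimal sketch with the llama-cpp-python bindings (the model path is hypothetical):

```python
# Minimal sketch, assuming the llama-cpp-python backend and a local GGUF
# model at ./model.gguf (hypothetical path). n_threads is the knob that
# decides how many cores the token-generation loop will try to saturate.
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",  # hypothetical path to a quantized GGUF model
    n_threads=10,               # use 10 threads for generation, leaving 2 for the OS
)

out = llm("Q: Why aren't all my cores busy during inference? A:", max_tokens=32)
print(out["choices"][0]["text"])
```

Whether 10 threads actually beats a default of 6 is worth benchmarking on your machine: CPU inference is often memory-bandwidth bound, so adding threads beyond the physical core count can leave cores idle waiting on RAM rather than speeding things up.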

[–] nmkd@alien.top 1 points 1 year ago

Which backend are you talking about?