overview for ClassroomGold6910

Cheapest way to run local LLMs? in c/localllama@poweruser.forum

[–] ClassroomGold6910@alien.top 1 points 11 months ago (2 children)

What's the difference between the `K_M` models? Also, why is `Q_4` legacy but not `Q_4_1`? It would be great if someone could explain that lol

Cheapest way to run local LLMs? in c/localllama@poweruser.forum

[–] ClassroomGold6910@alien.top 1 points 11 months ago

3b models work amazingly and super smoothly. 7b models run at a fair 15 tokens per second, but they prevent me from using any other application at the same time and occasionally freeze my mouse and screen until the response is finished

Cheapest way to run local LLMs? in c/localllama@poweruser.forum

[–] ClassroomGold6910@alien.top 1 points 11 months ago

20 tok/s seems like the minimum I would be sane with lol


Cheapest way to run local LLMs? (alien.top)

submitted 11 months ago by ClassroomGold6910@alien.top to c/localllama@poweruser.forum


I'm not super knowledgeable about the specs of the different Orange Pi and Raspberry Pi models. I'm looking for something relatively cheap that has WiFi and USB. I want to be able to run at least 13b models at a decent tok/s.

Also open to other solutions. I have a Mac M1 (8gb RAM) and upgrading the computer itself would be cost prohibitive for me.
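
For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would take at different model sizes. It assumes a flat 4 bits per weight, which is only an approximation: real quantized files typically use somewhat more, and the runtime needs extra memory for the context on top of the weights. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every weight is stored at exactly `bits_per_weight` bits.
    Runtime overhead (context cache, OS, other apps) is NOT included.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Compare the model sizes mentioned in this thread at an assumed 4 bits/weight.
for size in (3, 7, 13):
    print(f"{size}b @ 4 bits/weight: ~{weight_memory_gb(size, 4):.1f} GB")
```

Under these assumptions a 13b model needs about 6.5 GB for weights alone, which leaves very little headroom on an 8 GB machine, while a 7b model at about 3.5 GB already takes a large share of it.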