ButlerFish

joined 10 months ago
[–] ButlerFish@alien.top 1 points 9 months ago

You get charged while the pod is running, and the pod keeps running until you turn it off in the RunPod control panel, even if you aren't actually doing anything on it right now.

If you added a volume (a cloud hard drive) when you created the pod then, even while it is turned off, you are paying 10 cents / gigabyte / month to rent that hard drive so your data is still there when you turn it on again.
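The storage maths above is simple enough to sketch; the function name and rate default are just illustrative, with the 10 cents / GB / month figure taken from the comment:

```python
# Quick arithmetic for the idle-volume cost mentioned above:
# at $0.10 per GB per month, cost scales linearly with volume size.

def monthly_volume_cost(gigabytes, rate_per_gb=0.10):
    """Dollars per month to keep a volume around while the pod is off."""
    return gigabytes * rate_per_gb

# e.g. a 100 GB volume costs about $10/month even with the pod stopped
print(monthly_volume_cost(100))
```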

For niche use cases where it needs to be available but isn't running stuff most of the time, like that home assistant I mentioned, look at RunPod serverless. It's much more fiddly and harder to use, but it will let you pay essentially per prompt... for playing with LLMs and interacting, it's much better to just rent a server and turn it off when you are done.
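The "pay per prompt" model works by giving the platform a handler function that gets invoked once per request. A minimal sketch of that shape, with a placeholder echo instead of a real model (the registration call in the comment is how RunPod's `runpod` package wires it up on a real worker):

```python
# Minimal sketch of a serverless worker: you pay per invocation of
# handler(event) rather than for an always-on pod.

def handler(event):
    # RunPod delivers the request payload under event["input"].
    prompt = event["input"].get("prompt", "")
    # Placeholder: a real worker would run the LLM here.
    reply = f"echo: {prompt}"
    return {"output": reply}

if __name__ == "__main__":
    # On a real serverless worker you would register the handler,
    # e.g. (requires the `runpod` package on the worker image):
    #   import runpod
    #   runpod.serverless.start({"handler": handler})
    print(handler({"input": {"prompt": "hello"}}))
```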

[–] ButlerFish@alien.top 1 points 9 months ago (2 children)

What I do is sign up to RunPod and buy $10 of credit, then go to the "templates" section and use it to make a cloud VM pre-loaded with the software to run LLMs. One of their 'official' templates, called "RunPod TheBloke LLMs", should be good. I usually use the A100 pod type, but you can go bigger or smaller / faster or cheaper.

Following the template's Readme, you can click Connect to Jupyter and run the notebook that came with the template to start services and download your model from Hugging Face or wherever. This is fine for experimenting with LLMs.

If what you had planned was some kind of home project, like building your own home assistant, then you have a bunch of other problems to solve: how to do it cheaply, trigger words, and TTS/STT. You might use the serverless or spot-instance functionality RunPod has and figure out the smallest pod / LLM that works for your use. You'd probably do the microphone and trigger-word stuff on your Pi and have it connect to the RunPod server to run the TTS/STT and LLM bits.
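The Pi-plus-rented-GPU split above can be sketched as a simple pipeline; every function name here is a hypothetical stand-in (the stubs just show the data flow, real versions would be network calls to the pod):

```python
# Architecture sketch: the Pi handles the mic and trigger word locally,
# then ships captured audio to the rented GPU for STT -> LLM -> TTS.

def assistant_turn(audio, stt, llm, tts):
    """One voice-assistant round trip after the trigger word fires."""
    text = stt(audio)    # speech-to-text on the GPU box
    reply = llm(text)    # LLM generates the answer
    return tts(reply)    # text-to-speech audio sent back to the Pi

# Stubs standing in for real remote calls:
fake_stt = lambda audio: "what time is it"
fake_llm = lambda text: f"you asked: {text}"
fake_tts = lambda text: f"<audio:{text}>"

print(assistant_turn(b"...", fake_stt, fake_llm, fake_tts))
```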

[–] ButlerFish@alien.top 1 points 9 months ago (4 children)

If you want to run the models posted here, and don't care so much about physical control of the hardware they run on, then you can use various 'cloud' options - RunPod and Vast are straightforward and cost about 50 cents an hour for a decent system.

[–] ButlerFish@alien.top 1 points 10 months ago

Looks like a very small model. Maybe better suited to a code-completion use case.

[–] ButlerFish@alien.top 1 points 10 months ago

I think a big part of the enthusiasm for AI comes from Microsoft's deep and wide lobbying abilities. It would be fascinating to watch them back that out and try to pivot to a new new thing.