LocalLLaMA

A community for discussing Llama, the family of large language models created by Meta AI.

Is LocalLLaMA on RunPod cheaper than Chat GPT4 for text prompts? (alien.top)

submitted 2 years ago by allun11@alien.top to c/localllama@poweruser.forum

I have a query that uses around 300 tokens. Since 1,000 tokens cost 0.06 USD, that works out to roughly 0.02 USD for the request.
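As a quick check of that arithmetic, here is a minimal sketch. The function name `request_cost` is only illustrative; the 300-token request and the price of 0.06 USD per 1,000 tokens are the figures quoted above.

```python
def request_cost(tokens: int, price_per_1k: float) -> float:
    """Cost in USD of one request billed per 1,000 tokens."""
    return tokens / 1000 * price_per_1k


# 300 tokens at 0.06 USD per 1,000 tokens
cost = request_cost(300, 0.06)
print(f"{cost:.3f} USD")
```

The exact figure is 0.018 USD, which rounds to the 0.02 USD mentioned above.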

Let's say I deployed a local LLaMA model on RunPod, on one of the cheaper machines. Would that request be cheaper than running it on GPT4?

FairSum@alien.top 1 point 2 years ago

If you're looking at cloud / API services, the best option is probably TogetherAI or DeepInfra. For 70B models, TogetherAI tops out at 0.0009 USD per 1K tokens, and DeepInfra tops out at 0.0007 USD per 1K input tokens and 0.00095 USD per 1K output tokens. Both are well below Turbo and GPT4 price levels. The big caveat is that this only works if the model you want to use is hosted there. If it isn't and you want to deploy that model yourself, RunPod is probably the "cheapest" option, but it charges for as long as the pod is active, whether or not it is serving requests, so it burns through money very quickly. In that case, RunPod likely won't be much cheaper than GPT4, if at all.
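To make the per-token versus per-hour trade-off concrete, here is a rough sketch of a break-even calculation. The GPT4 price (0.06 per 1K tokens) and the 300-token request come from the original question, and the TogetherAI price is the figure quoted above. The hourly pod price is a made-up placeholder, not a RunPod quote, and the function names are illustrative. It also assumes a single flat rate for all tokens and ignores throughput limits and startup time.

```python
def request_cost(tokens: int, price_per_1k: float) -> float:
    """Cost in USD of one request billed per 1,000 tokens."""
    return tokens / 1000 * price_per_1k


def break_even_requests_per_hour(pod_price_per_hour: float,
                                 tokens: int,
                                 api_price_per_1k: float) -> float:
    """Requests per hour at which an always-on pod costs the same as the API."""
    return pod_price_per_hour / request_cost(tokens, api_price_per_1k)


TOKENS = 300                   # request size from the question
GPT4_PER_1K = 0.06             # price quoted in the question
TOGETHER_70B_PER_1K = 0.0009   # price quoted in the comment above
POD_PRICE_PER_HOUR = 0.50      # HYPOTHETICAL placeholder, not a real quote

gpt4_cost = request_cost(TOKENS, GPT4_PER_1K)
together_cost = request_cost(TOKENS, TOGETHER_70B_PER_1K)
vs_gpt4 = break_even_requests_per_hour(POD_PRICE_PER_HOUR, TOKENS, GPT4_PER_1K)
vs_together = break_even_requests_per_hour(
    POD_PRICE_PER_HOUR, TOKENS, TOGETHER_70B_PER_1K
)

print(f"GPT4 per request:       {gpt4_cost:.5f} USD")
print(f"TogetherAI per request: {together_cost:.5f} USD")
print(f"Pod break-even vs GPT4:       {vs_gpt4:.0f} requests/hour")
print(f"Pod break-even vs TogetherAI: {vs_together:.0f} requests/hour")
```

With that placeholder price, the pod only beats GPT4 if it handles roughly 28 or more such requests in every hour it is running, and it would need on the order of 1,850 requests per hour to beat the TogetherAI rate. Swap in the real hourly price of the machine you are considering to get a usable number.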