this post was submitted on 30 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.

founded 10 months ago

I don't have the budget for hosting models on a dedicated GPU. What are the alternative options or platforms that let me use open-source models like Mistral, Llama, etc. on a pay-per-API-call basis?

top 7 comments
[–] pictoria_dev@alien.top 1 points 9 months ago

I'm currently exploring different models too, in particular for coding. I tried deepseek-coder on their official website and it was good. Unfortunately they collect chat data. Does anyone know of a pay-as-you-go service that offers this model?

[–] theodormarcu@alien.top 1 points 9 months ago

What's the use case? Chatting with them, or for your own apps?

Check out OpenRouter too.

[–] teddybear082@alien.top 1 points 9 months ago

OpenRouter; some of the models hosted there are free.

Google Colab, but it depends on how long Google will let you use it for free (you can also pay monthly).
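For anyone wondering what "pay per API call" looks like in practice: OpenRouter exposes an OpenAI-compatible chat completions endpoint, so a plain HTTP POST is enough. A minimal sketch below, assuming you have an OpenRouter key in an `OPENROUTER_API_KEY` environment variable; the model slug is illustrative.

```python
# Minimal sketch: building a pay-per-call request to OpenRouter's
# OpenAI-compatible chat completions endpoint (stdlib only).
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build the HTTP request; you are billed per token, not per GPU-hour."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request(
    "mistralai/mistral-7b-instruct",          # illustrative model slug
    "Hello!",
    os.environ.get("OPENROUTER_API_KEY", "sk-demo"),
)
# Actually sending it requires a valid key and funds on the account:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url)
```

The same request shape works against any OpenAI-compatible host, so switching providers is mostly a matter of changing the base URL and key.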

[–] TradingDreams@alien.top 1 points 9 months ago

It may be out of your range, but you can pick up a Dell Precision 7720 with a 16 GB Quadro P5000 GPU for about $500 on eBay. The Quadro P5000 also appears in a few other workstation laptops from that era. Note: these laptops shipped with other graphics options, so only go for the P5000 models.

[–] DarthNebo@alien.top 1 points 9 months ago

Hugging Face has Inference Endpoints, which can be private or public as needed, with sleep (scale-to-zero) built in.

[–] sbashe@alien.top 1 points 9 months ago

https://www.anyscale.com/endpoints#hosted Good service. I use it all the time. It also has fine-tuning options if you need them.

[–] ThisGonBHard@alien.top 1 points 9 months ago

One I used before is runpod.io, but it is a pay-per-time platform (you rent GPU time), not a pay-per-API-call one.