LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.


Cheapest site for hosting custom LLM models? (alien.top)

submitted 11 months ago by StrangeImagination5@alien.top to c/localllama@poweruser.forum

I'm trying to figure out where it's cheapest to host these models and use them.

I've noticed that a lot of fine-tunes aren't available on the common LLM API sites. For example, I want to use Nous Capybara 34B, but the only site offering it charges $20/million tokens, which seems quite high considering I see Llama 70B for around $0.70/million tokens.

So are there any sites where I could host custom fine-tunes and get rates similar to the Llama 70B one?

[–] AntoItaly@alien.top 1 points 11 months ago (2 children)

Replicate: $0.000575/sec for an Nvidia A40 (48 GB VRAM)

[–] yahma@alien.top 1 points 11 months ago

The startup time makes Replicate nearly unusable for me. Only popular models stay in memory. Less-used models shut down, and you have to wait for them to start up again before the first inference.
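
To compare the per-second GPU rate quoted above with per-token API prices, you need a throughput figure, which nobody in the thread gives. Here is a rough sketch of the conversion. The function names are made up for illustration, and the tokens/sec you plug in is an assumption you would have to measure for your own model and batch size.

```python
def cost_per_million_tokens(price_per_sec: float, tokens_per_sec: float) -> float:
    """Dollar cost to generate one million tokens on a GPU billed per second."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return price_per_sec / tokens_per_sec * 1_000_000


def break_even_tokens_per_sec(price_per_sec: float, target_per_million: float) -> float:
    """Sustained throughput needed to match a given $/million-token price."""
    if target_per_million <= 0:
        raise ValueError("target_per_million must be positive")
    return price_per_sec * 1_000_000 / target_per_million


# Prices taken from the thread; throughput is NOT from the thread.
A40_PRICE_PER_SEC = 0.000575   # Replicate A40 rate quoted above
API_PRICE_CAPYBARA = 20.0      # $/million tokens, quoted in the post
API_PRICE_LLAMA_70B = 0.7      # $/million tokens, quoted in the post

if __name__ == "__main__":
    print(f"A40 hourly cost: ${A40_PRICE_PER_SEC * 3600:.2f}")  # 2.07
    print(f"tok/s to match $20/M:  "
          f"{break_even_tokens_per_sec(A40_PRICE_PER_SEC, API_PRICE_CAPYBARA):.2f}")   # 28.75
    print(f"tok/s to match $0.7/M: "
          f"{break_even_tokens_per_sec(A40_PRICE_PER_SEC, API_PRICE_LLAMA_70B):.2f}")  # 821.43
```

So the A40 only has to sustain about 29 tokens/sec to beat the $20/million offer, but would need roughly 820 tokens/sec of continuous, fully utilized generation to match $0.70/million. This only counts time spent generating. It does not account for the startup wait described in the reply above or for any idle time between requests.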
