I maintain the uniteai project, and have implemented a custom backend for serving transformers-compatible LLMs. (That file's actually a great ultra-lightweight server if transformers satisfies your needs; one clean file.)
I'd like to add GGML etc., and I haven't reached for ctransformers. Instead of building a bespoke server, it'd be nice if a standard were starting to emerge.
For instance, many models have custom instruct templates; it'd be nice if a backend handled all of that for me.
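For concreteness, here's a toy sketch of what I mean by a backend handling templates for me. The template strings are rough approximations for illustration, not the canonical formats, and `apply_template` is a hypothetical helper, not any library's API:

```python
# Hypothetical sketch: per-model-family instruct templates.
# The template strings below are approximations for illustration only.
TEMPLATES = {
    # Llama-2-chat-style wrapping (approximate)
    "llama2": "[INST] {prompt} [/INST]",
    # Alpaca-style instruction format (approximate)
    "alpaca": "### Instruction:\n{prompt}\n\n### Response:\n",
}

def apply_template(model_family: str, prompt: str) -> str:
    """Wrap a raw user prompt in the model family's instruct template."""
    template = TEMPLATES.get(model_family, "{prompt}")  # fall back to raw prompt
    return template.format(prompt=prompt)

print(apply_template("alpaca", "Summarize this file."))
```

The point is that every client shouldn't have to carry a table like this; the serving layer should own it.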
I've used llama.cpp, but I'm not aware of it handling instruct templates. Is it worth building on top of? Is it too llama-focused? Production-worthy? (It bills itself as "mainly for educational purposes".)
I've considered oobabooga, but I'd just like a best-in-class server, without all the other front-end fixings and dependencies.
Is OpenAI's API signature something people are trying to build against as a standard?
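By "API signature" I mean the chat-completions request shape. A minimal sketch, with field names taken from OpenAI's published API; the model name here is just a placeholder:

```python
import json

# Sketch of an OpenAI-style chat-completions request body.
# Field names follow OpenAI's API; "local-model" is a placeholder.
request = {
    "model": "local-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
    "stream": False,
}
print(json.dumps(request, indent=2))
```

If local backends converged on accepting this shape, swapping servers would be a one-line URL change for clients.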
Any recommendations?