this post was submitted on 22 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


I maintain the uniteai project and have implemented a custom backend for serving transformers-compatible LLMs. (That file is actually a great ultra-lightweight server if transformers satisfies your needs; it's one clean file.)

I'd like to add GGML support and similar, but I haven't reached for cTransformers yet. Rather than build another bespoke server, I'd prefer to target a standard, if one is starting to emerge.

For instance, many models use custom instruct templates; it would be nice if a backend handled all of that for me.
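To make concrete what "handling the instruct template" means, here is a minimal sketch of filling one such template by hand. The wrapper shown is the Llama-2 chat format; other models (Alpaca, ChatML, etc.) each use different wrappers, and tracking those per model is exactly the bookkeeping I'd rather a backend owned. The helper function name is mine, not from any library:

```python
# Minimal sketch: hand-filling a model-specific instruct template.
# This is the Llama-2 chat wrapper; other model families differ.
def llama2_prompt(system: str, user: str) -> str:
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

prompt = llama2_prompt("You are helpful.", "Hello!")
print(prompt)
```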

I've used llama.cpp, but I'm not aware of it handling instruct templates. Is it worth building on top of? Is it too llama-only focused? Production-worthy? (It bills itself as "mainly for educational purposes.")

I've considered oobabooga, but I'd just like a best-in-class server, without all the front-end fixings and dependencies.

Is OpenAI's API signature something people are trying to build against as a standard?
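For reference, the request shape I mean is OpenAI's chat-completions signature, which several local servers emulate. A rough sketch of building that request body (the model name is a placeholder, and the helper is mine):

```python
import json

# Sketch of the OpenAI chat-completions request body that
# OpenAI-compatible local servers accept at /v1/chat/completions.
# Field names follow OpenAI's published API; "local-model" is a
# placeholder, not a real model name.
def chat_request(messages, model="local-model", temperature=0.7):
    return json.dumps({
        "model": model,
        "messages": messages,
        "temperature": temperature,
    })

body = chat_request([{"role": "user", "content": "Hi"}])
```

If backends standardized on this signature, swapping one server for another would just mean changing a base URL.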

Any recommendations?

[–] KeyAdvanced1032@alien.top 1 points 10 months ago

I think all frameworks support custom instruct templates, and I know for a fact that llama.cpp does: I use StudioLM, which is built on llama.cpp, and it lets me alter the system/user/assistant templates.
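The kind of per-role override described above can be sketched as a small mapping from role to template string. This is an illustrative shape, not any specific tool's actual config format:

```python
# Hypothetical per-role template overrides, in the spirit of the
# system/user/assistant fields the comment describes. The template
# strings here follow the Llama-2 wrapper purely as an example.
templates = {
    "system": "<<SYS>>\n{content}\n<</SYS>>",
    "user": "[INST] {content} [/INST]",
    "assistant": "{content}",
}

def render(role: str, content: str) -> str:
    """Wrap a message's content in its role's template."""
    return templates[role].format(content=content)
```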