Key-Comparison3261

joined 11 months ago
Key-Comparison3261@alien.top 1 points 11 months ago

In Python you have exllama, vLLM, and lmdeploy, and in most cases FastAPI is used to serve the HTTP endpoint.

I wrote llm-sharp to drop Python (the GIL, pip dependencies) and to adapt flexibly to dynamic model structures beyond standard Llama.

K024/llm-sharp: Language models in C# (github.com)

I recently drafted this, but adding more models, features, tests, and documentation will take too much time on my own. I'm looking for comments and collaborators.