this post was submitted on 17 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.

[–] a_beautiful_rhind@alien.top 1 points 11 months ago (3 children)

Nice. A lightweight loader. It will free us from Gradio.

[–] oobabooga4@alien.top 1 points 11 months ago (2 children)

Gradio is a 70 MB requirement, FYI. It has become common to see people calling text-generation-webui "bloated", when most of the installation size is in fact due to PyTorch and the CUDA runtime libraries.

https://preview.redd.it/pgfsdld7xw0c1.png?width=370&format=png&auto=webp&s=c50a14804350a1391d57d0feac8a32a5dcf36f68

[–] kpodkanowicz@alien.top 1 points 11 months ago

I think there is room for everyone. Text Gen is a piece of art: it's the only thing in the whole space that always works and is reliable. However, if I'm building an agent and shipping a Docker build, I cannot afford to bundle text gen, etc.

[–] tronathan@alien.top 1 points 11 months ago

Gradio is a 70MB requirement

That doesn't make it fast, just small. Inefficient code can be compact.