this post was submitted on 12 Nov 2023

LocalLLaMA


Community to discuss Llama, the family of large language models created by Meta AI.


Around 1.5 months ago, I started https://github.com/michaelfeil/infinity. With the hype around Retrieval-Augmented Generation, this topic has become important over the last month, in my view, and this repo is the only option under an open license.

I have now implemented everything from faster attention to ONNX / CTranslate2 / torch inference, caching, better Docker images, and better queueing strategies. At this point I am pretty much running out of ideas - if you have some, feel free to open an issue; suggestions would be very welcome!
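Of the features listed, the queueing strategy is the easiest one to illustrate in isolation: incoming requests wait briefly in a queue so the model can embed several texts in one forward pass (dynamic batching). The sketch below is not infinity's actual implementation - the embedding step is replaced by a trivial placeholder - it only shows the batching pattern under those assumptions:

```python
import queue
import threading
import time


def batch_worker(q, results, max_batch=8, max_wait=0.01):
    """Collect up to max_batch items, waiting at most max_wait seconds,
    then process them together. A None item signals shutdown."""
    while True:
        item = q.get()
        if item is None:
            break
        batch = [item]
        deadline = time.monotonic() + max_wait
        while len(batch) < max_batch:
            try:
                nxt = q.get(timeout=max(0.0, deadline - time.monotonic()))
            except queue.Empty:
                break  # deadline passed: run the partial batch
            if nxt is None:
                q.put(None)  # put the shutdown signal back for the outer loop
                break
            batch.append(nxt)
        # Placeholder "embedding": one model call for the whole batch.
        # A real server would run the encoder here on all texts at once.
        for idx, text in batch:
            results[idx] = len(text)


q = queue.Queue()
results = {}
t = threading.Thread(target=batch_worker, args=(q, results))
t.start()
for i, text in enumerate(["hello", "world", "rag"]):
    q.put((i, text))
q.put(None)  # shutdown
t.join()
print(results)  # {0: 5, 1: 5, 2: 3}
```

Batching this way trades a small per-request latency (`max_wait`) for much higher throughput, since encoder models are far more efficient on batched input than on single texts.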

[–] SlowSmarts@alien.top 1 points 1 year ago

Looks very interesting!

Will this work on a CPU-only machine without AVX? (I happen to be far away from a computer right now to test.)
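Whether the ONNX/CTranslate2 backends strictly require AVX is best confirmed in the repo's issues, but on Linux you can at least check what the CPU advertises before trying, by looking at the `flags` line of `/proc/cpuinfo`. A minimal sketch (Linux-only; returns False when the file is unavailable):

```python
def cpu_has_avx(cpuinfo_path="/proc/cpuinfo"):
    """Return True if the CPU's flags line lists the 'avx' feature.

    Reads /proc/cpuinfo, so this only works on Linux; on any read
    error (non-Linux, missing file) it conservatively returns False.
    """
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                if line.startswith("flags"):
                    return "avx" in line.split()
    except OSError:
        pass
    return False


print("AVX supported:", cpu_has_avx())
```

The exact-token match (`in line.split()`) is deliberate: it avoids matching `avx2` or `avx512f` by substring when plain AVX is what you care about.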