[–] davidmezzetti@alien.top 1 points 10 months ago

I haven't found one that is universally best, regardless of what the benchmarks say. Same story with vector embeddings: you'll need to test a few out for your own use case.

The best one I've found for my projects, though, is https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca and the AWQ implementation, https://huggingface.co/TheBloke/Mistral-7B-OpenOrca-AWQ.

[–] davidmezzetti@alien.top 1 points 10 months ago

Yes, if you build an embeddings database with your documents. There are a ton of examples available: https://github.com/neuml/txtai
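
As a quick sketch (the model and documents here are illustrative assumptions, not a prescribed setup):

```python
from txtai.embeddings import Embeddings

# Build an embeddings index over your documents
# (model and documents are illustrative assumptions)
embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2", content=True)
embeddings.index([(0, "First document text", None),
                  (1, "Second document text", None)])

# Search returns the most relevant stored documents
print(embeddings.search("first document", 1))
```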

[–] davidmezzetti@alien.top 1 points 10 months ago

It works with GPTQ models as well; you just need to install AutoGPTQ.

You would need to replace the LLM pipeline with llama.cpp for it to work with GGUF models.

See this page for more: https://huggingface.co/docs/transformers/main_classes/quantization
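
For illustration, rough sketches of both paths (the GGUF file name is an assumption; the GPTQ path assumes optimum and auto-gptq are installed):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPTQ: with auto-gptq (and optimum) installed, transformers loads
# the quantized weights directly from the Hub repo
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-OpenOrca-GPTQ", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("TheBloke/Mistral-7B-OpenOrca-GPTQ")

# GGUF: swap the transformers-based pipeline for llama.cpp
# (via the llama-cpp-python bindings; file name is an assumption)
from llama_cpp import Llama

llm = Llama(model_path="mistral-7b-openorca.Q4_K_M.gguf")
print(llm("Hello", max_tokens=32))
```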

[–] davidmezzetti@alien.top 1 points 10 months ago

Thank you, appreciate it.

I have a company (NeuML) through which I provide paid consulting services.

[–] davidmezzetti@alien.top 1 points 10 months ago

Well, for RAG, the GitHub repo and its documentation would need to be added to the Embeddings index. Then you'd probably want a code-focused Mistral finetune.

I've been meaning to write an example notebook that does this for the txtai GitHub repo and documentation. I'll share it back here when it's available.
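
In the meantime, a rough sketch of the indexing step (the docs path and model are illustrative assumptions):

```python
import glob
from txtai.embeddings import Embeddings

# Index the markdown documentation from a local clone of the repo
# (docs path and model are illustrative assumptions)
embeddings = Embeddings(path="sentence-transformers/all-MiniLM-L6-v2", content=True)
embeddings.index(
    (uid, open(path, encoding="utf-8").read(), None)
    for uid, path in enumerate(glob.glob("txtai/docs/**/*.md", recursive=True))
)
embeddings.save("txtai-docs-index")
```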

[–] davidmezzetti@alien.top 1 points 10 months ago

This code uses txtai, the txtai-wikipedia embeddings database and Mistral-7B-OpenOrca-AWQ to build a RAG pipeline in a couple of lines of code.
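
A minimal sketch of that pipeline (the question and prompt wording are assumptions; loading the AWQ model assumes autoawq is installed):

```python
from txtai.embeddings import Embeddings
from txtai.pipeline import LLM

# Load the txtai-wikipedia embeddings database from the Hugging Face Hub
embeddings = Embeddings()
embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")

# AWQ-quantized Mistral finetune (assumes autoawq is installed)
llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")

# Retrieve context for the question, then prompt the LLM with it
# (question text and prompt wording are illustrative assumptions)
question = "How do birds fly?"
context = "\n".join(x["text"] for x in embeddings.search(question, 3))

print(llm(f"""<|im_start|>system
You are a friendly assistant.<|im_end|>
<|im_start|>user
Answer the following question using only the context below.

Question: {question}
Context: {context}<|im_end|>
<|im_start|>assistant
"""))
```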

 
[–] davidmezzetti@alien.top 1 points 10 months ago

Thank you, glad to hear it.

[–] davidmezzetti@alien.top 1 points 10 months ago

This is an application that connects a vector database and an LLM to perform RAG. The logic is written in Python and is available as a local API service.
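
As one illustration of that shape (the endpoint name, model and index path are hypothetical, not the app's actual code), a FastAPI service wrapping txtai could look like:

```python
from fastapi import FastAPI
from txtai.embeddings import Embeddings
from txtai.pipeline import LLM

app = FastAPI()

# Prebuilt embeddings index (assumes it was created with content=True)
embeddings = Embeddings()
embeddings.load("txtai-docs-index")

llm = LLM("TheBloke/Mistral-7B-OpenOrca-AWQ")

@app.get("/rag")
def rag(query: str):
    # Retrieve context, then have the LLM answer grounded in it
    context = "\n".join(x["text"] for x in embeddings.search(query, 3))
    prompt = (f"Answer the following question using only the context below.\n\n"
              f"Question: {query}\nContext: {context}")
    return {"answer": llm(prompt)}
```

Run it locally with `uvicorn app:app`.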