I've had good results using https://github.com/DevashishPrasad/CascadeTabNet
Use something like lmql, guidance, or guardrails to get the model to say it doesn't know. I've also had some success with the airoboros fine-tuned models, which have this behaviour defined in the dataset via a specific prompt.
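Something along these lines is what I mean, a rough sketch using guidance's constrained decoding, assuming a local GGUF model; the model path and answer choices are made up, and the exact API has changed between guidance versions:

    # Constrain the answer to a fixed set that includes an explicit way out.
    from guidance import models, select

    lm = models.LlamaCpp("/path/to/model.gguf")  # hypothetical path
    lm += "Q: What year was the city founded?\nA: "
    lm += select(["1849", "1901", "I don't know"], name="answer")
    print(lm["answer"])

Because decoding is restricted to those options, the model can pick "I don't know" instead of being forced to hallucinate a year.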
I think you don't have CUDA set up properly. Use pip install --verbose to see the compilation messages when it's trying to build llama.cpp with CUDA support. You might need to manually set the CUDA_HOME environment variable.
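For reference, roughly this, assuming you're installing the llama-cpp-python package; the CUDA path and the CMake flag name depend on your system and package version (older releases used -DLLAMA_CUBLAS=on, newer ones -DGGML_CUDA=on):

    # Point the build at your CUDA toolkit, then rebuild from source verbosely.
    export CUDA_HOME=/usr/local/cuda
    CMAKE_ARGS="-DGGML_CUDA=on" pip install --verbose --no-cache-dir --force-reinstall llama-cpp-python

If the compiler can't find nvcc or the CUDA libraries, the verbose output will show it during the cmake step.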
I haven't used GPTQ in a while, but I can say that GGUF has 8-bit quantization, which you can use with llama.cpp. Furthermore, if you use the original Hugging Face models, the ones you load with the transformers loader, you have options there to load in either 8-bit or 4-bit.
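A minimal sketch of the transformers route, assuming bitsandbytes is installed; the model name is just an example:

    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    name = "meta-llama/Llama-2-7b-hf"  # hypothetical example model
    quant = BitsAndBytesConfig(load_in_8bit=True)  # or load_in_4bit=True
    model = AutoModelForCausalLM.from_pretrained(
        name,
        quantization_config=quant,
        device_map="auto",  # place layers on available GPUs automatically
    )
    tokenizer = AutoTokenizer.from_pretrained(name)

The same BitsAndBytesConfig switch is what the text-generation-webui style loaders expose as the "load in 8-bit/4-bit" checkboxes.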
Which frontend is that?
This is a really good question, and I'd also like to understand how to use the knowledge base with an LLM.