asakura_matsunoki

 

I am looking for a way to use an open-source reranker like bge-rerank inside my RetrievalQA chain, but I have not found any examples of doing this. Is it possible at the moment?
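To make the question concrete, the reranking step I have in mind looks roughly like the sketch below. This is plain Python, not LangChain code: `rerank` and `overlap_score` are hypothetical names, and `overlap_score` is only a toy stand-in for a real cross-encoder reranker, which would score each (query, document) pair with a model instead.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    docs: List[str],
    score_pair: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, doc) pair and keep the top_n highest-scoring docs."""
    scored = [(doc, score_pair(query, doc)) for doc in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words found in doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / max(len(q_words), 1)


# Candidates as they might come back from the first-stage vector retriever.
candidates = [
    "A young girl befriends a dragon in the mountains.",
    "A cookbook of regional Italian recipes.",
    "Dragons wage war over a ruined kingdom.",
]
top = rerank("books about a dragon", candidates, overlap_score, top_n=2)
```

The idea is that the retriever returns a generous candidate list and only the reranked `top` documents are passed on to the LLM as context.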

 

So I want to ask for advice on 2 related topics:

  1. If I have a corpus of many documents embedded in a vector store, how can I dynamically select a subset of them (by metadata, for example) and perform retrieval only on that subset for answer generation?

  2. I want LLaMa to be able to say "I DO NOT KNOW" if the context it retrieved cannot answer the question. From what I have seen, this behavior is not stable yet.
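Both points can be illustrated with a small sketch. It assumes a toy in-memory store instead of a real vector database, and every name in it (`STORE`, `search`, `build_prompt`, the `where` argument) is hypothetical. Many vector stores offer some form of metadata filter at query time, but the exact argument differs per library, so check the documentation of the one you use. The prompt wording is also only an assumption, and an instruction like this makes a refusal more likely without guaranteeing it.

```python
import math
from typing import Dict, List

# Toy in-memory "vector store": each entry has text, an embedding and metadata.
STORE = [
    {"text": "Dragon saga, book one.", "vec": [1.0, 0.0], "meta": {"publisher": "ABC"}},
    {"text": "Dragon saga, book two.", "vec": [0.9, 0.1], "meta": {"publisher": "XYZ"}},
    {"text": "A gardening handbook.", "vec": [0.0, 1.0], "meta": {"publisher": "ABC"}},
]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: List[float], where: Dict[str, str], k: int = 2) -> List[dict]:
    """Filter by metadata first, then rank only the remaining subset."""
    subset = [
        entry for entry in STORE
        if all(entry["meta"].get(key) == val for key, val in where.items())
    ]
    subset.sort(key=lambda entry: cosine(query_vec, entry["vec"]), reverse=True)
    return subset[:k]


# Prompt that tells the model to refuse when the context is insufficient.
PROMPT = (
    "Answer using only the context below. "
    "If the context does not contain the answer, reply exactly: I DO NOT KNOW.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(question: str, hits: List[dict]) -> str:
    """Join the retrieved texts into the context slot of the prompt."""
    context = "\n".join(hit["text"] for hit in hits)
    return PROMPT.format(context=context, question=question)


hits = search([1.0, 0.0], where={"publisher": "ABC"})
prompt = build_prompt("Which books feature dragons?", hits)
```

Here only the two entries from publisher ABC are ranked, so the XYZ book never reaches the prompt even though its embedding is close to the query.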

Thank you so much!

 

Hi,

I am learning to build a RAG system with LLaMa 2 and local embeddings. I have a big CSV of data on books. Each row is a book, and the columns are author(s), genres, publisher(s), release dates, and ratings, plus one column containing a brief summary of the book.

I am trying to build an agent to answer questions on this csv. From basic lookups like

'what books were published in the last two years?',

'give me 10 books from this publisher ABC with a rating higher than 3'

to more meaningful queries that need to read into the free-text summary column like:

'what books have a girl as the main character?'

'what books feature dragons? compare their plots'

I believe I have the general framework in place, but when I tried running it I ran into a token limit error. It seems the file is too big to be digested in one go. I would love to hear your advice on strategies to overcome this. I thought about chunking, but it is unclear to me how to recombine the answers from each chunk.
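For reference, the chunk-then-recombine idea I am describing is roughly a map-reduce pattern: ask the question on each chunk, then ask the model once more to merge the partial answers. The sketch below is only an illustration under stated assumptions: `chunk_rows`, `map_reduce` and `fake_llm` are hypothetical names, character counts stand in for real token counts, and `fake_llm` is a stub that replaces the actual LLaMa call.

```python
from typing import Callable, List


def chunk_rows(rows: List[str], max_chars: int) -> List[List[str]]:
    """Greedily pack rows into chunks whose total length stays under max_chars."""
    chunks: List[List[str]] = []
    current: List[str] = []
    size = 0
    for row in rows:
        # Start a new chunk when adding this row would exceed the budget.
        if current and size + len(row) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(row)
        size += len(row)
    if current:
        chunks.append(current)
    return chunks


def map_reduce(
    question: str,
    rows: List[str],
    ask_llm: Callable[[str], str],
    max_chars: int = 80,
) -> str:
    """Map: answer per chunk. Reduce: one final call that merges the partials."""
    partials = [
        ask_llm(f"{question}\n\n" + "\n".join(chunk))
        for chunk in chunk_rows(rows, max_chars)
    ]
    combined = "\n".join(partials)
    return ask_llm(f"Combine these partial answers to: {question}\n\n{combined}")


def fake_llm(prompt: str) -> str:
    """Stub for the real model: echoes the data rows that mention a dragon."""
    lines = [ln for ln in prompt.splitlines() if " | " in ln and "dragon" in ln]
    return "\n".join(lines)


rows = [
    "Eragon | a boy raises a dragon",
    "Emma | a matchmaker in Regency England",
    "The Hobbit | a dragon guards a hoard",
]
chunks = chunk_rows(rows, max_chars=80)
answer = map_reduce("Which books feature dragons?", rows, fake_llm)
```

Each call only ever sees one chunk (or the short list of partial answers), so no single prompt has to hold the whole file.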

Thanks a ton! Cheers :D