LocalLLaMA

Community to discuss Llama, the family of large language models created by Meta AI.

Using Mistral Openorca to create a knowledge graph from a text document (towardsdatascience.com)

submitted 2 years ago by WaterdanceAC@alien.top to c/localllama@poweruser.forum

[–] Distinct-Target7503@alien.top 1 points 2 years ago (6 children)

That's really interesting, thanks for sharing!!

How does the querying process work for this "knowledge graph"?

[–] laca_komputilulo@alien.top 1 points 2 years ago (3 children)

Finally, a question on this sub that is not about an "AI girlfriend" (ahem RP)

There are a dozen or more ways to incorporate KGs into an LLM workflow, with or without RAG. Some examples:

## Analyze the user question, map it onto KG nodes, and extract the links that connect them. Then put that info into the LLM prompt to better guide the answer.

Example: "Who is Mary Lee Pfeiffer's son and what is he known for?" (b.t.w. try this on ChatGPT 3.5)

KG contribution: resolve Mary Lee Pfeiffer, then follow the "gave-birth-to" edge / link to resolve Tom Cruise.
Add this info to the user prompt and have the LLM fill in the rest of the background info, like the movies he appeared in, etc.
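
To make the first approach concrete, here is a minimal sketch, assuming the KG is just a list of (subject, relation, object) triples and that entity linking is a plain substring match. The names `TRIPLES`, `find_entities`, `facts_about`, and `augment_prompt` are made up for illustration; a real setup would use a graph store and a proper entity linker.

```python
# Toy knowledge graph as (subject, relation, object) triples.
# The single triple is the one from the example above.
TRIPLES = [
    ("Mary Lee Pfeiffer", "gave-birth-to", "Tom Cruise"),
]

def find_entities(question, triples):
    """Naive entity linking: KG node names that appear verbatim in the question."""
    names = {t[0] for t in triples} | {t[2] for t in triples}
    return sorted(n for n in names if n.lower() in question.lower())

def facts_about(entities, triples):
    """Every triple whose subject or object is one of the linked entities."""
    return [t for t in triples if t[0] in entities or t[2] in entities]

def augment_prompt(question, triples):
    """Prepend the matching KG facts to the user question for the LLM."""
    facts = facts_about(find_entities(question, triples), triples)
    if not facts:
        return question  # nothing resolved, pass the question through unchanged
    lines = [f"- {s} {r} {o}" for s, r, o in facts]
    return "Known facts:\n" + "\n".join(lines) + "\n\nQuestion: " + question

question = "Who is Mary Lee Pfeiffer's son and what is he known for?"
print(augment_prompt(question, TRIPLES))
```

The LLM then only has to complete the background info about the resolved entity instead of guessing who the son is.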

## Use the KG for better RAG relevance.

Example: Assume your KG is not about concepts but simply links paragraphs/chunks together. The links could be as simple as mined cross-references (e.g. "see Paragraph X for more detail"), semantic similarity between chunks, structural info (chunk is part of Chapter X, Page Y), or topic- or concept-based connectivity between chunks.

Then, given a user query, find the most relevant starting chunk and apply your application's logic for what counts as "more relevant" to decide which other linked chunks to pull into the context. One simple hack is to use node centrality or Personalized PageRank to pull in chunks that are only indirectly connected but have high prominence in the graph.
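
A rough sketch of the Personalized PageRank hack, in plain Python with no graph library. The chunk ids and links in `CHUNK_LINKS` are invented for illustration, links are treated as undirected, and `alpha=0.85` is just the commonly used damping value, not something tuned.

```python
# Hypothetical chunk graph: chunk id -> linked chunk ids (undirected links).
CHUNK_LINKS = {
    "c1": ["c2", "c3"],
    "c2": ["c1", "c3"],
    "c3": ["c1", "c2", "c4"],
    "c4": ["c3", "c5"],
    "c5": ["c4"],
}

def personalized_pagerank(graph, seed, alpha=0.85, iterations=50):
    """Power iteration for PageRank where the random walk restarts at `seed`."""
    scores = {node: 0.0 for node in graph}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            if not neighbours:
                # Dangling chunk: send its walk mass back to the seed.
                nxt[seed] += alpha * scores[node]
                continue
            share = alpha * scores[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        nxt[seed] += 1.0 - alpha  # restart mass always returns to the seed
        scores = nxt
    return scores

def expand_context(graph, seed, k=2):
    """Seed chunk plus the k other chunks with the highest personalized score."""
    scores = personalized_pagerank(graph, seed)
    others = sorted((n for n in graph if n != seed),
                    key=lambda n: scores[n], reverse=True)
    return [seed] + others[:k]

# "c5" is only linked to "c4"; "c3" is two hops away but well connected.
print(expand_context(CHUNK_LINKS, "c5"))
```

In a real app the seed would come from your vector search, and the edges could be weighted by link type (cross-reference, same chapter, similarity) rather than treated equally.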

[–] Some_Endian_FP17@alien.top 1 points 2 years ago (1 children)

Thanks for this. I've only worked with RAG on OpenAI models and there's a lot of prompt finetuning needed to get decent results. A KG helps define the semantic elements and relationships between document fragments and the user query for RAG.

That said, I'm still relying on the vector database to do most of the heavy lifting of filtering relevant results before feeding them into an LLM. Having an LLM clean up or summarize the user query and create a KG from the vector database's response could lead to more accurate answers.

[–] laca_komputilulo@alien.top 1 points 2 years ago

> Having an LLM clean up or summarize the user query and create a KG from the vector database's response could lead to more accurate answers.

That is the promise. Of course, you still need to figure out for your app domain whether a concept-level KG, a chunk-level KG, or some in-between option like CSKG is the right fit.

One thing I find helpful with prompt design is to spend less effort on writing instructions and replace them with specific examples instead. This swaps wordsmithing for in-context learning samples. You build up the examples iteratively: run the same prompt through more text, fix the output, and add it to the example list... until you reach your context budget for the system prompt.
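
Something like the following captures that loop's end state: a short instruction plus as many worked examples as fit the budget. This is only a sketch; `estimate_tokens` is a crude stand-in (assuming roughly 4 characters per token) for a real tokenizer, and the instruction text and example pairs are made up for illustration.

```python
def estimate_tokens(text):
    """Crude token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)

def build_system_prompt(instruction, examples, budget_tokens):
    """Append (input, output) examples in order until the budget is spent."""
    parts = [instruction]
    used = estimate_tokens(instruction)
    for text, expected in examples:
        block = f"Input: {text}\nOutput: {expected}"
        cost = estimate_tokens(block)
        if used + cost > budget_tokens:
            break  # the next example would overflow the context budget
        parts.append(block)
        used += cost
    return "\n\n".join(parts)

# Hypothetical instruction and hand-corrected examples for triple extraction.
INSTRUCTION = "Extract (subject, relation, object) triples."
EXAMPLES = [
    ("Mary Lee Pfeiffer is the mother of Tom Cruise.",
     "(Mary Lee Pfeiffer, gave-birth-to, Tom Cruise)"),
    ("Llama is a family of large language models created by Meta AI.",
     "(Llama, created-by, Meta AI)"),
]

print(build_system_prompt(INSTRUCTION, EXAMPLES, budget_tokens=1000))
```

Each time the model gets a new passage wrong, you fix the output by hand and append the pair to `EXAMPLES`; the budget check decides how many of them actually make it into the prompt.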
