this post was submitted on 30 Oct 2023

Machine Learning


Hi everyone,

I need some ideas to bounce around.

I have several medical codes, let’s name them A, B, C and D.

Each medical code consists of multiple clauses, say, 1.1, 1.2 and so on.

I want to create a model (?) that takes the text of a clause as input and returns all related clauses from the other medical codes. For example, if I input clause 3.2 from medical code A, I want the output to show the related/similar clauses from codes B, C and D.

I have thought of using something like Retrieval Augmented Generation (RAG) for this, but does anyone have any better ideas? Could a language model help with this? Thanks!

[–] Mammoth-Doughnut-160@alien.top 1 points 10 months ago

Section references that appear without their associated text are still a very hard problem to solve with RAG, unfortunately, and there is no magic bullet for that yet. The closest option may be a knowledge graph, but that presupposes the referenced sections show up with some frequency (in a large corpus, a single link won’t really be visible). I have been looking at a lot of legal contracts and have a similar issue.

The best solution by far is still RAG. Check out this GitHub repo, which has the easiest-to-use integrated RAG, with great hybrid search and fact checking, and is used a lot for legal documents: https://github.com/llmware-ai
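To make the retrieval half of RAG concrete, here is a minimal sketch of the "input one clause, get related clauses from the other codes" step. It is only a lexical TF-IDF baseline in plain Python, not the llmware API and not a full hybrid search. The clause texts, the `CLAUSES` dictionary and the function names are all made up for illustration. In practice you would probably swap the TF-IDF vectors for sentence embeddings and keep the same ranking loop.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: (code, clause id) -> clause text.
# These texts are invented for illustration only.
CLAUSES = {
    ("A", "3.2"): "Patient consent must be obtained before any surgical procedure.",
    ("B", "1.4"): "Written consent of the patient is required prior to surgery.",
    ("B", "2.1"): "Medical records are stored securely for ten years.",
    ("C", "5.3"): "Surgical procedures require documented patient consent.",
    ("D", "4.1"): "Staff wash hands when entering the ward.",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vectors(clauses):
    """Return a TF-IDF weighted term vector (a dict) for every clause."""
    counts = {key: Counter(tokenize(text)) for key, text in clauses.items()}
    n_docs = len(counts)
    # Document frequency: in how many clauses each term appears.
    doc_freq = Counter(term for c in counts.values() for term in c)
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return {key: {t: tf * idf[t] for t, tf in c.items()} for key, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def related_clauses(query_key, clauses, top_k=3):
    """Rank clauses from the *other* codes by similarity to the query clause."""
    vectors = build_vectors(clauses)
    query_vec = vectors[query_key]
    source_code = query_key[0]
    scored = [
        (key, cosine(query_vec, vec))
        for key, vec in vectors.items()
        if key[0] != source_code  # skip clauses from the query's own code
    ]
    # Drop clauses with no term overlap, then sort best match first.
    scored = [(key, score) for key, score in scored if score > 0.0]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for key, score in related_clauses(("A", "3.2"), CLAUSES):
        print(key, round(score, 3))
```

A purely lexical score like this misses paraphrases ("surgery" vs. "surgical procedure"), which is the usual reason for combining it with an embedding-based score in a hybrid search.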