plsendfast

joined 11 months ago
plsendfast@alien.top 1 points 10 months ago

Wow, what an innovative paper!! Try applying to MIT?

 

Hi everyone, I thought I'd write a post here to bounce some ideas around.

I have multiple clauses, and I want to extract the object of interest from each one. For example, take this clause: "Exit staircases shall be constructed of non-combustible materials to comply with the provisions of Cl.3.10.1." Obviously, the object of interest here is 'exit staircases' or 'staircases', so that is what I want extracted. Here is another clause: "No structure or building shall be constructed within a sewer." There are multiple objects in that clause (i.e., structure, building, and sewer), but it is also obvious that the object of interest is 'sewer'.

I ran this through GPT-3.5, and it works: it returns the object of interest pretty accurately. However, is it possible to generate the responses in bulk for my huge list of clauses, instead of typing in the prompt by hand each time? In other words, given a list of clauses, how can I use an LLM to get back the object of interest for each clause without prompting it manually?
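To make the question concrete, here is a minimal sketch of what I imagine batching would look like: loop over the clauses, fill in a fixed prompt, and collect the answers. The prompt wording, `extract_objects`, and `fake_complete` are all my own placeholder names, not anything from a library. `complete` is just any function that takes a prompt string and returns the model's reply; with the OpenAI Python client it could presumably wrap `client.chat.completions.create`, but I have left the real call out so the sketch runs offline.

```python
from typing import Callable, Dict, Iterable

# Hypothetical prompt; the exact wording is an assumption and would need tuning.
PROMPT_TEMPLATE = (
    "Extract the single object of interest from the clause below. "
    "Reply with the object only.\n\nClause: {clause}"
)


def extract_objects(
    clauses: Iterable[str], complete: Callable[[str], str]
) -> Dict[str, str]:
    """Map each clause to whatever `complete` returns for its prompt."""
    results = {}
    for clause in clauses:
        prompt = PROMPT_TEMPLATE.format(clause=clause.strip())
        results[clause] = complete(prompt).strip()
    return results


def fake_complete(prompt: str) -> str:
    """Offline stand-in for a real model call (NOT an actual LLM).

    It simply returns the words before " shall ", which is only good
    enough to show the data flow.
    """
    clause = prompt.split("Clause: ", 1)[1]
    return clause.split(" shall ", 1)[0]


clauses = [
    "Exit staircases shall be constructed of non-combustible materials "
    "to comply with the provisions of Cl.3.10.1.",
]
objects = extract_objects(clauses, fake_complete)
```

Swapping `fake_complete` for a function that actually calls the model should leave the rest unchanged, which is the main reason for passing it in as a parameter.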

Also, is this the correct or ideal way to extract the object of interest from a huge list of clauses? My final goal is to cluster similar objects of interest and see which clauses each one is linked to (via some kind of RAG approach), so that I can build some kind of knowledge graph from the result. Do you think this is the right approach?
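As a rough illustration of the linking step, here is a small sketch that groups clauses under a normalized object of interest, which is roughly the node-to-clauses mapping a knowledge graph would need. The normalization (lowercase, keep the last word, drop a trailing plural "s") is a crude assumption of mine, not a real clustering method, and the second staircase clause is made up for the example.

```python
from collections import defaultdict
from typing import Dict, List


def normalize(obj: str) -> str:
    """Crude key: lowercase head word with a simple plural 's' removed."""
    words = obj.lower().split()
    head = words[-1] if words else ""
    if len(head) > 3 and head.endswith("s") and not head.endswith("ss"):
        head = head[:-1]
    return head


def group_clauses(extracted: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert a clause -> object mapping into object -> list of clauses."""
    groups = defaultdict(list)
    for clause, obj in extracted.items():
        groups[normalize(obj)].append(clause)
    return dict(groups)


# Clause -> extracted object, e.g. the output of an LLM extraction pass.
extracted = {
    "Exit staircases shall be constructed of non-combustible materials.": "exit staircases",
    "Staircases shall have handrails.": "Staircases",  # made-up clause
    "No structure or building shall be constructed within a sewer.": "sewer",
}
groups = group_clauses(extracted)
```

Here 'exit staircases' and 'Staircases' end up under the same key, while 'sewer' gets its own. Something embedding-based would likely be needed to merge objects that are similar in meaning but not in spelling.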

Thanks!

 

Hi everyone,

I need to bounce some ideas off you.

I have several medical codes; let's call them A, B, C and D.

Each medical code consists of multiple clauses, say, 1.1, 1.2 and so on.

I want to create a model (?) that takes the text of a clause as input and returns all the related clauses from the other medical codes. For example, if I input clause 3.2 from code A, I want the output to show the related or similar clauses from codes B, C and D.

I have thought of using something like Retrieval Augmented Generation for this, but does anyone have a better idea? Could a language model help with this? Thanks!
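To show the retrieval part I have in mind, here is a minimal sketch that ranks clauses from the other codes by cosine similarity to the input clause. It uses plain word counts so that it runs with the standard library only; that is a swapped-in simplification, and sentence embeddings would presumably do much better on real text. The clause texts, `related_clauses`, and the `(code, clause_id)` keys are all hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

ClauseKey = Tuple[str, str]  # (code name, clause number), e.g. ("A", "3.2")


def vectorize(text: str) -> Counter:
    """Bag-of-words vector: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def related_clauses(
    query_key: ClauseKey, corpus: Dict[ClauseKey, str], top_k: int = 3
) -> List[Tuple[ClauseKey, float]]:
    """Rank clauses from every code except the query's own code."""
    query_vec = vectorize(corpus[query_key])
    scored = [
        (key, cosine(query_vec, vectorize(text)))
        for key, text in corpus.items()
        if key[0] != query_key[0]
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Made-up placeholder clauses, only to exercise the function.
corpus = {
    ("A", "3.2"): "Patients must give written consent before surgery",
    ("A", "3.3"): "Written consent before surgery must be recorded",
    ("B", "1.1"): "Written consent is required before any surgery",
    ("C", "2.4"): "Equipment must be sterilised daily",
}
matches = related_clauses(("A", "3.2"), corpus)
```

For clause 3.2 of code A, the clause from code B ranks first and nothing from code A itself is returned. In a RAG setup this ranking would be the retrieval step, with the top matches then passed to a language model.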