this post was submitted on 14 Nov 2023

Machine Learning


A common situation in IRL problems with long time horizons is the need to perform multiple very different subtasks. For example, imagine a model trained to remember a poem and then spell it out in blocks in a game of Minecraft. The data for the poem itself and the relevant Minecraft functions probably have very different embeddings, but in practice it would be useful to ensure that the memories for how to use the Minecraft functions are retrieved whenever that poem is queried.

It seems like just querying a RAG DB for the vectors with the highest cosine similarity won't be super useful for this task. A query for poems will just find poem-like data. But we don't just want to find things with similar embeddings to poems; we want to find data that is useful for completing the task. Has there been any research into this time-series / associative type of RAG?
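
To make the limitation concrete, here is a minimal sketch of that vanilla retrieval step, assuming a plain numpy matrix of stored embeddings (`top_k_cosine`, `query_vec`, and `doc_vecs` are just illustrative names, not any particular vector DB's API). A poem-flavoured query embedding will score highest against other poem-like chunks, which is exactly why pure similarity search never surfaces the Minecraft-function memories.

```python
import numpy as np

def top_k_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k stored embeddings most cosine-similar to the query."""
    # Normalise so a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q
    # Highest-similarity documents first.
    return np.argsort(-sims)[:k]
```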


top 1 comments
saintshing@alien.top 1 points 10 months ago

I am not sure if I understand your question.

What exactly is your query? Is it "spell the poem in blocks"? Or do you really want a "poem" query to also return the part about spelling in Minecraft blocks, even though you haven't mentioned anything about Minecraft blocks?

These two things are not associated with each other in human language. I mean, you can create training data to force them to be embedded together if you want. You can also add a layer on top of the vector DB, so some metadata is stored together with the embedding, which can help you retrieve related documents.
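
To sketch that metadata idea: here is a minimal toy example (the in-memory dict, the `linked_ids` field, and the helper names are my own illustrative assumptions, not any specific vector DB's API). Each record stores explicit links to related records alongside its embedding; retrieval does the usual similarity search first, then follows the links, so a hit on the poem also drags in the Minecraft-function documents even though their embeddings are far apart.

```python
import numpy as np

# Toy in-memory "vector DB": each record holds an embedding plus metadata,
# including explicit links to related records (the associative part).
records = {
    "poem_1":         {"embedding": np.random.rand(384), "text": "...the poem...",
                       "linked_ids": ["mc_place_block", "mc_move"]},
    "mc_place_block": {"embedding": np.random.rand(384), "text": "how to place a block",
                       "linked_ids": []},
    "mc_move":        {"embedding": np.random.rand(384), "text": "how to move the agent",
                       "linked_ids": []},
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_emb: np.ndarray, k: int = 2) -> list[str]:
    # 1) Standard cosine-similarity search over all stored embeddings.
    ranked = sorted(records, key=lambda rid: cosine(query_emb, records[rid]["embedding"]),
                    reverse=True)
    hits = ranked[:k]
    # 2) Associative expansion: follow metadata links from each hit, so a
    #    poem hit also pulls in the Minecraft-function records it points to.
    expanded = list(hits)
    for rid in hits:
        for linked in records[rid]["linked_ids"]:
            if linked not in expanded:
                expanded.append(linked)
    return [records[rid]["text"] for rid in expanded]
```

How the links get populated is the real design question: they could be written by hand, derived from co-occurrence in the same episode or time window, or learned, which is where the fine-tuning-with-custom-training-data option comes back in.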