samme013@alien.top 1 points 10 months ago

Couple of key considerations:

Do you always know the "domain" a given query is related to?
Are there cases where documents outside of the domain of the query could be useful?

If you always know the domain and only ever care about documents within it, then I would use a hard filter. If either answer is fuzzy, I would test retrieval with and without filters and compare the results. A good embedding model should be able to match only relevant topics without hard filters, but depending on the data, adding hard filters could still be worth it. Make a representative list of queries you might encounter and check the documents being returned.
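
To make that with/without comparison concrete, here is a rough sketch in plain Python. Everything in it is made up for illustration: the toy `DOCS` corpus, the hand-written 3-dimensional vectors, the domain names, and the `search` and `compare` helpers are all hypothetical. In a real setup the vectors would come from your embedding model and the filter would be whatever metadata filter your vector store offers.

```python
import math

# Hypothetical toy corpus: (doc_id, domain, embedding).
# The vectors are invented; real ones would come from an embedding model.
DOCS = [
    ("d1", "billing", [0.9, 0.1, 0.0]),
    ("d2", "billing", [0.8, 0.2, 0.1]),
    ("d3", "shipping", [0.1, 0.9, 0.0]),
    ("d4", "shipping", [0.7, 0.3, 0.0]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, k=2, domain=None):
    """Top-k doc ids by similarity; if domain is given, apply it as a hard filter."""
    candidates = [d for d in DOCS if domain is None or d[1] == domain]
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d[2]), reverse=True)
    return [d[0] for d in ranked[:k]]


def compare(queries, k=2):
    """Run each (query_vec, domain) pair both ways and record whether results match."""
    report = []
    for query_vec, domain in queries:
        unfiltered = search(query_vec, k)
        filtered = search(query_vec, k, domain)
        report.append({
            "domain": domain,
            "unfiltered": unfiltered,
            "filtered": filtered,
            "same": unfiltered == filtered,
        })
    return report


if __name__ == "__main__":
    # Representative queries, each tagged with the domain we believe it belongs to.
    queries = [
        ([1.0, 0.2, 0.0], "billing"),
        ([0.6, 0.5, 0.0], "shipping"),
    ]
    for row in compare(queries):
        print(row)
```

Queries where the two result lists agree suggest the embeddings already separate the domains well enough. Queries where they differ are the ones to read by hand: either the filter is cutting out something useful, or it is keeping out-of-domain noise away.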