romanhn 4 days ago

Say I generate embeddings for a bunch of articles. Given the query "articles about San Francisco that don't mention cars" would cosine similarity uprank or downrank the car mentions? Assuming exclusions aren't handled well, what techniques might I use to support them?

stared 4 days ago

It's worth testing, but you'll likely get the "don't think about a pink elephant" effect. My guess is that for most embedding models, the query "articles about San Francisco that don't mention cars" ends up closest to articles about SF that do mention cars.

The fundamental issue here is comparing apples to oranges: questions and answers are different kinds of text.
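
A quick way to check the effect yourself (a minimal sketch, assuming a sentence-transformers model; the article snippets are made-up placeholders):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

query = "articles about San Francisco that don't mention cars"
docs = [
    "Driving across San Francisco: the best car routes over the hills.",
    "San Francisco's food scene, from the Mission to Chinatown.",
]

query_vec = model.encode(query)
doc_vecs = model.encode(docs)

# If the "pink elephant" effect holds, the car-heavy article scores
# higher despite the negation in the query.
for doc, score in zip(docs, util.cos_sim(query_vec, doc_vecs)[0]):
    print(f"{float(score):.3f}  {doc}")
```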

  • romanhn 4 days ago

    So is LLM pre/post-processing the best approach here?

mirekrusin 4 days ago

I think you have to separate it into a positive query and a negative query, run a (negative) ranking for the exclusion part, and combine the results yourself.
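
Something like this (a minimal sketch, assuming a sentence-transformers model; the penalty weight `neg_weight` and the way the query is split are illustrative choices you'd want to tune):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank(docs, positive_query, negative_query, neg_weight=0.5):
    pos_vec = model.encode(positive_query)
    neg_vec = model.encode(negative_query)
    scored = []
    for doc, vec in zip(docs, model.encode(docs)):
        # Reward similarity to what we want, penalize similarity
        # to the concept we want excluded.
        score = cosine(vec, pos_vec) - neg_weight * cosine(vec, neg_vec)
        scored.append((score, doc))
    return sorted(scored, key=lambda s: s[0], reverse=True)

# The exclusion is expressed as its own query instead of a negation
# buried inside the main query.
ranked = rank(
    docs=[
        "Driving across San Francisco: the best car routes over the hills.",
        "San Francisco's food scene, from the Mission to Chinatown.",
    ],
    positive_query="articles about San Francisco",
    negative_query="cars",
)
for score, doc in ranked:
    print(f"{score:.3f}  {doc}")
```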

breadislove 4 days ago

No, this won't work. Embedding models at the moment are pretty bad with negations.