Comment by traverseda 2 days ago
I don't understand why problems like this aren't solved by vector similarity search. Indiana Jones lives in a particular part of vector space.
Too close to one of the licensed properties you care to censor the generation of? Push that vector around. Honestly, detecting whether a given sentence is a thinly veiled reference to Indiana Jones seems to be exactly the kind of thing AI vector search is going to be good at.
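To make the "too close? push it around" idea concrete, here is a rough sketch. It assumes the prompt and the protected property are already embedded as plain lists of floats. The `threshold` value, the toy 3-dimensional vectors, and the function names are all made up for illustration. "Pushing" is done here by projecting out the reference direction, which is just one of several ways you could do it.

```python
import math

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def push_away(vec, ref, threshold=0.8):
    """If vec is closer to ref than threshold, remove its component along ref.

    threshold is an arbitrary illustrative value, not a recommended one.
    """
    if cosine(vec, ref) < threshold:
        return list(vec)  # far enough away, leave it alone
    ref_norm_sq = sum(r * r for r in ref)
    scale = sum(x * r for x, r in zip(vec, ref)) / ref_norm_sq
    # Subtract the projection of vec onto ref, leaving only the orthogonal part.
    return [x - scale * r for x, r in zip(vec, ref)]

# Toy vectors: pretend the first axis is "the Indiana Jones direction".
indiana_jones = [1.0, 0.0, 0.0]
prompt = [0.9, 0.1, 0.0]
moved = push_away(prompt, indiana_jones)
```

After the push, `moved` has no component along the reference direction, so its cosine similarity to `indiana_jones` drops to zero. A real system would have many reference vectors and would probably want a gentler nudge than a full projection.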
Thinking of it in terms of vector similarity does seem appropriate, but then the definition of similarity comes into debate: if you don't get Harrison Ford but a different well-known actor, along with everything else Indiana Jones, what is that? Do you flatten the vector similarity matrix into a single infringement scale?
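A toy illustration of why the flattening step matters, assuming you somehow had one similarity number per aspect. The aspect names, the scores, and both sets of weights below are invented for the example, and a weighted average is only one possible way to collapse them.

```python
def infringement_score(similarities, weights):
    """Collapse per-aspect similarities into one number via a weighted average."""
    total_weight = sum(weights[aspect] for aspect in similarities)
    weighted = sum(similarities[aspect] * weights[aspect] for aspect in similarities)
    return weighted / total_weight

# Hypothetical case: different actor, everything else very Indiana Jones.
similarities = {"actor": 0.1, "costume": 0.95, "setting": 0.95, "plot": 0.95}

equal_weights = {"actor": 1.0, "costume": 1.0, "setting": 1.0, "plot": 1.0}
actor_heavy = {"actor": 7.0, "costume": 1.0, "setting": 1.0, "plot": 1.0}

score_equal = infringement_score(similarities, equal_weights)
score_actor_heavy = infringement_score(similarities, actor_heavy)
```

With equal weights the single score comes out high (about 0.74), while weighting the actor heavily drags it down (about 0.36). Same inputs, opposite verdicts, so whoever picks the weights is really deciding the question.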