Comment by traverseda 2 days ago

3 replies

I don't understand why problems like this aren't solved by vector similarity search. Indiana Jones lives in a particular part of vector space.

Too close to one of the licensed properties you care to censor the generation of? Push that vector around. Honestly, detecting whether a given sentence is a thinly veiled reference to Indiana Jones seems to be exactly the kind of thing AI vector search is going to be good at.
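To make the idea concrete, here is a minimal sketch of "check similarity, then push the vector around". It assumes embeddings are plain lists of floats; the `indy` vector, the 0.9 threshold, and the `push_away` helper are all made up for illustration and not taken from any real guardrail system.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def push_away(query, protected, threshold=0.9, strength=1.0):
    """Hypothetical helper: if query is too similar to protected,
    subtract (a fraction of) its component along protected."""
    if cosine(query, protected) < threshold:
        return list(query)  # far enough away, leave untouched
    norm_sq = sum(p * p for p in protected)
    scale = strength * sum(q * p for q, p in zip(query, protected)) / norm_sq
    return [q - scale * p for q, p in zip(query, protected)]

# Toy 3-d "embeddings"; real ones would have hundreds of dimensions.
indy = [1.0, 0.0, 0.0]   # stand-in for the protected property
near = [0.9, 0.1, 0.0]   # thinly veiled reference
far = [0.0, 1.0, 0.0]    # unrelated prompt

pushed = push_away(near, indy)
```

With `strength=1.0` the pushed vector ends up orthogonal to the protected one; a query that is exactly parallel to it would collapse to the zero vector, so a real system would need a fallback for that case.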

genericone 2 days ago

Thinking of it in terms of vector similarity does seem appropriate, and then the definition of similarity suddenly comes into debate: if you don't get Harrison Ford, but a different well-known actor along with everything else Indiana Jones, what is that? Do you flatten the vector similarity matrix to a single infringement scale?
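One way to picture the "flattening" question is as a choice of aggregation over per-attribute similarities. The sketch below is purely illustrative: the attribute names, the scores, and the weights are invented, and it only shows that a weighted mean and a max give different answers for the "different actor, everything else Indy" case.

```python
def weighted_mean_score(similarities, weights):
    """Collapse per-attribute similarities into one number via a weighted mean."""
    total = sum(weights[name] for name in similarities)
    return sum(similarities[name] * weights[name] for name in similarities) / total

def max_score(similarities):
    """Alternative: the single most similar attribute decides."""
    return max(similarities.values())

# Hypothetical scores: a different actor, but everything else matches closely.
similarities = {"actor": 0.2, "costume": 0.95, "props": 0.9, "setting": 0.9}
equal_weights = {name: 1.0 for name in similarities}

mean_value = weighted_mean_score(similarities, equal_weights)
max_value = max_score(similarities)
```

The mean is pulled down by the swapped actor while the max ignores it entirely, which is the debate in a nutshell: the scale you get depends on an aggregation rule somebody has to pick.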

htrp 2 days ago

Not worth it to compute the embedding for Indy and a "bull-whip archaeologist"; most guardrails operate at the input level, it seems?

  • gavmor 2 days ago

    > Not worth it to compute the embedding for Indy

    If IP holders submit embeddings for their IP, how can image generators "warp" the latent space around a set of embeddings so that future inferences slide around and avoid them--not perfectly, or literally, but as a function of distance, say, following a power curve?

    Maybe by "Finding non-linear RBF paths in GAN latent space"[0] to create smooth detours around protected regions.

    0. https://openaccess.thecvf.com/content/ICCV2021/papers/Tzelep...
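The distance-dependent "warp" asked about above could look roughly like the sketch below. To be clear, this is not the RBF-path method from the linked paper, just a simple radial push whose size falls off as a power of normalized distance. The `warp` function and its `radius`, `strength`, and `power` parameters are assumptions made up for the example.

```python
import math

def warp(z, center, radius=1.0, strength=0.5, power=2.0):
    """Push latent point z away from a protected center.

    Points at or beyond radius are untouched; closer points are moved
    outward by strength * (1 - dist / radius) ** power.
    """
    offset = [a - b for a, b in zip(z, center)]
    dist = math.sqrt(sum(o * o for o in offset))
    if dist >= radius or dist == 0.0:
        # Outside the protected region, or no direction to push in.
        return list(z)
    push = strength * (1.0 - dist / radius) ** power
    return [a + push * o / dist for a, o in zip(z, offset)]

protected = [0.0, 0.0]              # stand-in for a submitted IP embedding
inside = warp([0.5, 0.0], protected)
outside = warp([2.0, 0.0], protected)
```

Because the push shrinks smoothly to zero at `radius`, nearby inferences slide away from the protected point rather than hitting a hard wall, though nothing here guarantees the warped point still decodes to a sensible image.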