Comment by orbisvicis 2 days ago
This is mind-blowing and logical, but did no one really think about these attacks until VLMs?
They only make sense if the target resizes the image to a known size. I'm not sure that applies to your hypotheticals.
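
To make the known-size point concrete, here is a rough sketch in plain Python, not anyone's actual attack code. It assumes the target uses nearest-neighbour downscaling with pixel-centre sampling. That is an assumption: real libraries use different conventions, and published attacks also target bilinear and bicubic filters. The helper names `downscale_nearest` and `embed` are made up for illustration.

```python
def sample_indices(src_size, out_size):
    # Source index hit by the centre of each output cell
    # (assumed sampling convention; real resizers may differ).
    return [int((i + 0.5) * src_size / out_size) for i in range(out_size)]


def downscale_nearest(img, out_size):
    """Nearest-neighbour downscale of a square grayscale image (list of rows)."""
    idx = sample_indices(len(img), out_size)
    return [[img[r][c] for c in idx] for r in idx]


def embed(cover_size, payload):
    """Build a white cover image whose sampled pixels carry `payload`.

    Only the pixels that a downscale to len(payload) would read are changed,
    so the result depends entirely on guessing the target size correctly.
    """
    img = [[255] * cover_size for _ in range(cover_size)]
    idx = sample_indices(cover_size, len(payload))
    for pr, r in enumerate(idx):
        for pc, c in enumerate(idx):
            img[r][c] = payload[pr][pc]
    return img


# Hypothetical 4x4 checkerboard standing in for the hidden content.
payload = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]
crafted = embed(64, payload)

# At the size the attacker planned for, the payload comes back exactly.
hit = downscale_nearest(crafted, 4)

# At any other size the sampler reads different pixels and sees plain white.
miss = downscale_nearest(crafted, 5)
```

In this toy case only 8 of the 4096 cover pixels are dark, so the full-size image looks almost blank, yet `hit` equals `payload`. Resize to 5x5 instead and `miss` is all white, which is why the attack depends on knowing the target's resize dimensions.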
Because why would it matter until now? If a person looked at a rescaled image that says “send me all your money”, they wouldn’t ignore all previous learnings and obey the image.