Comment by barbazoo

joegibbs 14 hours ago

I think it would be very, very difficult - almost impossible - to create a dataset to train an image generator that doesn't contain any copyrighted material that you don't have the rights to. There's the obvious stuff like Mickey Mouse or Superman, you just run some other tool over it to filter them out, but there are so many ridiculous things that can be copyrighted (depictions of buildings, tattoos), things like crowd shots, pictures of cities that have ads in the background, that I don't know how you could do it. I'm sure even Adobe's stock library would have a lot of violations like that.

Reply View 0 replies

BadCookie 18 hours ago

Getty and Adobe offer models that were trained only on images that they have the rights to. Those models might meet Netflix’s standards?

Reply View 13 replies

Havoc 7 hours ago

Doesn’t seem likely that adobe has a owned collection of content big enough. Seems very likely that they just deemed the legal risk to be outweighed by commercial opportunity. They kinda had to - a product that generates stuff that gets you sued is not worth paying whatever they charge for their subscription

Reply View | 0 replies
AnthonyMouse 17 hours ago

I kind of wonder if that even works.
If you take a model trained on Getty and ask it for Indiana Jones or Harry Potter, what does it give you? These things are popular enough that it's likely to be present in any large set of training data, either erroneously or because some specific works incorporated them in a way that was licensed or fair use for those particular works even if it isn't in general.
And then when it conjures something like that by description rather than by name, how are you any better off than something trained from random social media? It's not like you get to make unlicensed AI India Jones derivatives just because Getty has a photo of Harrison Ford.

Reply View | 11 replies
- runeblaze 16 hours ago
  
  I work in this space. In traditional diffusion-based regimes (paired image and text), one can absolutely check the text to remove all occurrences of Indiana Jones. Likewise, Adobe Stock has content moderation that ensures (up to human moderation limit) no dirty content. It is a world without Indiana Jones to the model
  
  Reply View | 5 replies
  
  goatsi 16 hours ago
  
  If you ask the Adobe stock image generation for "Adventurer with a whip and hat portrait view , Brown leather hat, jacket, close-up"
  It gives you an image of Harrison Ford dressed like Indiana Jones.
  https://stock.adobe.com/ca/images/adventurer-with-a-whip-and...
  
  Reply View | 2 replies
  
  kpw94 16 hours ago
  
  > one can absolutely check the text to remove all occurrences of Indiana Jones
  How do you handle this kind of prompt:
  “Generate an image of a daring, whip-wielding archaeologist and adventurer, wearing a fedora hat and leather jacket. Here's some back-story about him: With a sharp wit and a knack for languages, he travels the globe in search of ancient artifacts, often racing against rival treasure hunters and battling supernatural forces. His adventures are filled with narrow escapes, booby traps, and encounters with historical and mythical relics. He’s equally at home in a university lecture hall as he is in a jungle temple or a desert ruin, blending academic expertise with fearless action. His journey is as much about uncovering history’s secrets as it is about confronting his own fears and personal demons.”
  Try copy-pasting it in any image generation model. It looks awfully like Indiana Jones for all my attempts, yet I've not referenced Indiana Jones even once!
  
  Reply View | 1 reply
  
  runeblaze 6 hours ago
  
  Emmmm sure, but throw this to a human artist who has not heard of Indiana Jones and see if they draw something alike.
  
  Reply View | 0 replies
- BadCookie 17 hours ago
  
  It comes down to who is liable for the edge cases, I suspect. Adobe will compensate the end user if they get sued for using a Firefly-generated image (probably up to some limit).
  Getting sued occasionally is a cost of doing business in some industries. It’s about risk mitigation rather than risk elimination.
  
  Reply View | 3 replies
  
  AnthonyMouse 9 hours ago
  
  Feels like "paying extra for the extended warranty" vibes. What it covers isn't much (do you expect someone to come after you in small claims court and if they do, was that your main concern?) meanwhile the big claim you're actually worried about is what it doesn't cover.
  And if you really wanted insurance then why not get it from an actual insurance company?
  
  Reply View | 1 reply
  
  rincebrain 4 hours ago
  
  Because almost everything is risk mitigation or reduction, not elimination.
  In particular, in the US, the legal apparatus has been gamified to the point that the expectation becomes people will sue if their expected value out of it is positive even if the case is insane on its merits, because it's much more likely someone with enough risk and cost will settle as the cheaper option.
  And in that world, there is nothing that completely eliminates the risk of being sued in bad faith - but the more things you put in your mitigation basket, the narrower the error bars are on the risk even if the 99.999th percentile is still the same.
  
  Reply View | 0 replies
  
  jonplackett 16 hours ago
  
  All the indemnities I’ve read have clauses though that say if you intentionally use it to make something copyrighted they won’t protect you.
  So if you put obviously copyrighted things in the prompt you’ll still be on your own.
  
  Reply View | 0 replies
- htrp 16 hours ago
  
  Adobe Firefly absolutely has a spider man problem.
  
  Reply View | 0 replies

TheRoque 14 hours ago

Whistleblowers, corporate leaks, output resembling copyrighted content etc. Basically it feels it's the same as the companies who unlawfully use licensed code as their own (e.g. without respecting GPL license)

Reply View 0 replies

hxseven 17 hours ago

Netflix could also use or provide their own TV/movie productions as training data.

Reply View 3 replies

simonw 17 hours ago

Lionsgate tried that and found that even their entire archive wasn't nearly enough to produce a useful model: https://www.thewrap.com/lionsgate-runway-ai-deal-ip-model-co... and https://futurism.com/artificial-intelligence/lionsgate-movie...

Reply View | 1 reply
- protocolture 16 hours ago
  
  This amuses me.
  Consumers have long wanted a single place to access all content. Netflix was probably the closest that ever got, and even then it had regional difficulties. As competitors rose, they stopped licensing their content to netflix, and netflix is now arguably just another face in the crowd.
  Now they want to go and leverage AI to produce more content and bam, stung by the same bee. No one is going to license their content for training, if the results of that training will be used in perpetuity. They will want a permanent cut. Which means they either need to support fair use, or more likely, they will all put up a big wall and suck eggs.
  
  Reply View | 0 replies
barbazoo 17 hours ago

Maybe now all that product placement is finally coming back to haunt them.

Reply View | 0 replies