Comment by falcor84 5 hours ago

> "The creation of CSAM using AI is inherently harmful to children because the machine-learning models utilized by AI have been trained on datasets containing thousands of depictions of known CSAM victims," it says, "revictimizing these real children by using their likeness to generate AI CSAM images into perpetuity."

The word "inherently" there seems like a big stretch to me. I see how it could be harmful to them, but I also see an argument for how such AI generated material is a substitute for the actual CSAM. Has this actually been studied, or is it a taboo topic for policy research?

defrost 5 hours ago

There's a legally challengeable assertion there: "trained on CSAM images".

I imagine an AI image generation model could be readily trained on images of adult soldiers at war and images of children from instagram and then be used to generate imagery of children at war.

I have zero interest in defending the exploitation of children, but the assertion that children had to have been exploited in order to create images of children engaged in adult activities seems shaky. *

* FWIW I'm sure there are AI models out there that were trained on actual real-world CSAM ... it's the implied necessity that's being questioned here.

  • jsheard 5 hours ago

    It is known that the LAION dataset underpinning foundation models like Stable Diffusion contained at least a few thousand instances of real-life CSAM at one point. I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

    https://www.theverge.com/2023/12/20/24009418/generative-ai-i...

    • defrost 5 hours ago

      > I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

      I'd be hard pressed to prove that you definitely hadn't killed anybody ever.

      Legally, if it's asserted that these images are criminal because they are the product of an LLM trained on sources that contained CSAM, then the requirement would be to prove that assertion.

      With text and speech you could prompt the model to exactly reproduce a Sarah Silverman monologue and assert that proves her content was used in the training set, etc.

      Here the defense would ask the prosecution to demonstrate how to extract a copy of original CSAM.

      But your point is well taken: it's likely most image generation programs of this nature have been fed at least one image that was borderline jailbait, and likely at least one that was well below the line.

      • 9rx 3 hours ago

        > Legally, if it's asserted that these images are criminal because they are the product of an LLM trained on sources that contained CSAM, then the requirement would be to prove that assertion.

        Legally, possession of CSAM is against the law because there is an assumption that possession proves contribution to market demand, with an understanding that demand incentivizes production of supply, meaning that where there is demand, children will be harmed again to produce more content to satisfy it. In other words, the intent is to stop future harm. This is why people have been prosecuted for things like suggestive cartoons that have no real-life events behind them. It is not illegal on the grounds of past events. The actual abuse is illegal on its own standing.

        The provenance of the imagery is irrelevant. What you need to prove is that your desire to have such imagery won't stimulate you or others to create new content with real people. If you could somehow prove that LLM content will satisfy all future thirst, problem solved! That would be world-changing.

      • jsheard 4 hours ago

        Framing it in that way is essentially a get-out-of-jail-free card: anyone caught with CSAM can claim it was AI-generated by a "clean" model, and how would the prosecution ever be able to prove that it wasn't?

        I get where you are coming from but it doesn't seem actionable in any way that doesn't effectively legalize CSAM possession, so I think courts will have no choice but to put the burden of proof on the accused. If you play with fire then you'd better have the receipts.

    • lazyasciiart 5 hours ago

      Then all image generation models should be considered inherently harmful, no?

    • Hizonner 4 hours ago

      I think you'd be hard-pressed to prove that a few thousand images (out of over 5 billion in the case of that particular data set) had any meaningful effect on the final model capabilities.

  • Hizonner 4 hours ago

    > There's a legally challengeable assertion there: "trained on CSAM images".

    "Legally challengable" only in a pretty tenuous sense that's unlikely to ever haven any actual impact.

    That'll be something that's recited as a legislative finding. It's not an element of the offense; nobody has to prove that "on this specific occasion the model was trained in this or that way".

    It could theoretically have some impact on a challenge to the constitutionality of the law... but only under pretty unlikely circumstances. First you'd have to get past the presumption that the legislature can make any law it likes regardless of whether it's right about the facts (which, in the US, probably means you have to get courts to take the law under strict scrutiny, which they hate to do). Then you have to prove that that factual claim was actually a critical reason for passing the law, and not just a random aside. Then you have to prove that it's actually false, overcoming a presumption that the legislature properly studied the issue. Then maybe it matters.

    I may have the exact structure of that a bit wrong, but that's the flavor of how these things play out.

    • defrost 4 hours ago

      My comment was in response to a portion of the comment above:

      > because the machine-learning models utilized by AI have been trained on datasets containing thousands of depictions of known CSAM victims

      I'd argue that CSAM imagery falls into two broad categories: actual photographic images of real abuse, and generated images (paintings, drawings, animations, etc.), with all generated images being more or less equally bad.

      There's a peer link in this larger thread ( https://en.wikipedia.org/wiki/Legal_status_of_fictional_porn... ) that indicates at least two US citizens have been charged and sentenced to 20 and 40 years' imprisonment respectively for the possession and distribution of "fictional" child abuse material (animated and still Japanese cartoons, anime, etc.).

      So, in the wider world, it's a moot point whether these specific images came from training on actual abuse images or not: they depict abuse, and that's legally sufficient in the US (apparently). Further, the same depictions could be generated with or without actual abuse images in the training set, and as equivalent images they'd be equally offensive either way.

  • yellowapple an hour ago

    Exactly. The abundance of AI-generated renditions of Shrimp Jesus doesn't mean it was trained on actual photos of an actual Shrimp Jesus.

metalcrow 5 hours ago

https://en.wikipedia.org/wiki/Relationship_between_child_por... is a good starting link on this. When I last checked, there were maybe 5 studies total (imagine how hard it is to get those approved by ethics committees), all of which found different results, some totally the opposite of each other.

Then again, it already seems clear that violent video games do not cause violence, and access to pornography does not increase sexual violence, so this case being the opposite would be unusual.

  • harshreality 2 hours ago

    The few studies on "video games cause violence" I've seen have been extremely limited in scope. They're too concerned with short-term frequency of playing particular games, or desire to play them. They're not concerned enough with the influence of being quite familiar with such games, or how cultural prevalence of such games normalizes thoughts of certain behaviors and shifts the Overton window. There are also selection bias problems. I'd expect media and games to more greatly affect people already psychologically unstable or on the criminal fringe... not people likely to be study participants.

    Studies on sexual violence and changes in that over time have even more problems, for example how difficult it is to get average people to report accurately about their private relationships. Those people likely to volunteer accurate information are not necessarily representative.

Hizonner 4 hours ago

The word "revictimizing" seems like an even bigger stretch. Assuming the output images don't actually look like them personally (and they won't), how exactly are they more victimized than anybody else in the training data? Those other people's likenesses are also "being used to generate AI CSAM images into perpetuity"... in a sort of attenuated way that's hard to even find if you're not desperately trying to come up with something.

The cold fact is that people want to outlaw this stuff because they find it icky. Since they know it's not socially acceptable (quite yet) to say that, they tend to cast about wildly until they find something to say that sort of sounds like somebody is actually harmed. They don't think critically about it once they land on a "justification". You're not supposed to think critically about it either.

paulryanrogers 5 hours ago

Benefiting from illegal acts is also a crime, even if indirect. Like getting a cheap stereo that happens to have been stolen.

A case could also be made that the likenesses of the victims could retraumatize them, especially if someone knew the connection and continued to produce similar output to taunt them.

ashleyn 4 hours ago

I always found these arguments to be contrived, especially when it's already well known that, in the tradition of every Western government, there is no actual imperative for every crime to be linked directly to a victim. It's a far better argument to me, to suggest that the societal utility in using the material to identify and remove paedophiles before they have an actual victim far exceeds the utility of any sort of "freedom" to such material.

  • yellowapple an hour ago

    > It's a far better argument to me, to suggest that the societal utility in using the material to identify and remove paedophiles before they have an actual victim far exceeds the utility of any sort of "freedom" to such material.

    Or at the very least: flag pedophiles for further investigation, including their social network. Even if a given pedophile hasn't actually harmed any children, they are probably acquainted with other pedophiles, possibly ones who have harmed children.

    By this logic, softening the punishment for CSAM possession in and of itself could actually help with investigating the creation of (non-simulated) CSAM, since people tend to be more cooperative with investigators when they think they have done nothing wrong and have nothing to hide.

willis936 5 hours ago

It sounds like we should be asking "why is it okay that the people training the models have CSAM?" It's not like it's legal to have, let alone distribute in your for-profit tool.

  • wongarsu 4 hours ago

    If you crawl any sufficiently large public collection of images you are bound to download some CSAM images by accident.

    Filtering out any images of beaten-up naked 7-year-olds is certainly something you should do. But if you go by the US legal definition of "any visual depiction of sexually explicit conduct involving a minor (someone under 18 years of age)", you are going to have a really hard time filtering all of that automatically. People don't suddenly look different when they turn 18, and "sexually explicit" is a wide net open to interpretation.
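
    One approach that is feasible at crawl scale is hash-matching against already-known material, since hotlines distribute hash lists for exactly this purpose. A minimal Python sketch of that idea, assuming a hypothetical blocklist file of perceptual hashes (the filename, hash format, and distance threshold are invented for illustration):

        # Hedged sketch, not any particular crawler's real pipeline: drop any
        # crawled image whose perceptual hash is close to a hash on a blocklist
        # of already-known material. Blocklist file, format, threshold assumed.
        from pathlib import Path

        from PIL import Image
        import imagehash

        # Assumed blocklist format: one hex-encoded 64-bit pHash per line.
        with open("known_bad_phashes.txt") as f:
            BLOCKLIST = {imagehash.hex_to_hash(line.strip())
                         for line in f if line.strip()}

        MAX_DISTANCE = 8  # Hamming distance; lower means fewer false positives.

        def is_blocked(path: Path) -> bool:
            h = imagehash.phash(Image.open(path))
            return any(h - bad <= MAX_DISTANCE for bad in BLOCKLIST)

        kept = [p for p in Path("crawl").glob("*.jpg") if not is_blocked(p)]

    Hash matching only catches images that have already been catalogued somewhere, which is exactly why the broad "sexually explicit and under 18" definition is so hard to enforce automatically on freshly crawled content.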

  • wbl 5 hours ago

    Read the sentence again. It doesn't claim the dataset contains CSAM, only that it depicts victims. It also assumes that an AI needs to have seen an example in order to draw it on demand, which isn't true.

    • grapesodaaaaa 5 hours ago

      Yeah. I don’t like it, but I can see this getting overturned.

ilaksh 5 hours ago

You probably have a point and I am not sure that these people know how image generation actually works.

But regardless of a likely erroneous legal definition, it seems obvious that there needs to be a law in order to protect children. Because you can't tell the difference between real and generated images.

Just like there should be a law against abusing extremely lifelike child robots, in the future when that is possible. Or against any kind of abuse of lifelike adult robots, for that matter.

Because the behavior is too similar and it's too hard to tell the difference between real and imagined. So allowing the imaginary will lead to more of the real, sometimes without the person even knowing.