Comment by defrost 3 months ago

There's a legally challengeable assertion there: "trained on CSAM images".

I imagine an AI image generation model could be readily trained on images of adult soldiers at war and images of children from instagram and then be used to generate imagery of children at war.

I have zero interest in defending the exploitation of children, but the assertion that children must have been exploited in order to create images of children engaged in adult activities seems shaky. *

* FWIW I'm sure there are AI models out there that were trained on actual real-world CSAM ... it's the implied necessity that's being questioned here.

jsheard 3 months ago

It is known that the LAION dataset underpinning foundation models like Stable Diffusion contained at least a few thousand instances of real-life CSAM at one point. I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

https://www.theverge.com/2023/12/20/24009418/generative-ai-i...

  • defrost 3 months ago

    > I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

    I'd be hard pressed to prove that you definitely hadn't killed anybody ever.

    Legally, if it's asserted that these images are criminal because they are the product of a model trained on sources that contained CSAM, then the requirement would be to prove that assertion.

    With text and speech you could prompt the model to exactly reproduce a Sarah Silverman monologue and assert that proves her content was used in the training set, etc.

    Here the defense would ask the prosecution to demonstrate how to extract a copy of original CSAM.

    But your point is well taken, it's likely most image generation programs of this nature have been fed at least one image that was borderline jailbait and likely at least one that was well below the line.

    • 9rx 3 months ago

      > Legally if it's asserted that these images are criminal because they are the result of being the product of an LLM trained on sources that contained CSAM then the requirement would be to prove that assertion.

      Legally, possession of CSAM is against the law because there is an assumption that possession proves contribution to market demand, with an understanding that demand incentivizes production of supply, meaning that, given demand, children will be harmed again to produce more content to satisfy that demand. In other words, the intent is to stop future harm. This is why people have been prosecuted for things like suggestive cartoons that have no real-life events behind them. It is not illegal on the grounds of past events. The actual abuse is illegal on its own standing.

      The provenance of the imagery is irrelevant. What you need to prove is that your desire to have such imagery won't stimulate yourself or others to create new content with real people. If you could somehow prove that LLM content will satisfy all future thirst, problem solved! That would be world changing.

      • harshreality 3 months ago

        I'm somewhat sympathetic to that argument. However, it doesn't stop there.

        Violent video games prove contribution to market demand for FPS-style depictions of mass shootings or carjackings, so can/should we ban Call of Duty and Grand Theft Auto now?

        (Note that the "market demand" argument is subtly different from the argument that the games directly cause people to become more violent, either in general or by encouraging specific copycat violence. Studies on [lack of] direct violence causation are weak and disputed.)

      • Ferret7446 2 months ago

        Such an assumption is wrong in a world with AI generated CSAM. Why would suppliers go through the risk/cost of producing "actual" CSAM if they could AI generate it? Especially if the demand is for AI generated CSAM (someone who has AI generated CSAM is stimulating demand for AI generated CSAM by definition).

        Even for regular porn, which is far lower risk/cost, AI generation is becoming preferable (as with most technologies, the leading use case for AI is porn).

    • jsheard 3 months ago

      Framing it in that way is essentially a get out of jail free card - anyone caught with CSAM can claim it was AI generated by a "clean" model, and how would the prosecution ever be able to prove that it wasn't?

      I get where you are coming from but it doesn't seem actionable in any way that doesn't effectively legalize CSAM possession, so I think courts will have no choice but to put the burden of proof on the accused. If you play with fire then you'd better have the receipts.

      • _aavaa_ 3 months ago

        This seems like a long way of saying “guilty until proven innocent”.

  • lazyasciiart 3 months ago

    Then all image generation models should be considered inherently harmful, no?

    • spwa4 3 months ago

      But this is the dream for the supposed protectors of children. You see, just because child porn production stops doesn't mean those children disappear. Usually, of course, they go into youth services (in practice most don't even make it to the front door and run away to resume the child abuse, but let's ignore that). That is how the situation of those children changes when CSAM is prosecuted: from the situation they were in, to whatever situation exists in youth services. In other words, youth services is the upper limit on how much police and anyone CAN help those children.

      So you'd think they would make youth services a good place to be for a child, right? After all, if that situation were only marginally better than child prostitution, there's little point in finding CSAM. Or at least, the point is not to protect children, since that is simply not what they're doing.

      So how is youth services doing these days? Well ... NOT good. Regularly, children run away from youth services to start doing child porn (i.e. live off an OnlyFans account). There's a Netflix series on the subject ("Young and locked up") which eventually, reluctantly, shows the real problem, the outcome (i.e. prison or street poverty).

      In other words your argument doesn't really apply since the goal is not to improve children's well being. If that was the goal, these programs would do entirely different things.

      Goals differ. There are people who go into government with the express purpose to "moralize" and arrest people for offenses. Obviously, to them it's the arresting part that's important, not how serious the offense was and CERTAINLY not whether their actions actually help people. And then there are people who simply want a well-paying long-term job where they don't accomplish much. Ironically these are much less damaging, but they still seek to justify their own existence.

      Both groups really, really, really want ALL image generation models to be considered inherently harmful, as you say.

      • lazyasciiart 3 months ago

        Yea, the nonprofit for commercially sexually abused children that I volunteer at is a much better way to know about reality. But conspiracies are a comforting way to understand why things can't just be fixed, sure.

        • spwa4 2 months ago

          I have experienced child services from the inside, including talking to quite a few kids caught, because caught is the correct word, in "that world". There was not a SINGLE example of one who appreciated the change, and one killed himself. All were essentially locked up. All were aggressive, and for every last one the fault was the government's; in two cases teachers were actively involved in sending students to "that world". Needless to say, these teachers, one of whom was caught red-handed (by the press, no police involvement obviously), were entirely free, and the victims were locked up. ALL of these children had stories of police officers involved in "that world", none of whom had seen any punishment.

          I also can absolutely guarantee you: child services CANNOT protect a child from drugs. Child services CANNOT protect a child from prostitution or sex. Child services CANNOT protect a child from violence, whether they are violent themselves or victim. For the same reason the sea cannot protect fish from water. Sex, drugs and violence are pervasive in even the youngest groups in child services, with people either pretending they're not seeing it or in bad cases participating.

          There is not a doubt in my mind that had the police instead done nothing at all, the outcome for all those children would have been better, not worse. Every last one.

          Oh, and not because their situation was great and didn't damage them or any bullshit like that. They would have been better off because child services was orders of magnitude worse, didn't help, and DID fuck up any chance at a future they had. (A few, including me, went normally to school. Went, past tense, as in child services made that utterly impossible. Often kids were sent to "idiot-schools" because that got the institution more money, even though almost always the school, or someone at school, was the one positive influence in their lives. Child services ... just ... doesn't ... care about such things, brutally and violently changing the school. The kid who later committed suicide hit the guard at BOTH the idiot school AND his previous school, and after that got the principal to accept him back, by essentially staying, and sleeping, in the waiting chairs at his office for 3 straight days until he was allowed to talk, then biking to school for 3 hours every day.)

  • Hizonner 3 months ago

    I think you'd be hard-pressed to prove that a few thousand images (out of over 5 billion in the case of that particular data set) had any meaningful effect on the final model capabilities.

Hizonner 3 months ago

> There's a legally challengable assertion there; "trained on CSAM images".

"Legally challengeable" only in a pretty tenuous sense that's unlikely to ever have any actual impact.

That'll be something that's recited as a legislative finding. It's not an element of the offense; nobody has to prove that "on this specific occasion the model was trained in this or that way".

It could theoretically have some impact on a challenge to the constitutionality of the law... but only under pretty unlikely circumstances. First you'd have to get past the presumption that the legislature can make any law it likes regardless of whether it's right about the facts (which, in the US, probably means you have to get courts to take the law under strict scrutiny, which they hate to do). Then you have to prove that that factual claim was actually a critical reason for passing the law, and not just a random aside. Then you have to prove that it's actually false, overcoming a presumption that the legislature properly studied the issue. Then maybe it matters.

I may have the exact structure of that a bit wrong, but that's the flavor of how these things play out.

  • defrost 3 months ago

    My comment was in response to a portion of the comment above:

    > because the machine-learning models utilized by AI have been trained on datasets containing thousands of depictions of known CSAM victims

    I'd argue that CSAM imagery falls into two broad categories: actual photographic images of actual real abuse, and generated images (paintings, drawings, animations, etc.), with all generated images being more or less equally bad.

    There's a peer link in this larger thread ( https://en.wikipedia.org/wiki/Legal_status_of_fictional_porn... ) that indicates at least two US citizens have been charged and sentenced to 20 and 40 years' imprisonment respectively for the possession and distribution of "fictional" child abuse material (animated and still Japanese cartoons, anime, etc.).

    So, in the wider world, it's a moot point whether these specific images came from training on actual abuse images or not: they depict abuse, and that's legally sufficient in the US (apparently). Further, the same depictions could be generated with or without actual abuse images in the training data, and as equivalent images either way, they'd be equally offensive.

yellowapple 3 months ago

Exactly. The abundance of AI-generated renditions of Shrimp Jesus doesn't mean it was trained on actual photos of an actual Shrimp Jesus.