Comment by burnished 2 days ago

Oooh, those guardrails make me angry. I get why they are there (don't poke the bear), but that doesn't make me overlook the self-serving hypocrisy involved.

Though I am also generally opposed to the notion of intellectual property altogether, on the basis that it doesn't seem to serve its intended purpose, and that whatever good could be salvaged from its various systems is already well covered by other existing legal concepts, e.g. deceptive behavior being prosecuted as a form of fraud.

teddyh 2 days ago

The problem is people at large companies creating these AI models, wanting the freedom to copy artists’ works when using it, but these large companies also want to keep copyright protection intact, for their regular business activities. They want to eat the cake and have it too. And they are arguing for essentially eliminating copyright for their specific purpose and convenience, when copyright has virtually never been loosened for the public’s convenience, even when the exceptions the public asks for are often minor and laudable. If these companies were to argue that copyright should be eliminated because of this new technology, I might not object. But now that they come and ask… no, they pretend to already have, a copyright exception for their specific use, I will happily turn around and use their own copyright maximalist arguments against them.

(Copied from a comment of mine written more than three years ago: <https://news.ycombinator.com/item?id=33582047>)

  • ToValueFunfetti 2 days ago

    I don't care for this line of argument. It's like saying you can't hold the position that trespassing should be illegal while also holding that commercial businesses should be legally required to have public restrooms. Yes, both positions are related to land rights, and the former is pro- while the latter is anti-, but it's a perfectly coherent set of positions. OpenAI can absolutely be anti-copyright in the sense of whether you can train an NN on copyrighted data, and pro-copyright in the sense of whether you can make an exact replica of some data and sell it as your own, without crossing into hypocrisy territory. It does suggest they're self-interested, but you have to climb a mountain in Tibet to find anybody who isn't.

    Arguments that make a case that NN training is copyright violation are much more compelling to me than this.

    • belorn a day ago

      The example you gave with public restrooms doesn't work, for two main reasons: businesses are usually paid for it by the government, and operating a company usually comes with benefits granted by the government. Industry regulation as a concept is generally justified by the fact that an industry gets "something" from society, and thus society can impose requirements in return.

      A regulation requiring restaurants to have a public bathroom is more akin to the regulation requiring restaurants to check ID when selling alcohol to young customers. Neither requirement has any relation to land rights; both relate to the right to operate a company that sells food to the public.

      • satvikpendem 6 hours ago

        > The example you gave with public restrooms doesn't work, for two main reasons: businesses are usually paid for it by the government, and operating a company usually comes with benefits granted by the government.

        This is not the case in the US, yet many places still have public restrooms, because they benefit the customers themselves regardless of government involvement.

      • trentlott a day ago

        But what if businesses got benefits from society and tax money, yet were free to ignore the needs and desires of the people who pay those taxes and make up that society? That seems just about right.

    • TremendousJudge 2 days ago

      No, the exception they are asking for (that they can train on copyrighted material and the image produced is non-infringing) is copyright infringement in the most basic sense.

      I'll prove it by induction: imagine that I have a service where I "train" a model on a single image of Indiana Jones. Now you prompt it, and my model "generates" the same image. I sell you this service, and no money goes to the copyright holder of the original image. This is obviously infringement.

      There's no reason why training on a billion images is any different, besides the fact that the lines are blurred by the model weights not being parseable.

      • slidehero a day ago

        > There's no reason why training on a billion images is any different

        You gloss over this as if it's a given. I don't agree. I think you're doing a different thing when you're sampling billions of things equally.

  • jofla_net 2 days ago

    I guess the best explanation for what we're witnessing is the notion that "money talks", and sadly nothing more. To think that's all fair-use activists lacked in years past…

theshrike79 a day ago

It's not just the guardrails, but the ham-fisted implementation.

Grok is supposed to be "uncensored", but there are very specific words you just can't use when asking it to generate images. It'll just flat out refuse or give an error message during generation.

But, again, if you go in a roundabout way and avoid the specific terms you can still get what you want. So why bother?

Is it about not wanting bad PR or avoiding litigation?

  • mrweasel a day ago

    The implementation is what gets to me too. Fair enough that a company doesn't want their LLM used in a certain way. That's their choice, even if it's just to avoid getting sued.

    How they then go about implementing those guardrails is pretty telling about their understanding of, and control over, what they've built, and about their line of thinking. Clearly, at no point before releasing their LLMs onto the world did anyone stop and ask: hey, how do we deal with these things generating unwanted content?

    Resorting to blocking certain terms in the prompts is like searching for keywords in spam emails. "Hey Jim, I got another spam email from that Chinese tire place" - "No worries, boss, I've configured the mail server to just delete any email containing the words 'China' or 'tire'".

    Some journalist should go to a few of these AI companies and start asking questions about the long-term effectiveness and viability of just blocking keywords in prompts.
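
The keyword-blocklist approach mrweasel describes can be sketched in a few lines. This is a hypothetical illustration only, using made-up terms from the spam analogy above, not any vendor's actual filter:

```python
# Naive guardrail sketch: reject a prompt if it contains a blocked term.
# BLOCKED_TERMS and is_blocked() are invented for this illustration.

BLOCKED_TERMS = {"china", "tire"}

def is_blocked(prompt: str) -> bool:
    """Return True if any blocked term appears as a word in the prompt."""
    words = prompt.lower().split()
    return any(term in words for term in BLOCKED_TERMS)

# The exact term is caught...
print(is_blocked("cheap tire deals"))          # True
# ...but a trivial rephrasing sails straight through:
print(is_blocked("cheap wheel rubber deals"))  # False
```

This is the weakness both commenters point at: the filter matches surface tokens, not meaning, so any roundabout phrasing that avoids the exact terms bypasses it entirely.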