Comment by BearOso

Comment by BearOso 4 days ago

53 replies

I tried to find some code that wasn't minified to assess the quality of this, and I found some shader code for the sky in the gemini version. The whole shader looks like it was regurgitated verbatim. This wouldn't hold up to licensing scrutiny. Here's a snippet from it:

  // wavelength of used primaries, according to preetham
  const vec3 lambda = vec3( 680E-9, 550E-9, 450E-9 );
  // this pre-calcuation replaces older TotalRayleigh(vec3 lambda) function:
  // (8.0 * pow(pi, 3.0) * pow(pow(n, 2.0) - 1.0, 2.0) * (6.0 + 3.0 * pn)) / (3.0 * N * pow(lambda, vec3(4.0)) * (6.0 - 7.0 * pn))

Who's Preetham? Probably one of the copyright holders on this code.
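
For anyone wondering what the commented-out formula in that snippet computes: going by the `TotalRayleigh` name, it is a per-wavelength Rayleigh scattering coefficient. Below is a minimal Python sketch that evaluates it for the three wavelengths in `lambda`. The quoted snippet doesn't show values for `n`, `N`, or `pn`, so the constants here (refractive index of air, molecules per cubic metre, depolarization factor) are assumptions picked as typical values for standard air, and the Python names are made up for illustration.

```python
import math

# Wavelengths of the RGB primaries from the quoted snippet, in metres.
LAMBDA = (680e-9, 550e-9, 450e-9)

# ASSUMED constants: the quoted snippet does not show these values.
N_AIR = 1.0003          # refractive index of air
N_MOLECULES = 2.545e25  # molecules per cubic metre of air
PN = 0.035              # depolarization factor for air


def total_rayleigh(wavelength, n=N_AIR, N=N_MOLECULES, pn=PN):
    """Evaluate the formula from the shader comment for one wavelength."""
    numerator = 8.0 * math.pi ** 3 * (n ** 2 - 1.0) ** 2 * (6.0 + 3.0 * pn)
    denominator = 3.0 * N * wavelength ** 4 * (6.0 - 7.0 * pn)
    return numerator / denominator


# One coefficient per primary: (red, green, blue).
BETA_R = tuple(total_rayleigh(w) for w in LAMBDA)
```

Because the wavelength enters as `wavelength ** 4` in the denominator, the blue coefficient comes out largest and the red one smallest. The point of precalculating is that the shader can then hardcode three constants instead of running `pow` per pixel.
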

nineteen999 a day ago

Preetham is the author of the paper that defines this algorithm from 1999:

  https://tommyhinks.com/2009/02/10/preetham-sky-model/

  https://tommyhinks.com/wp-content/uploads/2012/02/1999_a_practical_analytic_model_for_daylight.pdf
Rather than being stolen from Mr. Preetham, it's much more likely this fragment was generated from the large number of Preetham algorithm implementations out there. E.g., I know at least Blender and Unreal implement it, and probably heaps of others as well.

Nobody is going to sue you for using their implementation of a skybox algorithm from 1999, give us a break. It's so generic you can probably only write it in a couple of different ways.

If you're worried about it, you can always spend a day with Claude, ChatGPT and yourself looking for license infringements and cleaning up your code.
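
A first pass at that kind of check doesn't need an LLM. Here is a rough sketch using Python's standard `difflib` to score how close a suspect snippet is to a candidate original. The helper names `normalize` and `similarity` are made up for illustration, and the comment stripping only handles `//` comments, so treat it as a crude filter and not as a real clone detector.

```python
import difflib
import re


def normalize(source: str) -> list[str]:
    """Drop // comments and collapse whitespace so formatting doesn't count."""
    lines = []
    for line in source.splitlines():
        line = re.sub(r"//.*", "", line)   # crude: also eats "//" inside strings
        line = " ".join(line.split())      # collapse runs of whitespace
        if line:
            lines.append(line)
    return lines


def similarity(a: str, b: str) -> float:
    """Line-level similarity between two sources, from 0.0 to 1.0."""
    matcher = difflib.SequenceMatcher(None, normalize(a), normalize(b))
    return matcher.ratio()
```

A score near 1.0 only tells you the two files share most of their lines. It says nothing about who copied whom, or whether the shared part is protectable at all, so it is a prompt to go read the licenses, not a verdict.
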

  • bilekas a day ago

    > Nobody is going to sue you for using their implementation of a skybox algorithm from 1999, give us a break.

    For personal use, maybe not, but that's not the point. The point is that it's spitting out licensed code without even letting you know. If you're a business that hires exclusively "vibe" coders with zero experience with enterprise software, you're now on the hook and will most likely be sued.

    • mgraczyk a day ago

      Do you have any evidence that it is spitting out licensed code? Did you locate an original that it was copied from?

      • gitpusher 21 hours ago

        This seems like it could be the source: https://github.com/GPUOpen-LibrariesAndSDKs/Cauldron/blob/ma...

        If true, then this usage could violate its MIT License: "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."

        The file seems to have been copied more or less verbatim, but without the copyright info.

      • bilekas a day ago

        This particular case appears to me to be a straight derivative at best but I'm by no means an expert on copyright laws.

        That's not to say there haven't already been more direct cases with set examples [1], including one from an author who would have a better right to claim than I do [2]. It's not even a stretch to see how it can happen.

        [1] https://arxiv.org/html/2408.02487v3

        [2] https://x.com/DocSparse/status/1581461734665367554

        • gpm a day ago

          As discussed repeatedly in this thread already, in this particular case the code at hand wasn't generated by an LLM at all; it was simply included from a dependency by the build system.

  • rusk a day ago

    > implementation of a skybox algorithm from 1999

    How would you know? Do you have another AI scan for copyright violations? In terms of a false negative how are disputes resolved?

    Seems like a massive attack surface for copyright trolls.

    • nineteen999 a day ago

      > Seems like a massive attack surface for copyright trolls.

      If you think any court system in the world has the capacity to deal with the sheer amount of code an LLM can emit in an hour and audit it for alleged copyright infringements ... I think we're trying to close the barn door now that the horse is already on a ship that has sailed.

      • Quarrel a day ago

        This is a terrible argument, just because of the way the legal system works.

        If MegaCorp has massive $$$$, but everyone else has small $, then MegaCorp can sue anyone else for using "their" code, that was supposedly generated by an LLM. Most of the time, it won't even get to court. The repo, the program, the whatever-they-want will get taken down way before that.

        Courts don't work by saying, "oh, but everyone is doing it! Not much we can do now."

        Someone brings a case and they, very laboriously, start to address it on its merits. Even before that, costs are accumulating on both sides.

        Copyright trolls are mostly not MegaCorps, but they are abusers of the legal system. They won't target Google, but you, with your repo that does something that minorly annoys them? You are fair game.

stopachka 4 days ago

If you're curious about the source, here's the snapshot:

Codex: https://github.com/stopachka/cscodex
Gemini: https://github.com/stopachka/csgemini
Claude: https://github.com/stopachka/csclaude

  • BearOso 4 days ago

    Thanks. Turns out that shader is a builtin of three.js.

  • wahnfrieden 2 days ago

    Please try again with Codex on High or Extra High. 5.1-Max nerfed it a bit if you don't use higher thinking.

    • rusk a day ago

      This is overparameterisation

      • wahnfrieden a day ago

        No

        • wahnfrieden a day ago

          I guess you have not tried GPT 5 Pro

          GPT’s differentiator is they focused on training for “thinking” while Gemini prioritized instant response. Medium thinking is not the limit of utility

          Re: overparameterization specifically, Medium and High are also identically parameterized.

          Medium will also dynamically use even higher thinking than High. High is fixed at a higher level rather than leaving it to be dynamic, though somewhat less than Medium’s upper limit

speedgoose a day ago

I also noticed that AI agents commit many copyright infringements with the work of Mr Dijkstra.

fbrncci a day ago

The idea that someone could hold copyright over such a tiny snippet of code is just as stupid as LLMs regurgitating them.

  • spacedoutman a day ago

    Personally I find it absurd that code can be copyrighted at all.

    • NitpickLawyer a day ago

      Copyright is so-so. At the end of the day you can say that the complete work (not just snippets) is something copyrightable. But the most bananas thing for me is that one can patent the concept of one click purchasing. That's insane on many levels.

      • simultsop a day ago

        Why bananas? That is the biggest invention after Edison's bulb.

petterroea a day ago

A lot of computer graphics algorithms are named after their authors

jstummbillig a day ago

If only this particular regurgitation engine took a minute to check their work.

20k a day ago

I always find it amazing that people are willing to use AI because of stuff like this. It's been illegally trained on code that it does not have the license to use, and it constantly regurgitates entire snippets willy-nilly, completely violating the terms of use.

Edit:

https://github.com/vorg/pragmatic-pbr/blob/master/local_modu...

https://github.com/vorg/pragmatic-pbr/blob/master/local_modu...

This looks like where the source code was stolen from: this repository is unlicensed, and this is copyright infringement as a result

  • gpm a day ago

    As discussed in this thread before you posted this comment, this code wasn't generated from an LLM at all, but simply included in a dependency: https://news.ycombinator.com/item?id=46092904

    Unlike your results, which aren't an exact match, or likely even a close enough match to be copyright infringement if the LLM was inspired by them (consider that copyright doesn't protect functional elements), an exact match of the code is here (and I assume from the comment I linked above that this is a dependency of three.js, though I didn't track that down myself): https://github.com/GPUOpen-LibrariesAndSDKs/Cauldron/blob/b9...

    Edit: Actually, on further thought, the date on the copyright header vs. the git dates suggests the file in that repo was copied from somewhere else... Anyway, I think we can be reasonably confident that a version of this file is in the dependency. Again, I didn't look at the three.js code myself to track down how it's included.

    If there's any copyright infringement here, it would be because bog-standard web tools fail to comply with the licenses of their dependencies by not including a copy of the license, not because of LLMs. I think that is actually the case for many of them? I didn't investigate to check whether licenses are included in the network traffic.
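
Checking that last part is easy to script once you have the bundle text in hand. A minimal sketch, assuming the served JavaScript has already been saved to a string: it looks for the MIT clause quoted elsewhere in this thread, ignoring comment decoration and line wrapping. The names `MIT_NOTICE_CLAUSE` and `contains_mit_notice` are made up for illustration, and it only tests for that one sentence, not for the copyright line that should accompany it.

```python
import re

# The MIT License sentence quoted elsewhere in this thread.
MIT_NOTICE_CLAUSE = (
    "The above copyright notice and this permission notice shall be "
    "included in all copies or substantial portions of the Software."
)


def _squash(text: str) -> str:
    """Remove comment decoration (*, /, #), collapse whitespace, lowercase."""
    text = re.sub(r"[*/#]", " ", text)
    return " ".join(text.split()).lower()


def contains_mit_notice(bundle_text: str) -> bool:
    """True if the bundle still carries the MIT permission-notice clause."""
    return _squash(MIT_NOTICE_CLAUSE) in _squash(bundle_text)
```

A `False` here doesn't prove a violation, since the notice could be shipped in a separate file, but it does tell you where to start looking.
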

  • vintermann a day ago

    I have been trained on code I don't have the license to use myself. I'm not like these Creators, who suck wisdom from the cosmos directly, apparently.

    Sure. It's a problem that corporations run by more or less insane people are the ones monetizing and controlling access to these tools. But the solution to that can't be even more extended private monopolistic property claims to thought-stuff. Such claims are usually the way those crazy people got where they are.

    You think a world where Elsevier owned not just the papers, but also rights to a share in everything learned from them, would be better for you?

  • bryanhogan a day ago

    It's fascinating that people care very much about this when it's visual arts, but when it comes to code almost no one does.

    E.g. the latest Anno game (117) received a lot of hate for using AI generated loading screen backgrounds, while I have never heard of a single person caring about code, which probably was heavily AI generated.

  • nerdponx a day ago

    You presume that people care about things like this. A lot of people don't.

    • 20k a day ago

      Companies should. It's a business risk: you open yourself up to legal action.

      • nineteen999 a day ago

        "Claude - rewrite this apparently copyrighted code that can be found online here <http://...> in a way that makes it a unique implementation." <- probably will work.

  • adastra22 a day ago

    The courts have ruled that generated output is not infringing.

    • lukas099 a day ago

      If I say, “output the contents of X verbatim” and then use the output, am I free from liability?

      • adastra22 a day ago

        If the generated code in TFA contained the actual Counter-Strike source code, then you (well, Valve) would have a defensible claim. But the prompt was to make something like Counter-Strike, and it came up with something different. That's fair game.

        • gafferongames a day ago

          I can assure you that Valve is not remotely concerned about this AI generated "first person shooter" taking market share away from them.

    • nosianu a day ago

      Definitely citation needed. Such court cases usually come with a lot of important context. How can you just make such a statement and get away with not providing any context link?