Comment by gamblor956

Comment by gamblor956 16 hours ago

6 replies

> That can easily be written off as fair use.

No, it really couldn't. In fact, it's very persuasive evidence that Llama is straight up violating copyright.

It would be one thing to be able to "predict" a paragraph or two. It's another thing entirely to be able to predict 42% of a book that is several hundred pages long.

reedciccio 16 hours ago

Is it Llama violating the "copyright" or is it the researcher pushing it to do so?

  • lern_too_spel 14 hours ago

    If you distribute a zip file of the book, are you violating copyright, or is it the person who unzips it?

    • TeMPOraL 10 hours ago

      If you walk through the N-gram database with a copy of Harry Potter in hand and observe that for N=7, you can find any piece of it in the database with above-average frequency, does that mean the N-gram database is violating copyright?
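A minimal sketch of the lookup described in the comment above, for readers who want to see the mechanics. Everything here is an assumption made for illustration: `build_database` is a hypothetical stand-in for an N-gram database (a plain Python set built from whitespace-split text), and `coverage` checks only whether each N-gram is present, not whether it occurs with above-average frequency.

```python
def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_database(corpus_texts, n):
    """Hypothetical N-gram database: the set of n-grams seen in a corpus."""
    db = set()
    for text in corpus_texts:
        db.update(ngrams(text.split(), n))
    return db


def coverage(reference_text, db, n):
    """Fraction of the reference text's n-grams that are found in db."""
    windows = ngrams(reference_text.split(), n)
    if not windows:
        # Reference text is shorter than n tokens, so there is nothing to look up.
        return 0.0
    return sum(w in db for w in windows) / len(windows)


# Toy corpus and N=3 so the example stays readable; the comment uses N=7.
corpus = ["the quick brown fox jumps over the lazy dog"]
db = build_database(corpus, 3)

in_order = coverage("quick brown fox jumps over", db, 3)
shuffled = coverage("fox brown quick", db, 3)
```

In this sketch the reference text drives the walk: `coverage` slides a window over the copy "in hand" and asks the set about each window, while the set itself stores no ordering of its entries.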

      • gamblor956 6 hours ago

        If the database is sharing those pieces, it might, yes.

        Copyright takes into account the use for which the copying is done. Commercial use will almost always be treated as not fair use, with limited exceptions.

        • TeMPOraL 5 hours ago

          I'd say no, because you can't reasonably access and order those pieces without already having the work at your side to use as a reference.

    • gamblor956 6 hours ago

      You are.

      Copyright is quite literally about the right to control the creation and distribution of copies.

      The creation of the unzipped file is not treated as a separate copy, so the recipient would not be violating copyright just by unzipping the file you provided.