j16sdiz 16 hours ago

The next _50_ tokens, 42% of the time.

Not just the next token.

It's like giving it a random sentence from the book and getting the next sentence back 42% of the time.
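
To put a rough number on how strong that is, here is a back-of-the-envelope sketch. It assumes, unrealistically, that every token is reproduced independently with the same probability p; real models don't work that way, and the function names are made up for illustration. Under that assumption, getting all 50 tokens right 42% of the time means p ** 50 == 0.42.

```python
# Hypothetical toy model: each token is reproduced independently with the
# same probability p. This is an assumption for illustration only.

def per_token_probability(sequence_prob: float, length: int) -> float:
    """Return p such that p ** length == sequence_prob."""
    if not 0.0 < sequence_prob <= 1.0:
        raise ValueError("sequence_prob must be in (0, 1]")
    if length <= 0:
        raise ValueError("length must be positive")
    return sequence_prob ** (1.0 / length)


def sequence_probability(per_token_prob: float, length: int) -> float:
    """Chance of reproducing `length` tokens in a row under the toy model."""
    if not 0.0 <= per_token_prob <= 1.0:
        raise ValueError("per_token_prob must be in [0, 1]")
    if length <= 0:
        raise ValueError("length must be positive")
    return per_token_prob ** length


if __name__ == "__main__":
    p = per_token_probability(0.42, 50)
    print(f"implied per-token probability: {p:.4f}")
    print(f"50 tokens in a row at p=0.90:  {sequence_probability(0.90, 50):.4f}")
```

Under these assumptions, 42% over 50 tokens works out to roughly 98.3% per token, while even 90% per token would give only about a 0.5% chance of a 50-token match.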

asplake a day ago

“… well enough to reproduce 50-token excerpts at least half the time”

chiph2o a day ago

This means that if we start with 50% of the book, then there is a 42% chance that we can recreate the remaining 50%.

What is the distinction between understanding and memorization? What is the chance that understanding results in memorization (maybe in the case of humans)?

  • ipaddr 14 hours ago

    It stores how often characters come next based on how often they occur in copyrighted material. It can reproduce parts because those values are a fingerprint.

    It should break copyright law as currently written, but there is too much money involved.