Comment by aspenmayer 19 hours ago
Sure, why not? lol
https://www.reddit.com/r/DataHoarder/comments/1entowq/i_made...
https://github.com/shloop/google-book-scraper
That Meta torrented Books3 and other datasets appears to be established by the admissions of the Meta employees who did the work or oversaw those who did, so it is not really in dispute or ambiguous.
https://torrentfreak.com/meta-admits-use-of-pirated-book-dat...
Books3 was used to train Llama 1. We don't know whether Meta used it in later models.