Comment by joecool1029
Comment by joecool1029 6 months ago
Even some non-profits ignore it now; the Internet Archive stopped respecting it years ago: https://blog.archive.org/2017/04/17/robots-txt-meant-for-sea...
I also don't think they hit servers repeatedly all that much.
The most recent notice IA has blogged about this was in 2017, and there's no indication that the service has reversed course on robots.txt since.
IA actually has technical and moral reasons to ignore robots.txt. Namely, they want to circumvent this stuff because their goal is to archive EVERYTHING.