Comment by asdefghyk
Re: "... AI training and the thing search engines do to make a search index are essentially the same thing ..."
Well, AI training has annoyed lots of people: overloaded websites, and companies doing things just because they can, e.g. Facebook sucking up the content of lots of pirated books.
Since this AI race started, our small website is constantly overrun by bots and is not usable by humans because of the load. We NEVER had this problem before AI, when the only automated access was search engine indexing.
This is largely because search engines are a concentrated market and AI training is getting done by everybody with a GPU.
If Google, Bing, Baidu and Yandex each come by and index your website, they each want to visit every page, but there aren't that many such companies. Also, they've been running their indexes for years, so most of the pages are already in them, and a refresh usually gets a 304 Not Modified response instead of them downloading the content again.
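To make the 304 point concrete, here is a rough sketch of conditional revalidation. It is a toy model, not how any particular crawler or server is implemented: `Page`, `revalidation_headers` and `origin_respond` are made-up names, and only the ETag / `If-None-Match` pair is modelled (real servers also handle `If-Modified-Since`, weak validators and lists of tags).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Page:
    """Hypothetical record of a page: its body plus the validator the server sent."""
    body: bytes
    etag: str


def revalidation_headers(cached: Optional[Page]) -> Dict[str, str]:
    """Headers a crawler would send. With nothing cached, there is nothing to validate."""
    if cached is None:
        return {}
    return {"If-None-Match": cached.etag}


def origin_respond(current: Page, request_headers: Dict[str, str]) -> Tuple[int, bytes]:
    """Toy origin server: answer 304 with an empty body if the crawler's copy is current."""
    if request_headers.get("If-None-Match") == current.etag:
        return 304, b""
    return 200, current.body


# Assumed example data: one page that has not changed since it was last crawled.
live = Page(body=b"<html>" + b"x" * 50_000 + b"</html>", etag='"v1"')

# An established crawler already holds the page and only revalidates it.
old_status, old_body = origin_respond(live, revalidation_headers(Page(live.body, '"v1"')))

# A brand-new crawler has no cache, so it downloads the full body.
new_status, new_body = origin_respond(live, revalidation_headers(None))

print(old_status, len(old_body))
print(new_status, len(new_body))
```

The established crawler costs the site one small response per unchanged page, while each newcomer pays (and makes you pay) for the full body of every page.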
But now there are suddenly a thousand AI companies, and every one of them wants a full copy of your site going back to the beginning of time while starting off with none of it already cached.
Ironically copyright is actually making this worse, because otherwise someone could put "index of the whole web as of some date in 2023" out there as a torrent and then publish diffs against it each month and they could all go download it from each other instead of each trying to get it directly from you. Which would also make it easier to start a new search engine.
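As a rough illustration of the "snapshot plus monthly diffs" idea, assuming a snapshot is just a mapping from URL to page content (an invented simplification; `make_diff` and `apply_diff` are hypothetical helpers, not an existing tool or format):

```python
from typing import Dict, List, TypedDict


class Diff(TypedDict):
    changed: Dict[str, str]  # new or modified pages, keyed by URL
    removed: List[str]       # URLs that no longer exist


def make_diff(old: Dict[str, str], new: Dict[str, str]) -> Diff:
    """Record only what differs between two snapshots."""
    changed = {url: body for url, body in new.items() if old.get(url) != body}
    removed = sorted(url for url in old if url not in new)
    return {"changed": changed, "removed": removed}


def apply_diff(snapshot: Dict[str, str], diff: Diff) -> Dict[str, str]:
    """Bring a downloaded snapshot up to date without touching the origin site."""
    updated = dict(snapshot)
    updated.update(diff["changed"])
    for url in diff["removed"]:
        updated.pop(url, None)
    return updated


# Assumed example data: a base snapshot and the state one month later.
base = {"/": "home v1", "/about": "about v1", "/old": "old page"}
later = {"/": "home v2", "/about": "about v1", "/new": "new page"}

monthly = make_diff(base, later)
assert apply_diff(base, monthly) == later
```

Anyone holding the base snapshot only needs the (much smaller) diff to catch up, and neither download has to come from the original site.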