Comment by johnea 4 days ago
My biggest bitch is that it requires JS and cookies...
Although the long term problem is the business model of servers paying for all network bandwidth.
Actual human users have consumed a minority of total net bandwidth for decades:
https://www.atom.com/blog/internet-statistics/
Part 4 shows bots out-using humans in 1996 8-/
What are "bots"? This needs to include goggleadservices, PIA sharing for profit, real-time ad auctions, and other "non-user" traffic.
The difference between that and LLM training data scraping is that site operators assumed the earlier non-human traffic would increase their human traffic, through search engine ranking, and thus their revenue. The current training data scraping, however, is likely to have the opposite effect: capturing traffic with LLM summaries instead of redirecting it to the original source sites.
This is the first major disruption to the internet's model of finance since ad revenue took over after the dot bomb.
So far, it's in the same category as the environmental disaster in progress: ownership is refusing to acknowledge the problem and insisting on business as usual.
Rational predictions are that it's not going to end well...
"Although the long term problem is the business model of servers paying for all network bandwidth."
Servers do not "pay for all the network bandwidth" as if they are somehow being targeted for fees and carrying water for clients that are somehow getting it for "free". Everyone pays for the bandwidth they use, one way or another: clients, servers, and all the networks in between. Nobody out there gets free bandwidth at scale. The AI scrapers are paying lots of money to scrape the internet at the scales they do.