Comment by Szpadel 12 hours ago
I had to deal with some bot activity that used a huge address space, and I tried something very similar: when a condition confirming a bot was detected, I banned that IP for 24h,
but due to the number of IPs involved this did not have any impact on the amount of traffic.
My suggestion is to look very closely at the headers you receive (varnishlog is very nice for this). If you stare at them long enough, you might spot something that all those requests have in common that would let you easily identify them (like a very specific and unusual combination of reported language and geolocation, or the same outdated browser version, etc.).
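If staring at logs gets tiring, the same idea can be roughed out in a few lines of Python. This is only a sketch: it assumes you have already parsed the request headers into one dict per request (from varnishlog output or anywhere else), and `common_header_combos`, `fields`, `min_share`, and `combo_size` are names made up for this example.

```python
from collections import Counter
from itertools import combinations


def common_header_combos(requests, fields, min_share=0.5, combo_size=2):
    """Return header-value combinations seen in at least min_share of requests.

    requests: list of dicts mapping header name -> value (assumed input shape).
    fields:   lowercase header names to compare, e.g. ["user-agent"].
    """
    counts = Counter()
    for headers in requests:
        # Normalise header names so "User-Agent" and "user-agent" match.
        lowered = {name.lower(): value for name, value in headers.items()}
        present = [(field, lowered.get(field, "<missing>")) for field in fields]
        # Count every combination of combo_size (header, value) pairs.
        for combo in combinations(present, combo_size):
            counts[combo] += 1
    threshold = min_share * len(requests)
    return [(combo, n) for combo, n in counts.most_common() if n >= threshold]


# Hypothetical sample: three bot-like requests and one ordinary one.
sample = [
    {"User-Agent": "Chrome/49", "Accept-Language": "pt-BR"},
    {"User-Agent": "Chrome/49", "Accept-Language": "pt-BR"},
    {"User-Agent": "Chrome/49", "Accept-Language": "pt-BR"},
    {"User-Agent": "Firefox/120", "Accept-Language": "en-US"},
]
for combo, n in common_header_combos(sample, ["user-agent", "accept-language"]):
    print(n, combo)
```

A combination that covers a large share of the suspicious traffic but almost none of your normal traffic is a candidate rule. It is worth checking it against known-good requests before blocking on it.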
My favorite example of this was how folks fingerprinted the active probes of the Great Firewall of China. It has a large pool of IP addresses to work with (i.e. all ISPs in China), but the TCP timestamps were shared across a small number of probing machines:
"The figure shows that although the probers use thousands of source IP addresses, they cannot be fully independent, because they share a small number of TCP timestamp sequences"
https://censorbib.nymity.ch/pdf/Alice2020a.pdf
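To make the timestamp idea concrete, here is a rough Python sketch. It is not the paper's method, just one simple way to look for shared counters: it assumes you have already extracted (source IP, capture time in seconds, TSval) tuples from a packet capture, and that the sender's timestamp clock ticks at a known rate. The function name, `rate_hz`, and `tolerance_ticks` are all assumptions for illustration.

```python
from collections import defaultdict


def cluster_by_tsval_offset(observations, rate_hz=1000, tolerance_ticks=1000):
    """Group source IPs whose TCP timestamps imply the same counter offset.

    observations: iterable of (src_ip, seen_at_seconds, tsval) tuples.
    rate_hz:      assumed tick rate of the sender's timestamp clock.
    """
    clusters = defaultdict(set)
    for src_ip, seen_at, tsval in observations:
        # If TSval = rate * t + offset, packets from one machine share an
        # offset no matter which source IP they appear to come from.
        offset = (tsval - round(rate_hz * seen_at)) % 2**32
        # Coarse bucketing absorbs network jitter; offsets that straddle a
        # bucket boundary can still be split, so treat results as a hint.
        clusters[offset // tolerance_ticks].add(src_ip)
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical observations: three IPs on one counter, one unrelated host.
observations = [
    ("10.0.0.1", 0.0, 5000),
    ("10.0.0.2", 10.0, 15000),
    ("10.0.0.3", 20.0, 25010),
    ("192.0.2.9", 5.0, 900000),
]
for ips in cluster_by_tsval_offset(observations):
    print(len(ips), sorted(ips))
```

Many distinct source IPs landing in the same bucket suggests they share a timestamp counter, which is the kind of dependence the quote describes.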