Comment by frogperson 2 days ago

We need a crowdsourced list like AdGuard, but for bots. I'd love to block all those IPs at the firewall.

The only way you can block these "AI" scrapers is a combination of IP filtering (https://spur.us/) and Fingerprinting (https://abrahamjuliot.github.io/creepjs/).

Things like Browserbase are easy to block with this. It's a losing battle, though; I personally moved entirely to real environments for https://browser.cash/developers

mrweasel 2 days ago

So that would be at least: GCP, Azure, Alibaba, AWS, Huawei, AT&T, BT, Cox... it's a long list.

User Agents then? No, because that would be: Chrome and Safari.

It's an uphill battle, because the bot authors do not give a shit. You can now buy bot networks from actual companies that embed proxies in free phone games. Anthropic was caught hiding behind Browserbase, and neither company seems to see a problem with that.

jarofgreen 2 days ago

User agents not IPs, but: https://github.com/ai-robots-txt/ai.robots.txt
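
For anyone who wants to act on a list like that at the application layer, here is a minimal sketch of user-agent matching. The sample robots.txt text is a made-up excerpt in the same style, not a copy of the repository's file, and the helper names (`blocked_agents`, `build_matcher`, `is_listed_bot`) are hypothetical.

```python
import re

# Hypothetical excerpt in the style of a robots.txt blocklist. The real list
# lives in the linked repository and changes over time.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /
"""


def blocked_agents(robots_txt: str) -> list[str]:
    """Collect the User-agent tokens named in a robots.txt body."""
    agents = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        token = value.strip()
        if key.strip().lower() == "user-agent" and token not in ("", "*"):
            agents.append(token)
    return agents


def build_matcher(agents: list[str]) -> re.Pattern[str]:
    """Compile one case-insensitive pattern that matches any listed token."""
    if not agents:
        return re.compile(r"(?!)")  # never matches when the list is empty
    return re.compile("|".join(re.escape(a) for a in agents), re.IGNORECASE)


def is_listed_bot(user_agent: str, matcher: re.Pattern[str]) -> bool:
    """True if the request's User-Agent header contains a listed token."""
    return bool(matcher.search(user_agent))
```

This only catches crawlers that identify themselves honestly; a scraper sending a stock Chrome or Safari user agent passes straight through.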

dotancohen 2 days ago

A large portion of those addresses will be valid residential IP addresses running malware on compromised Windows machines.

venturecruelty 2 days ago

Block GCP, AWS, Azure, and various datacenter prefixen, and you're pretty much golden. There are scant few legitimate reasons a human being's traffic would originate from those hosts.
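
A rough sketch of what that check could look like in application code, using Python's standard `ipaddress` module. The prefixes below are placeholders from the reserved documentation ranges, not real cloud ranges, and `is_datacenter_ip` is a hypothetical helper; a real deployment would load the providers' published prefixes and would more likely enforce this at the firewall.

```python
import ipaddress

# Placeholder prefixes from the reserved documentation ranges
# (RFC 5737 and RFC 3849), standing in for real datacenter prefixes.
DATACENTER_PREFIXES = ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]

# Parse once at startup so each lookup only does membership checks.
NETWORKS = [ipaddress.ip_network(p) for p in DATACENTER_PREFIXES]


def is_datacenter_ip(addr: str) -> bool:
    """True if addr falls inside any of the listed prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in NETWORKS)
```

A linear scan is fine for a handful of prefixes; with the thousands of prefixes the large providers publish, a prefix tree or the firewall's own set type would be the usual choice.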

bdcravens 2 days ago

You can run virtual desktops in the cloud, like AWS's Workspaces, sold as a business rather than developer offering. AWS does publish the IP range those clients use, and I assume other similar offerings out there do the same.
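
As a sketch of how those published ranges could feed an allowlist: AWS serves its ranges as JSON at https://ip-ranges.amazonaws.com/ip-ranges.json, with `prefixes` and `ipv6_prefixes` arrays tagged by `service`. The sample document below mimics that shape with placeholder addresses. The `WORKSPACES_GATEWAYS` service tag is my assumption and should be checked against the live file, as should whether those gateway ranges are the source addresses a website would actually see. `prefixes_for_service` is a hypothetical helper.

```python
# Hand-written sample in the shape of AWS's ip-ranges.json, using placeholder
# documentation addresses. The real file would be fetched and parsed with json.
SAMPLE_DOC = {
    "prefixes": [
        {"ip_prefix": "192.0.2.0/24", "region": "us-east-1",
         "service": "WORKSPACES_GATEWAYS", "network_border_group": "us-east-1"},
        {"ip_prefix": "198.51.100.0/24", "region": "us-east-1",
         "service": "EC2", "network_border_group": "us-east-1"},
    ],
    "ipv6_prefixes": [
        {"ipv6_prefix": "2001:db8::/32", "region": "us-east-1",
         "service": "EC2", "network_border_group": "us-east-1"},
    ],
}


def prefixes_for_service(doc: dict, service: str) -> list[str]:
    """Return the IPv4 and IPv6 CIDR strings tagged with the given service."""
    v4 = [p["ip_prefix"] for p in doc.get("prefixes", [])
          if p.get("service") == service]
    v6 = [p["ipv6_prefix"] for p in doc.get("ipv6_prefixes", [])
          if p.get("service") == service]
    return v4 + v6
```

The result could be subtracted from a datacenter blocklist, so that a blanket cloud block still lets a known virtual-desktop range through.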

- prmoustache 2 days ago
  
  I am working from a cloud desktop, but I only visit corporate-approved resources from it, and I believe that is the case for most cloud desktop users, as the whole point is to have a clear separation of duties.
  
  bdcravens 2 days ago
  
  Correct, but I don't think it's a safe assumption that approved resources wouldn't have a reason to block requests from the cloud.
  
- johneth 2 days ago
  
  I'm sure people who can afford to run virtual desktops in the cloud can also afford a phone/laptop/desktop to access sites that block those virtual desktops in the cloud.
  
  bdcravens 2 days ago
  
  I'm thinking more along the lines of people using virtual desktops assigned by their job, and those sites are part of their work. I don't feel like punting to BYOD is a good solution.
  