Comment by tonymet

Comment by tonymet 19 hours ago

4 replies

You can’t get much crawling done from published cloud IPs. Residential proxies are the only way to do most crawls today.

That said, I support Google working to shut these networks down, since they are almost universally bad.

It’s just a shame that there’s nowhere to go for legitimate crawling activities.

mrweasel 17 hours ago

> You can’t get much crawling done from published cloud IPs.

Think about why that might be. I'm sorry, but if you legitimately need to crawl the net, and do so from a cloud provider, your industry screwed you over with bad behaviour. Go get hosting with a company that cares about who their customers are; you're hanging out with a bad crowd.

  • tonymet 17 hours ago

    What industry is that? Every industry is on the cloud.

    • mrweasel 17 hours ago

      No, no they really aren't, but I was thinking of the "scraping industry", to the extent that that's a thing. Getting hosting in smaller datacenters is simple enough, but you may need to manage your own hardware, or VMs. Many will help you get your own IP ranges and ASN, which is going to go a long way if you don't want to get bundled in with the bad bots.

      This differs obviously, but having an ASN in our case means that we can deal with you, contact you and assume that you're better than random bot number 817.

      • tonymet 15 hours ago

        Scraping isn’t an industry. There are legitimate and illegitimate scraping pursuits.

        There are lots of healthy / productive businesses in the cloud and lots of scumbags, just like any enterprise.

        I still have no idea about your point, by the way.