jsnell 9 hours ago

I don't think it's a cloud. It's more likely a residential proxy network; these are typically created by installing malware on users' machines.

The operators of these proxy networks want to avoid detection by both the users whose bandwidth they're stealing, and by the companies whose data is being scraped. So they want to make the bandwidth very expensive. And that expensive bandwidth in turn means that their only clients are dodgy as well. Either people looking to scrape data without consent and monetize it, or outright criminals.

  • iforgotpassword 9 hours ago

    I use one. I run a bot on IRC that extracts the <title> of every link posted (or downloads the image/whatever and extracts metadata) and announces that to the channel. It has become more and more pointless to run this on a VPS. Google/YouTube block the IP range, a lot of websites return the Cloudflare security check, Amazon works on some days and doesn't on others... Ever since I started proxying via residential proxies it just works. I'm a smooth criminal. :>

    • derekzhouzhen 5 hours ago

      I feel your pain, but I refuse to cave. Say, 10% of the links fail to load, so what? It is their loss, not mine.

  • dewey 6 hours ago

    There are many reputable residential proxy networks too. There's usually a lot of vetting involved, as they don't want people running illegal activities through their network.

    It's almost a necessity these days to have access to one, given how many datacenter ranges are blocked.

  • bscphil 9 hours ago

    It's kind of surprising that a presumptively legitimate company (and YC-funded startup) would out themselves as buying black market residential proxy bandwidth, isn't it?

    • jsheard 9 hours ago

      Their frontpage also advertises the ability to pass CAPTCHAs, whether by automation or more likely by delegating them to third-world CAPTCHA farms. If that's a major selling point for your automation service then your target market probably ranges from dubious (e.g. data scrapers trying to get around limits) to extremely dubious (e.g. ticket scalpers, spammers, click fraud, etc).

      • xp84 6 hours ago

        Just because something can be used for sketchy purposes doesn't mean that's the only purpose of it. There are thousands of situations where people are forced to interact with a shitty website 100x per day and the site won't provide an API. Imagine if your job was booking plane tickets all day. United could provide you an API key to do so, but in practice they won't; only some enterprisey travel software company can get that kind of access, for a steep fee. You could build a tool which automatically puts together an itinerary based on rules and books it, through a tool like this. Perhaps a slightly contrived example, but I believe things like this definitely happen.

    • dewey 6 hours ago

      Residential proxies are not necessarily "black market".

      • asmor 4 hours ago

        It's almost never done with the full understanding of the person providing the proxy. It doesn't matter whether they get promised some change, their browser addons betray them, or they install bundleware/adware.

        I'd say it has about the same moral standing as a payday loan.

        • dewey 4 hours ago

          There are other ways, for example through mislabeled “residential” blocks, or “residential” proxies that are sold by ISPs to vendors.

    • mrguyorama 9 hours ago

      How long have you been here? It's not surprising at all. HN and YC have not demonstrated an aversion to "uh, greyhat" activity.

      If it were 2000, people would be sharing their ad clicking startups.

      YC has funded a looooooot of sketchy companies.

  • floam 9 hours ago

    It’s not necessarily malware. There are services that are pretty upfront and pay cash money for residential US bandwidth. That said, naive people might be surprised when their IP starts getting blocked.

    e.g. https://www.honeygain.com/ (something like 100GB = $20).

    • Saris 2 hours ago

      >That said, naive people might be surprised when their IP starts getting blocked.

      Or law enforcement shows up at their door because their IP is involved in a bunch of illegal stuff.

  • peab 9 hours ago

    How does expensive bandwidth equate to dodgy clients? There are lots of valid use cases for scraping data, and it's legal to scrape publicly available data, even if the websites hosting it try to block it (try a curl request to reddit, for example).

tux3 9 hours ago

Absolutely wild. A normal price for bandwidth before volume discounts is 1c/GB, or 10 bucks per TB.

  • jsheard 9 hours ago

    They're in the business of scraping/botting sites that don't want to be scraped/botted, and bandwidth that looks "legit" comes at a premium.

hooverd 9 hours ago

api.skyvern.com is a CNAME to an EC2 ALB, but even using a NAT Gateway ($$$) I can't make more than $1/GB add up.