Comment by moreati

Comment by moreati 2 days ago

Why would one want this? Are there particular situations in which it's desirable to detect a TCP proxy? Does the presence of a TCP proxy indicate some adversarial behaviour? E.g. surveillance, censorship, a particular attack?

userbinator 2 days ago

Surveillance, on the part of those who want to do this fingerprinting.

dlenski a day ago

Came here to ask the same thing. Why do I _care_ if connections to my server come from a TCP proxy? Particularly when a VPN is _not_ observable in a similar way?

Is there some class of bad actors who extensively use TCP proxies and not only _don't_ use VPNs, but would incur large costs in switching to them?

  • JDye a day ago

    Web scrapers maybe aren't "bad actors", but many sites don't want them. They'll use tons of TCP proxies which route them through a rotating pool of end-user devices (mobiles, routers, etc.). It's not really possible to block these IPs, as you'd also be blocking legitimate customers, so other ways to detect and block them are required.

    • dlenski a day ago

      Can't/won't these scrapers just switch to using VPNs or sshuttle or basically anything else that doesn't leak timing info about termination of TCP vs HTTP?

      • JDye a day ago

        Not really. You can have 100,000 IPs from proxies or use VPNs and have only 5 egress IPs.

        Anybody who wants to stop the scraper could get browser fingerprints, cross-reference similar ones with those IPs, and quite safely ban them, as it's highly likely they're not legitimate customers.

        It's a lot harder to do that for the 100k IPs, because those IPs will also have legitimate customer traffic on them and it's a lot more likely that the browser fingerprint is just a legitimate one.

        The cost of false positives (blocking real people) is usually higher than the cost of just allowing the scrapers, and the incentives of a lot of sites aren't aligned with stopping scrapers anyway. Think e-commerce: do they _really_ care if the product is being sold to scalpers or real customers? If anything, that behaviour can raise perception of their brand, increase demand, increase prices.

        This tool should have fewer false positives than most, so maybe it will see more adoption than others (TCP fingerprinting, for example), but I don't think this is going to affect anyone doing scraping seriously/at scale.
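
The egress-IP argument above (a handful of VPN exits versus a large pool of shared residential IPs) can be made concrete with a rough sketch. Everything here is an assumption for illustration, not something from the thread or the tool: the function name `flag_ips`, the `(ip, fingerprint)` request format, the idea of a pre-built set of suspicious fingerprints, and the 0.9 threshold are all hypothetical. The point is only that an IP is "safe" to ban when nearly all of its traffic matches a suspicious fingerprint, which tends to hold for a VPN egress IP and not for a residential IP that also carries real customers.

```python
from collections import defaultdict


def flag_ips(requests, suspicious_fingerprints, min_suspicious_share=0.9):
    """Return the set of IPs whose traffic is dominated by suspicious fingerprints.

    requests: iterable of (ip, fingerprint) pairs (hypothetical log format).
    suspicious_fingerprints: set of fingerprints already believed to be scrapers.
    min_suspicious_share: assumed threshold; fraction of an IP's requests that
        must look suspicious before the IP is flagged.
    """
    total = defaultdict(int)
    suspicious = defaultdict(int)
    for ip, fingerprint in requests:
        total[ip] += 1
        if fingerprint in suspicious_fingerprints:
            suspicious[ip] += 1
    return {
        ip for ip in total
        if suspicious[ip] / total[ip] >= min_suspicious_share
    }


# A VPN egress IP: every request carries the scraper's fingerprint.
vpn_traffic = [("203.0.113.5", "fp-scraper")] * 50
# A residential proxy IP: one scraper request mixed in with real customers.
residential_traffic = (
    [("198.51.100.7", "fp-scraper")]
    + [("198.51.100.7", "fp-customer")] * 9
)

flagged = flag_ips(vpn_traffic + residential_traffic, {"fp-scraper"})
print(flagged)
```

With this toy data only the VPN egress IP crosses the threshold; the residential IP stays unflagged because most of its traffic looks legitimate. Lowering the threshold far enough to catch it would also block the real customers behind it.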