Comment by shkkmo 4 days ago

12 replies

The explanation of how the estimate is made is more detailed, but here is the referenced conclusion:

>> So (11508 websites * 2^16 sha256 operations) / 2^21, that’s about 6 minutes to mine enough tokens for every single Anubis deployment in the world. That means the cost of unrestricted crawler access to the internet for a week is approximately $0.

>> In fact, I don’t think we reach a single cent per month in compute costs until several million sites have deployed Anubis.
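
For reference, the arithmetic in that quote can be reproduced directly; the deployment count, per-token hash count, and hash rate below are the article's figures, not independently measured:

```go
package main

import "fmt"

func main() {
	const (
		deployments     = 11508.0 // Anubis deployments counted in the article
		hashesPerToken  = 1 << 16 // expected SHA-256 attempts at the default difficulty
		hashesPerSecond = 1 << 21 // hash rate the article reports for a free cloud VM
	)
	seconds := deployments * hashesPerToken / hashesPerSecond
	fmt.Printf("~%.1f minutes to mine a token for every deployment\n", seconds/60)
}
```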

kbelder 3 days ago

If you use one solution to browse the entire site, you're linking every pageload to the same session and can then easily be singled out and blocked. The idea that you can scan a site for a week by solving the riddle once is incorrect; that only works for non-abusers.

  • shkkmo 2 days ago

    Well, since they can get a unique token for every site every 6 minutes using only a free GCP VPS, that doesn't really matter: scraping can easily be spread out across tokens, or they can cheaply and quickly get a new one whenever the old one gets blocked.

hiccuphippo 4 days ago

Wasn't SHA-256 designed to be very fast to compute? They should be using bcrypt or something similar.
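
As a rough illustration of the speed gap being pointed out here (the loop count and the bcrypt cost of 12 are arbitrary choices for the sketch, not anything Anubis actually uses):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"time"

	"golang.org/x/crypto/bcrypt"
)

func main() {
	buf := []byte("challenge-nonce")

	// Plain SHA-256: millions of hashes per second even on modest hardware.
	start := time.Now()
	for i := 0; i < 1_000_000; i++ {
		h := sha256.Sum256(buf) // chain the output so the loop can't be optimized away
		buf = h[:]
	}
	fmt.Println("1,000,000 x SHA-256:", time.Since(start))

	// bcrypt at cost 12: a single evaluation takes on the order of 100ms+,
	// and its design is deliberately awkward to accelerate on GPUs.
	start = time.Now()
	if _, err := bcrypt.GenerateFromPassword([]byte("challenge-nonce"), 12); err != nil {
		panic(err)
	}
	fmt.Println("        1 x bcrypt(12):", time.Since(start))
}
```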

  • throwawayffffas 4 days ago

    Unless they require a new token for each new request, or every x minutes or something, it won't matter.

    And as the poster mentioned, if you are running an AI model you probably have GPUs to spare, unlike the dev working from a five-year-old ThinkPad or their phone.

    • _flux 3 days ago

      Apparently bcrypt has a design that makes it difficult to accelerate effectively on a GPU.

      Indeed, a new token should be requested per request; the tokens could also be pre-calculated, so that while the user is browsing a page, the browser could compute tokens suitable for accessing the next likely browsing targets (e.g. the "next" button).

      The biggest downside I see is that mobile devices would likely suffer. Possibly the difficulty of the challenge is, or should be, varied by other metrics, such as the number of requests arriving per time unit from a class C network, etc.; a sketch of that idea follows below.
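
      A minimal sketch of that rate-based idea, with made-up thresholds and /24 ("class C") bucketing purely for illustration; this is not how Anubis itself assigns difficulty:

```go
package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// rateTracker counts recent requests per /24 network so the server can hand
// out harder PoW challenges to networks that are hammering it. All thresholds
// and difficulty values below are illustrative guesses, not Anubis defaults.
type rateTracker struct {
	mu     sync.Mutex
	counts map[string]int
}

func newRateTracker(window time.Duration) *rateTracker {
	t := &rateTracker{counts: map[string]int{}}
	go func() {
		for range time.Tick(window) {
			t.mu.Lock()
			t.counts = map[string]int{} // crude reset once per window
			t.mu.Unlock()
		}
	}()
	return t
}

// difficultyFor returns the number of leading zero bits to demand from this client.
func (t *rateTracker) difficultyFor(ip net.IP) int {
	key := ip.Mask(net.CIDRMask(24, 32)).String() // bucket by /24
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counts[key]++
	switch n := t.counts[key]; {
	case n > 1000:
		return 22 // heavy traffic from this network: make each token expensive
	case n > 100:
		return 18
	default:
		return 16 // baseline, roughly the 2^16 cost discussed above
	}
}

func main() {
	tr := newRateTracker(time.Minute)
	fmt.Println("difficulty:", tr.difficultyFor(net.ParseIP("203.0.113.7")))
}
```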

debugnik 4 days ago

That's a matter of increasing the difficulty, isn't it? And if the added cost is really negligible, we can just switch to a "refresh" challenge for the same added latency, without burning energy for no reason.
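
A minimal sketch of what such a "refresh" challenge could look like: hand out a signed, timestamped cookie plus a refresh, and only serve content once a small delay has elapsed. The HMAC scheme, cookie name, and 3-second delay here are invented for the sketch, not taken from Anubis:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/http"
	"time"
)

var secret = []byte("replace-with-a-long-random-secret")

func sign(ts int64) string {
	m := hmac.New(sha256.New, secret)
	fmt.Fprintf(m, "%d", ts)
	return hex.EncodeToString(m.Sum(nil))
}

func handler(w http.ResponseWriter, r *http.Request) {
	if c, err := r.Cookie("challenge"); err == nil {
		var issued int64
		var sig string
		if _, err := fmt.Sscanf(c.Value, "%d.%s", &issued, &sig); err == nil &&
			hmac.Equal([]byte(sig), []byte(sign(issued))) &&
			time.Since(time.Unix(issued, 0)) >= 3*time.Second {
			fmt.Fprintln(w, "real content")
			return
		}
	}
	// No valid, old-enough cookie: set one and ask the browser to come back.
	now := time.Now().Unix()
	http.SetCookie(w, &http.Cookie{
		Name:  "challenge",
		Value: fmt.Sprintf("%d.%s", now, sign(now)),
	})
	w.Header().Set("Refresh", "3") // retry in ~3 seconds; no hashing involved
	fmt.Fprintln(w, "Checking your browser...")
}

func main() {
	http.HandleFunc("/", handler)
	http.ListenAndServe(":8080", nil)
}
```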

  • Retr0id 4 days ago

    If you increase the difficulty much beyond what it currently is, legitimate users end up having to wait for ages.
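
    To put rough numbers on that trade-off (the per-device hash rates here are illustrative guesses, and the scheme assumed is a simple leading-zero-bits PoW where the expected work is 2^bits):

```go
package main

import "fmt"

func main() {
	// Illustrative hash rates only; real numbers vary wildly by device and
	// by whether the hashing runs in JavaScript, WebAssembly, or native code.
	devices := []struct {
		name string
		rate float64 // SHA-256 hashes per second
	}{
		{"old phone (JS)", 2e5},
		{"laptop (WASM)", 2e6},
		{"crawler w/ GPUs", 1e10},
	}
	for bits := 16; bits <= 24; bits += 4 {
		expected := float64(uint64(1) << bits) // expected attempts to find a solution
		for _, d := range devices {
			fmt.Printf("difficulty %2d bits  %-16s %10.3fs expected\n",
				bits, d.name, expected/d.rate)
		}
	}
}
```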

    • debugnik 3 days ago

      And if you don't increase it, crawlers will DoS the sites again, and legitimate users will have to wait until the next tech hype bubble for the site to load at all, which is the reason software like Anubis is being installed in the first place.

      • shkkmo 3 days ago

        If you triple the difficulty, the cost of solving the PoW is still negligible to the crawlers, but you've harmed real users even more.

        The reason Anubis works is not the PoW; it is that the dev time needed to implement the bypass weeds out the lowest-effort bots. Thus the correct response is to keep the PoW difficulty low so you minimize harm to real users. Or, better yet, implement your own custom check that doesn't use any PoW and relies on obscurity instead to block the low-effort bots.

        The more Anubis is used, the less effective it is and the more it harms real users.

  • therein 4 days ago

    I am guessing you don't realize that this means people who aren't using the latest generation of phones will suffer.

    • debugnik 3 days ago

      I'm not using the latest generation of phones, not in the slightest, and I don't really care, because the alternative to Anubis-like interstitials is the sites not loading at all when they're mass-crawled to death.

  • [removed] 4 days ago
    [deleted]