Comment by seba_dos1 3 days ago

> with tools like Anubis being largely ineffective

To the contrary - if someone "bypasses" Anubis by setting the user agent to Googlebot (or curl), it means it's effective. Every Anubis installation I've been involved with so far explicitly allowed curl. If you think it's counterproductive, you probably just don't understand why it's there in the first place.

jgalt212 3 days ago

If you're installing Anubis, why are you setting it to allow curl to bypass?

  • seba_dos1 3 days ago

    The problem you usually try to alleviate with Anubis is the load generated by aggressive AI scrapers that are otherwise indistinguishable from real users. As soon as a bot is polite enough to identify as some kind of bot, the problem's gone, because you can now apply your regular measures for rate limiting and access control.

    (yes, there are also people who use it as an anti-AI statement, but that's not the reason why it's used on the most high-profile installations out there)
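
To make the split concrete, here is a minimal sketch of that kind of policy. It is not Anubis's actual rule format or code: the token list, the limits, and the `decide` function are all hypothetical, and keying the rate limit on the user agent (rather than, say, the client IP) is a simplification for illustration.

```python
import time
from collections import defaultdict, deque

# Hypothetical policy sketch; NOT Anubis's real configuration or code.
# Assumption: clients whose User-Agent contains one of these tokens are
# treated as self-identified bots.
DECLARED_BOT_TOKENS = ("curl/", "bot", "spider", "crawler")
RATE_LIMIT = 5          # assumed: max requests per window for declared bots
WINDOW_SECONDS = 60.0   # assumed: sliding window length

# Simplification: requests are counted per user agent string.
_recent = defaultdict(deque)


def decide(user_agent, now=None):
    """Return 'allow', 'rate_limited', or 'challenge' for one request."""
    now = time.monotonic() if now is None else now
    ua = user_agent.lower()

    # Anything that does not identify itself as a bot looks like a browser,
    # so it gets the proof-of-work challenge.
    if not any(token in ua for token in DECLARED_BOT_TOKENS):
        return "challenge"

    # Declared bots skip the challenge and fall under ordinary rate limiting.
    hits = _recent[user_agent]
    while hits and now - hits[0] >= WINDOW_SECONDS:
        hits.popleft()  # drop timestamps that left the window
    if len(hits) >= RATE_LIMIT:
        return "rate_limited"
    hits.append(now)
    return "allow"
```

A client claiming to be curl or Googlebot skips the challenge but lands in the rate-limited bucket, which is why that "bypass" isn't a failure: the expensive check is only needed for traffic that can't otherwise be told apart from real users.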

    • stingraycharles 12 hours ago

      Yeah that makes sense. Bad players will try to look like a regular browser, good players will have no problems revealing they’re a bot.

    • lxgr 4 hours ago

      > As soon as the bot is polite enough to identify as some kind of a bot, the problem's gone, as you can apply your regular measures for rate limiting and access control now.

      Very interesting, so we're about to come full circle?

      Can't wait to have to mask myself as a (paying?) AI scraper to bypass annoying captchas when accessing "bot protected" websites...