Comment by Arnavion 4 days ago

62 replies

>This dance to get access is just a minor annoyance for me, but I question how it proves I’m not a bot. These steps can be trivially and cheaply automated.

>I think the end result is just an internet resource I need is a little harder to access, and we have to waste a small amount of energy.

No need to mimic the actual challenge process. Just change your user agent to not contain "Mozilla"; Anubis only serves you the challenge if the UA contains that string. For myself I just made a sideloaded browser extension that overrides the UA header for the handful of websites I visit that use Anubis, including those two kernel.org domains.

(Why do I do it? Most of them I don't enable JS or cookies for, so the challenge wouldn't pass anyway. For the ones that I do enable JS or cookies for, various self-hosted GitLab instances, I don't consent to my electricity being used for this any more than if it was mining Monero or something.)

johnecheck 4 days ago

Sadly, touching the user-agent header more or less instantly makes you uniquely identifiable.

Browser fingerprinting works best against people with unique headers. There are probably millions of people using an untouched Safari on iPhone. Once you touch your user-agent header, you're likely the only person in the world with that fingerprint.

  • sillywabbit 3 days ago

    If someone's out to uniquely identify your activity on the internet, your User-Agent string is going to be the least of your problems.

    • _def 3 days ago

      Not sure what you mean, as exactly this is happening currently on 99% of the web. Brought to you by: ads

      • MathMonkeyMan 3 days ago

        If you're browsing with a browser, then there are 1000 ways to identify you. If you're browsing without a browser, then there is at least one way to identify you.

      • amusingimpala75 3 days ago

        I think what they meant is: there are already so many other ways to fingerprint (say, canvas) that a common user agent doesn't significantly help you

        • johnecheck 3 days ago

          'There's so many cliffs around that not jumping off that one barely helps you'.

          I meeeeeannn... sure? I know that browser fingerprinting works quite well without it, but custom headers are effectively game over in terms of not getting tracked.

      • [removed] 3 days ago
        [deleted]
  • Arnavion 4 days ago

    UA fingerprinting isn't a problem for me. As I said I only modify the UA for the handful of sites that use Anubis that I visit. I trust those sites enough that them fingerprinting me is unlikely, and won't be a problem even if they did.

  • NoMoreNicksLeft 4 days ago

    I'll set mine to "null" if the rest of you will set yours...

    • gabeio 3 days ago

      The string “null” or actually null? I have recently seen a huge amount of bot traffic that has no UA at all, and I just outright block it. It's almost entirely Azure (Microsoft cloud) script attacks.
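
      For what it's worth, blocking no-UA traffic at the edge is a one-liner in most servers; a minimal nginx sketch (purely illustrative, not my actual config):

      ```nginx
      # Reject requests that send no User-Agent header at all.
      # An absent header evaluates to the empty string in nginx.
      if ($http_user_agent = "") {
          return 403;
      }
      ```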

  • codedokode 4 days ago

    If your headers are new every time then it is very difficult to figure out who is who.

    • spoaceman7777 4 days ago

      yes, but it puts you in the incredibly small bucket of "users that have weird headers that don't mesh well", and makes the rest of the (many) other fingerprinting techniques all the more accurate.

    • kelseydh 3 days ago

      It is very easy unless the IP address is also switching up.

    • heavyset_go 3 days ago

      It's very easy to train a model to identify anomalies like that.

      • johnecheck 2 days ago

        While it's definitely possible to train a model for that, 'very easy' is nonsense.

        Unless you've got some superintelligence hidden somewhere, you'd choose a neural net. To train one, you need a large supply of LABELED data. Building that dataset seems like a challenge; after all, we have no scalable method for classifying this traffic as of yet.

  • andrewmcwatters 4 days ago

    Yes, but you can take the bet, and win more often than not, that your adversary is most likely not tracking visitor probabilities, provided you can detect that they aren't using a major fingerprinting provider.

  • [removed] 3 days ago
    [deleted]
  • jagged-chisel 4 days ago

    I wouldn’t think the intention is to s/Mozilla// but to select another well-known UA string.

    • Arnavion 4 days ago

      The string I use in my extension is "anubis is crap". I took it from a different FF extension that had been posted in a /g/ thread about Anubis, which is where I got the idea in the first place. I don't use other people's extensions if I can help it (because of the obvious risk), but I figured I'd use the same string in my own extension so that I'd be lumped in with that extension's users in user-agent statistics.

      • CursedSilicon 4 days ago

        It's a bit telling that you "don't use extensions if you can help it" but trust advice from a 4chan board

    • soulofmischief 4 days ago

      The UA will be compared to other data points such as screen resolution, fonts, plugins, etc. which means that you are definitely more identifiable if you change just the UA vs changing your entire browser or operating system.

    • throwawayffffas 4 days ago

      I don't think there are any.

      Because servers used to serve different content based on the user agent, virtually all browsers start their UA strings with Mozilla/5.0...

      • extraduder_ire 3 days ago

        curl, wget, lynx, and elinks all don't by default (I checked). Mainstream web browsers likely all do, and will forever.

        • userbinator 3 days ago

          So Anubis will let curl through, while blocking any non-mainstream browser, which will likely say "Mozilla" in its UA just for best compatibility, and call that a "bot"? WTF.

  • [removed] 4 days ago
    [deleted]
Animats 4 days ago

> (Why do I do it? For most of them I don't enable JS so the challenge wouldn't pass anyway. For the ones that I do enable JS for, various self-hosted gitlab instances, I don't consent to my electricity being used for this any more than if it was mining Monero or something.)

Hm. If your site is "sticky", can it mine Monero or something in the background?

We need a browser warning: "This site is using your computer heavily in a background task. Do you want to stop that?"

  • mikestew 4 days ago

    We need a browser warning: "This site is using your computer heavily in a background task. Do you want to stop that?"

    Doesn't Safari sort of already do that? "This tab is using significant power", or summat? I know I've seen that message, I just don't have a good repro.

    • qualeed 4 days ago

      Edge does, as well. It drops a warning in the middle of the screen, displays the resource-hogging tab, and asks whether you want to force-close the tab or wait.

zahlman 4 days ago

> Just change your user agent to not have "Mozilla" in it. Anubis only serves you the challenge if you have that.

Won't that break many other things? My understanding was that basically everyone's user-agent string nowadays is packed with a full suite of standard lies.

  • Arnavion 4 days ago

    It doesn't break the two kernel.org domains that the article is about, nor any of the others I use. At least not in a way that I noticed.

  • throwawayffffas 4 days ago

    In 2025 I think most of the web has moved on from checking user-agent strings. Your bank might still do it, but they won't be running Anubis.

    • Aachen 3 days ago

      Nope, they're on cloudflare so that all my banking traffic can be intercepted by a foreign company I have no relation to. The web is really headed in a great direction :)

    • account42 3 days ago

      The web as a whole definitely has not moved on from that.

msephton 3 days ago

I'm interested in your extension. I'm wondering if I could do something similar to force text encoding of pages into Japanese.

  • Arnavion 3 days ago

    If your Firefox supports sideloading extensions then making extensions that modify request or response headers is easy.

    The API is documented at https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web... . My Anubis extension modifies request headers using `browser.webRequest.onBeforeSendHeaders.addListener()`. Your case sounds like modifying response headers, which is `browser.webRequest.onHeadersReceived.addListener()`. Either way, the API is all documented there, as is the `manifest.json` you'll need to write to register this JS code as a background script and declare whatever permissions you need.

    Then zip the manifest and the script together, rename the zip file to "<id_in_manifest>.xpi", place it in the sideloaded extensions directory (depends on distro, e.g. /usr/lib/firefox/browser/extensions), restart Firefox and it should show up. If you need to debug it, you can use the about:debugging#/runtime/this-firefox page to launch a devtools window connected to the background script.
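
    A minimal sketch of such a background script (the host patterns and UA string below are illustrative placeholders, not my actual config; the listener registration is guarded so the header-rewriting helper also runs outside Firefox):

    ```javascript
    // background.js -- rewrite the User-Agent header for selected hosts.
    // Host patterns and the replacement string are placeholders.
    const TARGETS = ["https://git.example.org/*", "https://lore.example.org/*"];
    const FAKE_UA = "anubis is crap"; // anything without "Mozilla" works

    // Pure helper: replace the User-Agent entry in a requestHeaders array.
    function overrideUserAgent(headers, value) {
      return headers.map((h) =>
        h.name.toLowerCase() === "user-agent" ? { name: h.name, value } : h
      );
    }

    // Only register the listener when running inside a WebExtension context.
    if (typeof browser !== "undefined" && browser.webRequest) {
      browser.webRequest.onBeforeSendHeaders.addListener(
        (details) => ({
          requestHeaders: overrideUserAgent(details.requestHeaders, FAKE_UA),
        }),
        { urls: TARGETS },
        ["blocking", "requestHeaders"]
      );
    }
    ```

    The `manifest.json` needs the `webRequest` and `webRequestBlocking` permissions, plus host permissions matching the target URLs.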

    • msephton 3 days ago

      Cheers! I'm in Safari so I'll see if there's a match.

semiquaver 3 days ago

Doesn’t that just mean the AI bots can do the same? So what’s the point?

danieltanfh95 3 days ago

wtf? how is this then better than a captcha or something similar?!

throw84a747b4 4 days ago

[flagged]

  • gruez 4 days ago

    >Not only is Anubis a poorly thought out solution from an AI sympathizer [...]

    But the project description describes it as a project to stop AI crawlers?

    > Weighs the soul of incoming HTTP requests to stop AI crawlers

    • throw84a747b4 4 days ago

      Why would a company that wants to stop AI crawlers give talks on LLMs and diffusion models at AI conferences?

      Why would they use AI art for the first Anubis mascot until GitHub users called out the hypocrisy on the issue tracker?

      Why would they use Stable Diffusion art in their blogposts until Mastodon and Bluesky users called them out on it?

      • cyanydeez 4 days ago

        Likely the only way to stop AI is with purpose-built, fundamentally sound "machine learning", aka AI.

        AI slop is mass produced, but there's likely great potential for really useful AI models with very limited scopes.

      • Imustaskforhelp 4 days ago

        I am not against AI art completely, since I think of it as editing rather than art itself. My thoughts on AI art are nuanced and worth discussing some other day; let's talk about the author of Anubis and the story behind it.

        So, I hope you know the entire story behind Anubis: they were hosting their own Git server (I think?) and Amazon's AI-related department was essentially DDoSing it by trying to scrape it, so they created Anubis to prevent that.

        The idea isn't that new, it's just proof of work, and they created it first for their own use. I think the author is an AI researcher / works in AI, so for them using AI pics wasn't that big of a deal; pretty sure they had some reason behind it, and even that has since changed.

        Stop whining about free projects/labour, man. The same people complain that AI scrapers are scraping so many websites and taking the livelihood of website makers, and now that someone has given you a fix for free, you're nitpicking the wrong things.

        You can just fork it without the anime images or without the AI thing if you don't align with them and their philosophy.

        Now I feel the Mandela effect, as I read somewhere on their blog or elsewhere (pardon me if I am wrong, I usually am) that they themselves acknowledge the hypocrisy, or something along those lines: they would like to stop working in the AI industry while making an anti-AI-scraper, but they need more donations, IIRC, and they themselves know the hypocrisy.

    • account42 3 days ago

      AI companies are just as interested in stopping competing crawlers as anyone else.

  • [removed] 4 days ago
    [deleted]