Comment by nonrandomstring 9 days ago

> blame here are solely the ones employing these fingerprinting techniques,

Sure. And it's a tragedy. But when you look at the bot situation and the sheer magnitude of resource abuse out there, you have to see it from the other side.

FWIW, in the conversation mentioned above, we acknowledged that and moved on to talking about behavioural fingerprinting and why it makes sense not to focus on the browser/agent alone but on what gets done with it.
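
To make "behavioural" concrete, here is a minimal sketch of the idea; the features and thresholds are hypothetical illustrations, not anything from that conversation. It scores a client by what it does (timing regularity, breadth of URLs crawled) rather than by what its user-agent claims to be.

    from collections import defaultdict
    from statistics import pstdev

    hits = defaultdict(list)  # client_ip -> list of (timestamp, path)

    def record(ip, ts, path):
        hits[ip].append((ts, path))

    def looks_automated(ip):
        events = sorted(hits[ip])
        if len(events) < 20:  # too little data to judge either way
            return False
        gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]
        distinct_paths = len({path for _, path in events})
        # Machine-regular request timing plus mostly-unique URLs is
        # crawler-like, whatever the user-agent header says.
        return pstdev(gaps) < 0.05 and distinct_paths > 0.9 * len(events)

A human with a browser produces bursty, repetitive traffic; a crawler walking a site produces steady, mostly-unique requests, which is the whole point of looking at behaviour instead of headers.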

NavinF 9 days ago

Last time I saw someone complaining about scrapers, they were talking about 100 GiB/month. That's ~300 kbps: less than $1/month in IP transit and ~$0 in compute. Personally, I've never noticed bots show up on a resource graph. As long as you don't block them, they won't bother using more than a few IPs, and they'll back off when they're throttled.
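
For reference, that conversion holds up; a quick back-of-the-envelope sketch (the 30-day month is an assumption):

    # 100 GiB over one month, expressed as an average bitrate
    bytes_total = 100 * 2**30   # 100 GiB in bytes
    seconds = 30 * 24 * 3600    # assuming a 30-day month
    kbps = bytes_total * 8 / seconds / 1000
    print(f"{kbps:.0f} kbps")   # prints 331 kbps, i.e. roughly 300 kbps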

  • marcusb 9 days ago

    For some sites, things are a lot worse. See, for example, Jonathan Corbet's report[0].

    0 - https://social.kernel.org/notice/AqJkUigsjad3gQc664

    • NavinF 7 days ago

He provides no info. req/s? 95th-percentile Mbps? How does he know the requests come from an "AI scraper" as opposed to a normal L7 DDoS? LWN is a pretty simple site; it should be easy to saturate 10G ports.

  • nonrandomstring 9 days ago

Didn't rachelbythebay post recently that her blog was being swamped? I've heard the same from a few self-hosting bloggers now. And Wikipedia has recently said more than half of its traffic is now bots. Are you claiming this isn't a real problem?

    • NavinF 7 days ago

How exactly can a blog get swamped? It takes ~0 compute per request. Yes, I'm claiming this is a fake problem.

  • lmz 9 days ago

    How can you say it's $0 in compute without knowing if the data returned required any computation?

    • NavinF 7 days ago

Look at the sibling replies. All the kvetching comes from blogs and simple websites, not from the sites that actually consume compute per request.