Comment by miki123211 2 days ago

1 reply

What Mastodon is doing seems suitably ironic in this situation.

For those unaware, Mastodon's APIs are extremely open and very easy to scrape, to the point of providing you with a "firehose" of all public posts that an instance sees, both local and federated, with no authentication required. The community also has an extreme anti-scraping culture: anybody who admits to running any kind of scraper that is not strictly opt-in, even for benign or scientific purposes, is very quickly shunned and blocked. Most instances also have a "disallow scraping via robots.txt" policy by default.
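To make the robots.txt point concrete, here is a minimal sketch of what "disallowing scraping" amounts to in practice. The robots.txt body, the `ExampleBot` user agent, and the `example.social` URL are all made up for illustration and are not taken from any real instance. The check runs entirely on the crawler's side, so it only constrains crawlers that choose to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body (an assumption, not copied from a real instance).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler would run this check and back off.
# Nothing forces a badly behaved one to run it at all.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.social/@alice/1"))
```

A polite crawler skips the URL when `is_allowed` returns `False`; the server itself enforces nothing, which is the gap the canary link below ends up measuring.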

The results? I posted a canary token[1] link on a medium-sized, well-federated, well-protected instance which disallows scraping, and it got hit by some shady social media crawler in a fraction of a second. It started getting hit by many other strange crawlers later on, and it still keeps getting visits (mostly from Google now).

shadowgovt 2 days ago

It's an ecosystem of people who seem insistent on the idea that you can put stuff online behind no authentication wall and expect it to stay exclusively on the servers you believe it should be on.

And I don't know what to tell them, because that's not how the internet has ever worked.