Comment by reddalo a day ago

10 replies

Off topic, but I'm always amazed by Archive.md/.is/whatever. To this day I don't understand how they manage to bypass a lot of paywalls.

The mystery about the owner makes it even more intriguing.

amouat a day ago

I assume they just pretend to be the Googlebot so the site just gives the text.

  • dewey 21 hours ago

    That won't work for any popular site. You can easily try it yourself by using an extension to set the user agent. Any site that isn't checking requests against the public list of IPs Google publishes for its crawler is doing it wrong.
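A minimal sketch of the kind of check described above, assuming the site keeps a list of published crawler ranges. The function name `is_claimed_crawler_genuine` and the ranges are hypothetical; the CIDR blocks are reserved documentation ranges, not Google's real ones.

```python
import ipaddress

# Placeholder ranges for illustration only (reserved documentation blocks).
# A real check would load the ranges the crawler operator actually publishes.
CRAWLER_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_claimed_crawler_genuine(user_agent: str, client_ip: str) -> bool:
    """Return True only if the request claims to be Googlebot AND
    its source address falls inside one of the listed ranges."""
    if "Googlebot" not in user_agent:
        return False
    addr = ipaddress.ip_address(client_ip)
    # Compare only against networks of the same IP version.
    return any(addr.version == net.version and addr in net for net in CRAWLER_RANGES)
```

With a check like this, a browser extension that only changes the user agent string fails, because the request still arrives from an address outside the listed ranges.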

LordHeini 16 hours ago

I think archive has mostly news, random articles and such.

And as they say, nothing is more worthless than yesterday's news.

silcoon 21 hours ago

Maybe they have a paid account? I don't think there's much magic behind it.

  • blast 13 hours ago

    Publications could use watermarking to encode the name of the account an article is being served to, but they don't seem to. I wonder why.
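As a rough illustration of the watermarking idea above, here is one possible scheme that hides an account name in zero-width characters. This is an assumption for the sake of example, not something any publication is known to do; `embed_watermark` and `extract_watermark` are hypothetical names.

```python
ZERO = "\u200b"  # zero-width space, stands for bit 0
ONE = "\u200c"   # zero-width non-joiner, stands for bit 1


def embed_watermark(text: str, account: str) -> str:
    """Hide the account name as invisible characters after the first word."""
    bits = "".join(f"{byte:08b}" for byte in account.encode("utf-8"))
    mark = "".join(ONE if bit == "1" else ZERO for bit in bits)
    head, sep, tail = text.partition(" ")
    return head + mark + sep + tail


def extract_watermark(text: str) -> str:
    """Recover the account name from the invisible characters, if any."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")
```

A mark like this is easy to strip by deleting the zero-width characters, so it would only catch copies that are archived verbatim.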

jama211 a day ago

I just assumed they copied it into their own db

moffkalast a day ago

Given how many people must find its existence incredibly infuriating, it's odd that it's not being chased down with more haste than The Pirate Bay was. I mean, I'm glad it's not, but I'm kind of surprised.

  • nosafemode 18 hours ago

    There have been some DNS resolver issues: some DNS resolvers won't return the address for sites like archive.is or Anna's Archive.

  • dewey 21 hours ago

    The music and movie industry lobbies are much more aggressive, I'd assume.