Nevermark 4 days ago

Given enough of them, some fraction will always be down. It would be helpful if we had a site that could track that ratio.

hirako2000 4 days ago

It's a centralization vs decentralisation vs distributed system question.

Since down detectors serve to detect failures of centralized (and decentralized systems) the idea would be to at least get that right: a distributed system to detect outages.

You basically run detectors that heartbeat each others. Just a few suffice.

Once you start to see clusters of detectors go silent, you can assume things are falling apart, which is fine so long as a few remain.

Self healing also helps to make the web of nodes resilient to inevitable infrastructure failures.

meken 4 days ago

We could create a linked list of these and just refer to the N’th one as N-down detector.