Comment by defanor 11 days ago

I regularly store and update backups of public information, since losing access to much of the Internet completely is not such a remote possibility here: many resources are blocked already, and both proxying services and protocols are being blocked as well. I store those backups together with my personal data backups.

The things I store are those that seem valuable and information-dense, the kinds that I would be able to use during a relatively prolonged isolation. Storage space is limited, and redundancy is important for backups, so, up to a point, extra copies of important information are preferable to adding less important information. That is, one may consider tiered backups.
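To make the tiered idea concrete, here is a rough Python sketch of how one might split a fixed amount of storage between tiers. Everything in it is an assumption for illustration: the `Tier` class, the `plan_copies` function, the tier names, and the sizes are made up, not measurements of anything. It simply goes through the tiers from most to least important and gives each as many of its wanted copies as still fit.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    size_gib: int        # size of one copy, in GiB (hypothetical figure)
    wanted_copies: int   # how many copies one would like to keep


def plan_copies(tiers, capacity_gib):
    """Greedily allocate copies, most important tier first.

    `tiers` must be ordered from most to least important.
    Returns (plan, free), where plan maps tier name -> copies that fit
    and free is the space left over, in GiB.
    """
    remaining = capacity_gib
    plan = {}
    for tier in tiers:
        # How many whole copies of this tier still fit in the remaining space.
        affordable = int(remaining // tier.size_gib)
        copies = min(tier.wanted_copies, affordable)
        plan[tier.name] = copies
        remaining -= copies * tier.size_gib
    return plan, remaining


if __name__ == "__main__":
    # Illustrative numbers only.
    tiers = [
        Tier("personal", 50, 3),
        Tier("reference", 120, 2),
        Tier("extras", 300, 1),
    ]
    plan, free = plan_copies(tiers, capacity_gib=500)
    print(plan, free)
```

With these made-up numbers the least important tier gets no space at all, which is the trade-off described above: the extra copies of the important tiers win over adding the less important one.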

Wikipedia and Project Gutenberg, perhaps as OpenZIM archives (for Kiwix, making them more readily accessible), look like good starting points, along with other Wikimedia projects (e.g., Wikisource, Wiktionary; also available as OpenZIM archives). A music collection is a part of my personal backups. Then there are textbooks: OpenStax provides good ones under the CC BY license, LibreTexts books are of variable quality but also worth looking into, while Wikibooks are mostly disappointing. Then one may consider copyright-infringing book libraries, if one is fine with those. A few hundred gibibytes seem sufficient for a decent stockpile, including a good chunk of human knowledge, and providing plenty to do alone (read and study, that is).
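To check how a stockpile compares against a "few hundred gibibytes" budget, a small script can total the size of each collection. This is only a sketch and assumes a layout that is not described above: one backup root with one subdirectory per collection (for example, ZIM archives in one and textbooks in another). The function name `collection_sizes` is hypothetical.

```python
import os
import sys
from pathlib import Path

GIB = 1024 ** 3


def collection_sizes(root):
    """Return {subdirectory name: total bytes of regular files under it}."""
    sizes = {}
    for entry in Path(root).iterdir():
        if not entry.is_dir():
            continue  # only top-level directories count as collections
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not os.path.islink(path):  # skip symlinks to avoid double counting
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    sizes = collection_sizes(root)
    # Largest collections first, then the grand total.
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:30s} {size / GIB:8.2f} GiB")
    print(f"{'total':30s} {sum(sizes.values()) / GIB:8.2f} GiB")
```

Run it with the backup root as the only argument; it reports apparent file sizes, so the actual disk usage may differ somewhat depending on the filesystem.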

Textbooks could be much more lightweight if their sources (e.g., in LaTeX) were provided rather than PDFs, but unfortunately, even for those under permissive licenses, usually only PDFs are available, which hinders both printing (as another form of backup) and regular digital data storage.

I expect software repositories to be among the last things the government blocks, so I am not backing those up yet, but mirroring, say, the Debian archive (including sources) may be a good idea for such a situation, or when preparing for the Internet to go down.

If one has a lot of extra storage available, other easily available large data dumps to consider are Common Crawl, arXiv bulk data downloads, complete OSM data, huge copyright-infringing libraries, and videos: plenty of nice YouTube channels and TV series.