Llamamoe 2 hours ago

Depending on how important it is for you to maintain original quality, I've had good luck with a combination of prerendering complex content, reducing the DPI and colour depth of images, and recombining them back into PDFs; which steps pay off varies from file to file.
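
For the "reduce DPI and colour depth" part, something like this works (a rough sketch - assumes Ghostscript is on PATH, and the filenames and the 150 dpi target are just placeholders):

    import subprocess

    def shrink_pdf(src, dst, dpi=150):
        # Re-encode the PDF with Ghostscript's pdfwrite device.
        # /ebook is a built-in medium-quality preset; the explicit
        # Downsample* flags force image resolution down to `dpi`.
        subprocess.run([
            "gs", "-sDEVICE=pdfwrite",
            "-dPDFSETTINGS=/ebook",
            "-dDownsampleColorImages=true",
            f"-dColorImageResolution={dpi}",
            "-dDownsampleGrayImages=true",
            f"-dGrayImageResolution={dpi}",
            "-dNOPAUSE", "-dQUIET", "-dBATCH",
            f"-sOutputFile={dst}", src,
        ], check=True)

    shrink_pdf("scan.pdf", "scan-small.pdf")

That only covers the recompression; prerendering complex vector pages to images is a separate pass.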

You could probably automate identifying different editions of the same content fairly easily, and e.g. keep only a single epub with small images rather than six other copies plus three more PDFs as well.
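
The edition-matching could start as crude as grouping on a normalised filename. All of this is a hypothetical heuristic - the key function and the prefer-epub rule would need tuning per collection:

    import os, re
    from collections import defaultdict

    def edition_key(filename):
        # Strip extension, bracketed edition/format noise, punctuation,
        # so "Foo (2nd ed).pdf" and "foo.epub" land in the same group.
        stem = os.path.splitext(filename)[0].lower()
        stem = re.sub(r"\(.*?\)|\[.*?\]", " ", stem)
        return re.sub(r"[^a-z0-9]+", " ", stem).strip()

    def pick_keepers(paths):
        groups = defaultdict(list)
        for p in paths:
            groups[edition_key(os.path.basename(p))].append(p)
        keep = []
        for dupes in groups.values():
            # Prefer an epub if one exists, then the smallest file.
            epubs = [p for p in dupes if p.endswith(".epub")]
            keep.append(min(epubs or dupes, key=os.path.getsize))
        return keep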

mmooss 8 hours ago

> eliminate large PDFs

How large? Isn't that going to be an arbitrary filter on books? In other domains, large PDFs tend to come from production errors - needless colour, needlessly high resolution - rather than from the volume of content, at least for text.
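
You could measure that before deciding what to cut. A sketch with pikepdf - treat the accounting as approximate, since images shared across pages get counted once per page that uses them:

    import os
    import pikepdf

    def image_share(path):
        # Fraction of the file's bytes taken up by compressed image
        # streams. High share = bloated scan; low share on a big file
        # suggests it's genuinely a lot of text.
        total = os.path.getsize(path)
        img_bytes = 0
        with pikepdf.open(path) as pdf:
            for page in pdf.pages:
                for name in page.images:
                    img_bytes += len(page.images[name].read_raw_bytes())
        return img_bytes / total

    print(f"{image_share('book.pdf'):.0%} of bytes are images")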

cookiengineer 8 hours ago

Let me know about those efforts - I want an English/German/French backup of the archive, too. But as you said, HDDs and filesystems are the real problem.

Maybe I'll have to build a torrent splitter or something, because the UIs of all torrent clients are just not built for that.
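
The selection half of a splitter doesn't actually need a client - the file list with sizes is right there in the .torrent. Minimal sketch with a hand-rolled bencode decoder (no dependencies; only handles well-formed files):

    def bdecode(data, i=0):
        # Bencode: i<int>e, <len>:<bytes>, l<items>e, d<pairs>e.
        c = data[i:i + 1]
        if c == b"i":
            j = data.index(b"e", i)
            return int(data[i + 1:j]), j + 1
        if c in (b"l", b"d"):
            items, i = [], i + 1
            while data[i:i + 1] != b"e":
                v, i = bdecode(data, i)
                items.append(v)
            if c == b"l":
                return items, i + 1
            return dict(zip(items[0::2], items[1::2])), i + 1
        j = data.index(b":", i)
        n = int(data[i:j])
        return data[j + 1:j + 1 + n], j + 1 + n

    def torrent_files(path):
        # Returns [(relative_path, size_in_bytes), ...].
        with open(path, "rb") as fh:
            meta, _ = bdecode(fh.read())
        info = meta[b"info"]
        if b"files" in info:  # multi-file torrent
            return [(b"/".join(e[b"path"]).decode(), e[b"length"])
                    for e in info[b"files"]]
        return [(info[b"name"].decode(), info[b"length"])]

Actually fetching only the chosen files still needs a client with per-file selection, but at least the planning can happen outside the UI.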

brador 2 hours ago

Invert the list, start with the smallest, continue until full.
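
As a sketch on top of a (path, size) list: sort ascending and take files while they fit. Once one file doesn't fit, nothing later will either, since everything after it is larger:

    def smallest_first(files, budget):
        # files: iterable of (path, size); budget: free bytes available.
        # Greedy smallest-first maximizes the *number* of items kept.
        chosen, used = [], 0
        for path, size in sorted(files, key=lambda f: f[1]):
            if used + size > budget:
                break
            chosen.append(path)
            used += size
        return chosen

    # keep = smallest_first(files, budget=4 * 1000**4)  # ~4 TB drive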