Comment by xyzzy_plugh a day ago

tar/pax are kind of terrible formats. They are hard to implement correctly. I'm glad they are not used more often.

cpio is pretty reasonable though.

zip is actually pretty great and I've been growing increasingly fond of it over the years.

Joker_vD a day ago

The thing is, there is always tar(1) even in the most basic of distributions. And everyone uses tar.gz's or .bz2's or whatever for distributing all kinds of things, so tar is pretty ubiquitous. But the moment you want to do some C development, or anything binutils-related, nope: you install and use ar(1), which is used for literally one single purpose and nothing else. Because reasons.

  • hyperman1 19 hours ago

    I'm not sure how ar does it, but tar has no centralised directory. The only way to get to file 100 is to walk through the 99 files before it. This kills random-access speed.
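
For illustration, a minimal Python sketch of that linear walk. It assumes plain ustar headers only: GNU long-name entries and pax extended headers would be counted as ordinary members, and base-256 sizes are not decoded. `seek_to_member` is a made-up name, not part of any library.

```python
BLOCK = 512  # tar headers and data are laid out in 512-byte blocks


def seek_to_member(f, index):
    """Return (name, data_offset, size) for the index-th member (0-based).

    Assumption: every header is a plain ustar header. There is no central
    directory, so each earlier header has to be read to learn how far to skip.
    """
    offset = 0
    seen = 0
    while True:
        f.seek(offset)
        header = f.read(BLOCK)
        # A short read or an all-zero block means we ran off the end.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            raise IndexError("archive has fewer members than requested")
        name = header[0:100].rstrip(b"\0").decode("utf-8", "replace")
        # The size field is octal ASCII, padded with NULs and/or spaces.
        size_field = header[124:136].decode("ascii").strip("\0 ")
        size = int(size_field or "0", 8)
        if seen == index:
            return name, offset + BLOCK, size
        # Skip the header plus the data, rounded up to whole blocks.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
        seen += 1
```

Reaching member 100 this way costs 100 header reads and seeks, and on a compressed stream such as .tar.gz the skipped data still has to be decompressed.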

    • Joker_vD 42 minutes ago

      Ar puts a file called "/" as the first file of the archive. Inside, there is a number N, then a list of N file offsets, and then a list of N null-terminated strings. It's a symbol table of sorts: each null-terminated string is a symbol name, and the corresponding file offset points at the archive header for the object file that contains the symbol. The filenames themselves are not recorded centrally, since that's not really needed.
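
A rough Python sketch of reading that table, assuming the GNU/System V flavour of ar with 32-bit big-endian integers (BSD ar stores its table differently, and there is a separate 64-bit variant). `read_ar_symbol_table` is a hypothetical helper name.

```python
import struct


def read_ar_symbol_table(data):
    """Parse the "/" member at the start of an ar archive held in bytes.

    Assumption: GNU/System V layout with 4-byte big-endian count and offsets.
    Returns a list of (symbol_name, member_header_offset) pairs.
    """
    if data[:8] != b"!<arch>\n":
        raise ValueError("not an ar archive")
    header = data[8:68]  # the first member header is 60 bytes
    name = header[0:16].rstrip(b" ")
    if name != b"/" or header[58:60] != b"`\n":
        raise ValueError("first member is not a symbol table")
    size = int(header[48:58].decode("ascii").strip())
    body = data[68:68 + size]
    # N, then N offsets, then N null-terminated symbol names.
    (count,) = struct.unpack_from(">I", body, 0)
    offsets = struct.unpack_from(">%dI" % count, body, 4)
    names = body[4 + 4 * count:].split(b"\0")[:count]
    return [(n.decode("ascii", "replace"), off)
            for n, off in zip(names, offsets)]
```

Each returned offset points at a member header, so a linker can jump straight to the object file that defines a symbol without scanning the whole archive.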

yjftsjthsd-h a day ago

> tar/pax are kind of terrible formats. They are hard to implement correctly. I'm glad they are not used more often.

I'll grant you "kind of terrible", but what's hard to correctly implement about tar? It's just a bunch of files concatenated together with a tiny chunk of metadata stuck on the front of each.
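
To make the "header plus data" layout concrete, here is a minimal Python sketch that writes one regular file as a ustar member. It assumes a name of at most 100 bytes and a payload under 8 GiB, and it hard-codes mode, owner, and mtime. `ustar_member` is an illustrative name, not an existing API.

```python
def ustar_member(name, payload):
    """Build one ustar member: a 512-byte header plus zero-padded data.

    Assumptions: regular file, name fits in 100 bytes, size fits in
    11 octal digits, and mode/uid/gid/mtime are fixed placeholder values.
    """
    name_b = name.encode("utf-8")
    if len(name_b) > 100:
        raise ValueError("name too long for a plain ustar header")
    header = bytearray(512)
    header[0:len(name_b)] = name_b
    header[100:108] = b"0000644\0"                 # mode
    header[108:116] = b"0000000\0"                 # uid
    header[116:124] = b"0000000\0"                 # gid
    header[124:136] = b"%011o\0" % len(payload)    # size, octal ASCII
    header[136:148] = b"%011o\0" % 0               # mtime
    header[148:156] = b" " * 8                     # checksum counted as spaces
    header[156:157] = b"0"                         # typeflag: regular file
    header[257:263] = b"ustar\0"                   # magic
    header[263:265] = b"00"                        # version
    # Checksum is the byte sum of the header with the field set to spaces.
    header[148:156] = b"%06o\0 " % sum(header)
    padding = -len(payload) % 512
    return bytes(header) + payload + b"\0" * padding


# An archive is the members back to back, ended by two all-zero blocks.
archive = ustar_member("hello.txt", b"hello, tar") + b"\0" * 1024
```

The writing side really is this small. Everything the sketch leaves out (long names, pax records, sparse files, vendor extensions) is what a reader has to cope with.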

  • electroly a day ago

    Having never done it myself, I don't know, but I do know that the "microtar" library I picked up off GitHub is buggy when expanding GNU Tar archives but perfect when expanding its own archives. Correctly creating one valid archive is a lot easier than reliably extracting all valid archives. The code appeared competent, so I assume tar just has a bunch of historical baggage that you can get wrong or fail to implement.