Gzip, Bzip2 and Lzma compared


There has recently been discussion about GNU switching from bzip2 to lzma for its distributed tarballs. GNU still offers gzip tarballs as an alternative. Gentoo, however, has preferred the bzip2 tarballs, mostly because of bzip2's better pack ratio.

Unfortunately, the software for lzma is not (yet) as mature as some would like. For example, the format of the files it produces changed recently (in a compatible way, though). Also, the current incarnation of the canonical binaries (lzma-utils) links against libstdc++.so by default, which is a huge headache for release engineering and the like.

How these distribution problems can or will be solved remains to be seen. What I'm more interested in is a comparison of the performance of the three packers. I had initially hoped to also compare the amount of I/O done and the memory usage, but GNU time let me down there.

GNU time's manpage claims that it can record and output quite a few figures regarding I/O and memory usage. Unfortunately, I have not been able to make time report anything other than 0 for those interesting stats. Not wanting to debug time, I've chosen to test pack ratio and execution time instead.
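To make the two measurements concrete, here is a minimal sketch of how pack ratio and execution time could be collected for the three packers. It is not the setup used for the tests in this post: it assumes Python's standard-library gzip, bz2 and lzma modules instead of the command-line binaries, it defines pack ratio as packed size divided by original size, and the function name compare_packers and the sample input are made up for illustration.

```python
import bz2
import gzip
import lzma
import time


def compare_packers(data: bytes) -> dict:
    """Compress `data` with each packer; record packed size, ratio and wall time.

    Hypothetical helper for illustration only. "ratio" is packed size divided
    by original size, so smaller is better.
    """
    packers = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        # FORMAT_ALONE selects the legacy .lzma container (an assumption about
        # which format is of interest); the module default would be .xz.
        "lzma": lambda raw: lzma.compress(raw, format=lzma.FORMAT_ALONE),
    }
    results = {}
    for name, compress in packers.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = {
            "packed_size": len(packed),
            "ratio": len(packed) / len(data),
            "seconds": elapsed,
        }
    return results


if __name__ == "__main__":
    # Made-up, highly repetitive sample input; real tarballs behave differently.
    sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000
    for name, stats in compare_packers(sample).items():
        print(f"{name:6s} {stats['packed_size']:8d} bytes  "
              f"ratio {stats['ratio']:.4f}  {stats['seconds']:.4f} s")
```

Note that the sketch times in-process library calls at each module's default compression level, so the absolute numbers will not match what the gzip, bzip2 and lzma binaries report for a real tarball.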
