While gzip is the standard Unix compression tool, other tools are available and some of them achieve a better compression ratio than gzip. This article explains where you can make use of them in your Debian packaging work.
In the source package
A source package is composed of multiple files. The .dsc file is always uncompressed, which is fine since it is a small text file. The upstream tarball can be compressed with gzip (orig.tar.gz), bzip2 (orig.tar.bz2), lzma (orig.tar.lzma) or xz (orig.tar.xz), so if upstream provides the tarball compressed with several tools, choose the one you prefer. Put it in the right place and dpkg-source will automatically use it. Note however that packages using source format “1.0” are restricted to gzip, and the main Debian archive currently only allows gzip and bzip2 (xz might be allowed later), even though the source format “3.0 (quilt)” supports all of them.
The Debian packaging files are provided either in a .diff.gz file for source format “1.0” (again, only gzip is supported) or in a .debian.tar file for source format “3.0 (quilt)”. The latter tarball can be compressed with the tool of your choice; you just have to tell dpkg-source which one to use (see below; gzip is the default).
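When upstream offers several formats, or when you repack a tarball yourself, it can help to measure the difference before choosing. The following is only a rough sketch: compare_compressors is a hypothetical helper name (it is not part of dpkg), and the default level of 6 is an assumption chosen to keep memory use moderate.

```shell
#!/bin/sh
# Sketch: compress one file with each installed tool and print the
# resulting size in bytes. compare_compressors is a hypothetical
# helper, not a dpkg command.
compare_compressors() {
    tarball=$1
    level=${2:-6}   # assumed default level; pass 9 for the best ratio
    for tool in gzip bzip2 xz; do
        # Skip tools that are not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        # -c writes to stdout, so the original file is left untouched.
        size=$("$tool" "-$level" -c "$tarball" | wc -c | tr -d ' ')
        printf '%s %s\n' "$tool" "$size"
    done
}
```

Called as `compare_compressors foo_1.0.orig.tar` (an uncompressed tarball with an illustrative name), it prints one line per available tool, so you can see whether the gain justifies leaving gzip.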
In a native package, dpkg-source must generate the main tarball, and you can instruct it to use a tool other than gzip with the --compression option. That option is usually put in debian/source/options:

# Use bzip2 instead of gzip
compression = "bzip2"
compression-level = 9
For “3.0 (quilt)” source packages, this option is not very useful as the debian tarball that gets compressed is usually not very large. But some maintainers like to use the same compression tool for the upstream tarball and the debian tarball, so you can use this option to harmonize both.
In native packages, it’s much more interesting: for instance, the size of dpkg’s source package has been reduced by 30% by switching to bzip2, saving 2 MB of disk space.
In the binary packages
.deb files also contain compressed tar archives and by default they use gzip as well:
$ ar t dpkg_1.15.9_i386.deb
debian-binary
control.tar.gz
data.tar.gz
data.tar.gz is the archive that contains all the files to be installed, and it’s the one that you can compress with another tool if you want. Again, this is mostly interesting for (very) large packages where the size difference clearly justifies deviating from the default compression tool. Try it out and see how many megabytes you can shave off. Another aspect to keep in mind is that those alternative tools might use a significant amount of memory to do their job, both for compression and decompression. So if your package is meant to be installed on embedded platforms, or if you want to build your package on low-end hardware with little memory, you might want to stick with gzip.
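To check which tool an existing .deb uses, you can look at the suffix of its data member. The helper below is a minimal sketch: data_compression is a hypothetical name, and it assumes the member is called data.tar plus the usual suffix of the tool.

```shell
#!/bin/sh
# Sketch: report how the data member of a .deb is compressed.
# Reads the member list, as printed by "ar t", on standard input.
# data_compression is a hypothetical helper, not part of dpkg.
data_compression() {
    while IFS= read -r member; do
        case $member in
            data.tar.gz)   echo gzip ;;
            data.tar.bz2)  echo bzip2 ;;
            data.tar.lzma) echo lzma ;;
            data.tar.xz)   echo xz ;;
            data.tar)      echo none ;;
        esac
    done
}
```

For example, `ar t dpkg_1.15.9_i386.deb | data_compression` prints gzip for the listing shown above.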
Now how do you change the compression tool? Easy: dpkg-deb supports a -Z option, so you just have to pass “-Zbzip2”, for example. You can also pass “-z6” to change the compression level to 6 (this is interesting because a lower compression level might require less memory, depending on the tool used). The dpkg-deb invocation is typically hidden behind the call to dh_builddeb in your debian/rules, so you have to replace that invocation with “dh_builddeb -- -Zbzip2”.
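The -Z and -z options can be combined. As an illustration only, the sketch below assembles the arguments that follow dh_builddeb; builddeb_args is a hypothetical helper name, and the list of accepted tools is an assumption based on the tools mentioned in this article.

```shell
#!/bin/sh
# Sketch: print the arguments to append to dh_builddeb for a given
# compression tool and optional level. builddeb_args is a
# hypothetical helper, not part of debhelper or dpkg.
builddeb_args() {
    tool=$1
    level=${2:-}
    case $tool in
        gzip|bzip2|lzma|xz) ;;
        *) echo "unsupported compression tool: $tool" >&2; return 1 ;;
    esac
    if [ -n "$level" ]; then
        # Everything after "--" is passed through to dpkg-deb.
        printf '%s\n' "-- -Z$tool -z$level"
    else
        printf '%s\n' "-- -Z$tool"
    fi
}
```

Here `builddeb_args bzip2 6` prints `-- -Zbzip2 -z6`, which is what you would place after dh_builddeb in debian/rules.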
If you are using a debhelper 7 tiny rules file, you have to add an override like in this example:
%:
	dh $@

override_dh_builddeb:
	dh_builddeb -- -Zbzip2
If you are using CDBS, you have to set the variable DEB_DH_BUILDDEB_ARGS:
include /usr/share/cdbs/1/rules/debhelper.mk
[...]
DEB_DH_BUILDDEB_ARGS = -- -Zbzip2
I hope you found this article helpful.
Flamarion Jorge says
Thanks for all the tips.
For people like me who are learning to create Debian packages and want to be maintainers someday, these tips are valuable.
Adnan Hodzic says
As always, another great tutorial. Raphaël thank you for your great work 🙂
However, I ran into a problem when working with git-buildpackage. Right at the start, even though a .tar.bz2 was already there, it would immediately create another .tar.gz:
"gbp:info: file.orig.tar.gz does not exist, creating from 'upstream/0.8'"
After that, dpkg-source would give me an error saying that several orig.tar files were found (./file.orig.tar.bz2 and ./file.orig.tar.gz) but only one is allowed. Of course, after I dropped the .tar.gz and ran git-buildpackage again, the .tar.gz would be created again, and I was stuck in some sort of infinite loop 🙂
The way to bypass this is to run git-buildpackage with the “--git-compression=bzip2” option (git-buildpackage --git-compression=bzip2).
Hope this helps if anyone gets into the same situation as I did 🙂