How to create Debian packages with alternative compression methods

While gzip is the standard Unix tool when it comes to compression, other tools are available and some of them achieve a better compression ratio than gzip. This article explains where you can make use of them in your Debian packaging work.

In the source package

A source package is composed of multiple files. The .dsc file is always uncompressed, which is fine since it’s a small text file. The upstream tarball can be compressed with gzip (orig.tar.gz), bzip2 (orig.tar.bz2), lzma (orig.tar.lzma) or xz (orig.tar.xz), so if upstream provides the tarball compressed with several tools, choose the one that you want. Put it in the right place (alongside the unpacked source directory) and dpkg-source will automatically use it. Note however that packages using source format “1.0” are restricted to gzip, and the main Debian archive currently only allows gzip and bzip2 (xz might be allowed later) even though the source format “3.0 (quilt)” supports all of them.
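To check which upstream tarball dpkg-source would have to choose from, you can look for the expected file names yourself. This is only a sketch: find_orig_tarball is a hypothetical helper name, and it assumes the usual <source>_<upstream-version>.orig.tar.<ext> naming.

```shell
# find_orig_tarball DIR SOURCE VERSION   (hypothetical helper)
# Print every upstream tarball found in DIR for the given source
# package name and upstream version, one per line.
find_orig_tarball() {
    dir=$1 src=$2 version=$3
    for ext in gz bz2 lzma xz; do
        candidate="$dir/${src}_${version}.orig.tar.$ext"
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
        fi
    done
}
```

For example, "find_orig_tarball .. hello 2.10" run from inside an unpacked hello-2.10 tree (a made-up package) lists the candidate tarballs in the parent directory.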

The Debian packaging files are provided either in a .diff.gz file for source format “1.0” (again, only gzip is supported) or in a .debian.tar file for source format “3.0 (quilt)”. The latter tarball can be compressed with the tool of your choice; you just have to tell dpkg-source which one to use (see below, and note that gzip is the default).

In a native package, dpkg-source must generate the main tarball and you can instruct it to use a tool other than gzip with the --compression option. That option is usually put in debian/source/options:

# Use bzip2 instead of gzip
compression = "bzip2"
compression-level = 9

For “3.0 (quilt)” source packages, this option is not very useful as the debian tarball that gets compressed is usually not very large. But some maintainers like to use the same compression tool for the upstream tarball and the debian tarball, so you can use this option to harmonize both.

In native packages, it’s much more interesting: for instance, the size of dpkg’s source package was reduced by 30% by switching to bzip2, saving 2 MB of disk space.
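For a one-off experiment, the same settings can be given directly on the dpkg-source command line instead of in debian/source/options. A minimal sketch, where build_native_source is a hypothetical wrapper name and the source directory is whatever unpacked native package you want to build:

```shell
# build_native_source TOOL LEVEL SRCDIR   (hypothetical wrapper)
# Build a source package from SRCDIR, compressing the generated
# tarball with TOOL at the given compression LEVEL.
build_native_source() {
    tool=$1 level=$2 srcdir=$3
    dpkg-source --compression="$tool" --compression-level="$level" -b "$srcdir"
}
```

Calling "build_native_source bzip2 9 mypackage-1.0" (mypackage-1.0 being a made-up directory) should then produce a .tar.bz2 next to the .dsc, which you can compare against the gzip result.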

In the binary packages

.deb files also contain compressed tar archives and by default they use gzip as well:

$ ar t dpkg_1.15.9_i386.deb 
debian-binary
control.tar.gz
data.tar.gz

data.tar.gz is the archive that contains all the files to be installed, and it’s the one that you can compress with another tool if you want. Again, this is mostly interesting for (very) large packages where the size difference clearly justifies deviating from the default compression tool. Try it out and see how many megabytes you can shave off. Another aspect to keep in mind is that those alternative tools might use a significant amount of memory to do their job, both for compression and decompression. So if your package is meant to be installed on embedded platforms, or if you want to build your package on low-end hardware with little memory, you might want to stick with gzip.
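One quick way to try it out before touching your packaging is to compress the same file with each tool and compare the sizes. This is just a sketch: compare_compressors is a hypothetical helper, it skips any tool that is not installed, and the default level of 6 is an arbitrary choice.

```shell
# compare_compressors FILE [LEVEL]   (hypothetical helper)
# Print the size in bytes of FILE, then its size once compressed
# with each available tool at the given level (default: 6).
compare_compressors() {
    file=$1
    level=${2:-6}
    printf '%-8s %s\n' "original" "$(wc -c < "$file")"
    for tool in gzip bzip2 xz; do
        # Skip tools that are not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        size=$("$tool" -"$level" -c "$file" | wc -c)
        printf '%-8s %s\n' "$tool" "$size"
    done
}
```

To test it on a real package, you could first extract the uncompressed archive with something like "ar p dpkg_1.15.9_i386.deb data.tar.gz | gunzip > data.tar" and then run "compare_compressors data.tar".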

Now how do you change the compression tool? Easy: dpkg-deb supports a -Z option, so you just have to pass “-Zbzip2”, for example. You can also pass “-z6” to change the compression level to 6 (this is interesting because a lower compression level might require less memory, depending on the tool used). The dpkg-deb invocation is typically hidden behind the call to dh_builddeb in your debian/rules, so you have to replace that invocation with “dh_builddeb -- -Zbzip2”.
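If you want to experiment outside of debian/rules, you can call dpkg-deb by hand and then check the result. The sketch below is only illustrative: build_deb and data_compression are hypothetical helper names, and which values -Z accepts depends on your dpkg version.

```shell
# build_deb TOOL LEVEL PKGDIR OUTPUT   (hypothetical wrapper)
# Build OUTPUT from the package tree in PKGDIR, compressing with
# TOOL at the given compression LEVEL.
build_deb() {
    tool=$1 level=$2 pkgdir=$3 output=$4
    dpkg-deb -Z"$tool" -z"$level" --build "$pkgdir" "$output"
}

# data_compression   (hypothetical helper)
# Read the member list printed by "ar t" on standard input and
# print the tool used to compress the data archive.
data_compression() {
    while read -r member; do
        case $member in
            data.tar.gz)   echo gzip ;;
            data.tar.bz2)  echo bzip2 ;;
            data.tar.lzma) echo lzma ;;
            data.tar.xz)   echo xz ;;
            data.tar)      echo none ;;
        esac
    done
}
```

After something like "build_deb bzip2 6 debian/tmp ../mypackage_1.0_i386.deb" (paths made up for the example), running "ar t ../mypackage_1.0_i386.deb | data_compression" should print bzip2.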

If you are using a debhelper 7 tiny rules file, you have to add an override like in this example:

%:
	dh $@

override_dh_builddeb:
	dh_builddeb -- -Zbzip2

If you are using CDBS, you have to set the variable DEB_DH_BUILDDEB_ARGS:

include /usr/share/cdbs/1/rules/debhelper.mk
[...]
DEB_DH_BUILDDEB_ARGS = -- -Zbzip2

I hope you found this article helpful.