How Ubuntu builds upon Debian

I have been asked how Ubuntu relates to Debian, and how packages flow from one to the other. So here’s my attempt at clarifying the whole picture.

Where do the packages come from?

Most packages are created by Debian contributors and uploaded to Debian unstable (or Debian experimental). New packages are reviewed by the Debian ftpmasters before being accepted into the official archive. They are held in the NEW queue until the review is over; the time spent there varies between a few hours and a few months (usually they are processed within a week or two).

Ubuntu imports all the official Debian packages, but it also adds some packages of its own. About 7% of the Ubuntu packages are third-party software that has been packaged for Ubuntu but not for Debian.

What are the changes made by Ubuntu?

Of all the source packages coming from Debian, 17% have additional changes made by Ubuntu. Many of them are part of the “main” repository, which is actively maintained by Canonical and Ubuntu core developers. The “universe” repository is usually closer to the official Debian packages.
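
One practical way to spot the packages that carry Ubuntu changes is the version string: by convention, Ubuntu appends an “ubuntuN” suffix to the Debian revision of a package it has modified. The sketch below relies on that naming convention only. The function name and the version strings are made up for the illustration, so treat it as a heuristic rather than an authoritative check.

```shell
# Heuristic sketch: report whether a version string carries an Ubuntu suffix.
# has_ubuntu_changes is a hypothetical helper name, not an existing tool.
has_ubuntu_changes() {
    case "$1" in
        *ubuntu*) return 0 ;;   # e.g. 2.31.91-1ubuntu2: changed on the Ubuntu side
        *)        return 1 ;;   # e.g. 2.31.91-1: same source as in Debian
    esac
}

# Illustrative version strings only.
for version in 2.31.91-1 2.31.91-1ubuntu2; do
    if has_ubuntu_changes "$version"; then
        echo "$version: modified by Ubuntu"
    else
        echo "$version: unmodified Debian source"
    fi
done
```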

Many of the changes made by Ubuntu are the results of the decisions taken during the Ubuntu Developer Summit in order to reach specific goals: provide a better user interface, offer faster boot times, become a better platform for third-party software developers, offer a good integration with their online services (Launchpad, Ubuntu One), etc. Other changes are simply the result of fixing bugs reported by Ubuntu users.

Note that even non-modified source packages will result in different binary packages for Ubuntu. That’s because Ubuntu has made changes to the build environment. They only support Intel-based computers with a 686-class (or newer) CPU, they enable some compiler options that Debian doesn’t, etc. And all binary packages are modified by a program called pkgbinarymangler.

Ubuntu’s release cycle and the relation with Debian

Ubuntu releases every 6 months (that’s what time-based releases are about). Debian has a very different schedule. How does Ubuntu manage to reuse Debian’s work?

Ubuntu imports packages from Debian unstable (even experimental sometimes) to get the newest packages. If the Ubuntu package already has Ubuntu-specific changes, those changes are merged into the updated Debian package. Otherwise the Debian package is simply grabbed and rebuilt in Ubuntu. This works well because Debian unstable is much more usable than the name suggests. And this process only goes on during the first 2 months of the cycle (until the Debian Import Freeze), so there’s plenty of time afterward to fix the biggest problems.

In the third and fourth months, it’s still possible to pick updated packages from Debian, but a developer must request it: it won’t be done automatically. At the end of the fourth month, the feature freeze is put in place.

The 2 months left are dedicated to bug fixing and polishing the distribution. Various sub-freezes happen in this period; you can check the Natty release schedule as an example. Picking updated packages from Debian is now the exception: it is only allowed if the update on the Debian side is a bug-fix-only release.

Credits: some figures are taken from a talk by Lucas Nussbaum; they were collected from the packages available in the Lucid Lynx release of Ubuntu.

4 tips to maintain a “3.0 (quilt)” Debian source package in a VCS

Most Debian packages are managed with a version control system (VCS) like git, subversion, bazaar or mercurial. The particularities of the 3.0 (quilt) source format are not without consequences in terms of integration with the VCS. I’ll give you some tips to have a smoother experience.

All the samples given in the article assume that you use git as version control system.

1. Add .pc to the VCS ignore list

.pc is the directory used by quilt to store its internal data (list of applied patches, backup of modified files). It’s also created by dpkg-source so that quilt knows that the patches are in debian/patches (and not in patches which is the default directory used by quilt). For that reason, the directory is kept even if you unapply all the patches.

However you don’t want to store this directory in your repository, so it’s best to put it in the VCS ignore list. With git you simply do:

$ echo ".pc" >>.gitignore
$ git add .gitignore
$ git commit -m "Ignore quilt dir"

The .gitignore file is ignored by dpkg-source, so you’re not adding any noise to the generated source package.

2. Unapply patches after the build

If you store upstream sources with non-applied patches (most people do), and if you don’t build packages in a temporary build directory, then you probably want to unapply the patches after the build so that your repository is again in a clean status.

This is now the default since dpkg-source will unapply any patch that it had to apply by itself. Thus if you start the build with a clean tree, you’ll end up with a clean tree.

But you can still force dpkg-source to unapply patches by adding “unapply-patches” to debian/source/local-options:

$ echo "unapply-patches" >>debian/source/local-options
$ git add debian/source/local-options
$ git commit -m "Unapply patches after build"

svn-buildpackage always builds in a temporary directory, so the repository is left exactly as it was before the build and this option is useless there. git-buildpackage can also be told to build in a temporary directory with --git-export-dir=../build-area/ (../build-area/ is the directory used by svn-buildpackage, so this option makes git-buildpackage behave like svn-buildpackage in that respect).
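
If you don’t want to type the export directory on every build, git-buildpackage can also read it from a configuration file. Here is a minimal sketch, assuming a release of git-buildpackage that reads debian/gbp.conf and names the section [buildpackage] (older releases used [git-buildpackage]). Both names are assumptions to check against the gbp.conf manual page on your system.

```shell
# Sketch: record the export directory in the repository's gbp.conf.
# Section and option names are assumptions to verify against gbp.conf(5).
# Beware: this overwrites an existing debian/gbp.conf.
mkdir -p debian
cat >debian/gbp.conf <<'END'
[buildpackage]
export-dir = ../build-area/
END
```

Committing that file shares the setting with everybody working on the package.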

3. Manage your quilt patches as a git branch

Instead of using quilt to manage the Debian-specific patches, it’s possible to use git itself. git-buildpackage comes with gbp-pq (“Git-BuildPackage Patch Queue”): it can export the quilt series into a git branch that you can manipulate as you like. Each commit represents a patch, so you rebase that branch to edit intermediate commits. Check out the upstream documentation of this tool to learn how to work with it.

There’s an alternative tool as well: git-dpm. Its website explains the principle very well. It’s more complicated than gbp-pq, but it has the advantage of keeping the history of all branches used to generate the quilt series of all Debian releases. You might want to read the review written by Sam Hartman, which explains the limits of this tool.
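
A typical round trip looks roughly like the sketch below. It assumes the import and export subcommands of gbp-pq (recent releases of git-buildpackage spell the command “gbp pq”), and that your packaging branch is passed as the first argument. The run and edit_patch_queue helpers are names I made up for the illustration.

```shell
# Run a command, or only print it when DRY_RUN is set (handy for rehearsing).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: edit the quilt series as git commits, then regenerate it.
# $1 = name of the packaging branch the patch queue is based on
edit_patch_queue() {
    run gbp-pq import          # turn debian/patches into a patch-queue branch
    run git rebase -i "$1"     # reorder, squash or reword the patch commits
    run gbp-pq export          # go back and rewrite debian/patches
}
```

Calling DRY_RUN=1 edit_patch_queue master only prints the three commands, which is a safe way to check the sequence before running it on a real repository.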

4. Document how to review the changes

One of the main benefits of this new source format is that it’s easy to review changes, because changes to the upstream sources are kept as separate, properly documented patches (ideally using the DEP-3 format). With the tools above, the commit message becomes the patch header, so it’s important to write meaningful commit messages.

This works well as long as your workflow considers the Debian patches as a branch that you rebase on top of the upstream sources at each release. Some maintainers don’t like this workflow and prefer to have the Debian changes applied directly in the packaging branch. They switch to a new upstream version by merging it into their packaging branch. In that case, it’s difficult to generate a quilt series out of the VCS. Instead, you should instruct dpkg-source to store all the changes in a single patch (which is then similar to the good old .diff.gz) and document in the header of that patch how the changes can be better reviewed, for example in the VCS web interface. You do the former with the --single-debian-patch option and the latter by writing the header in debian/source/patch-header:

$ echo "single-debian-patch" >> debian/source/local-options
$ cat >debian/source/patch-header <<END
This patch contains all the Debian-specific
changes mixed together. To review them
separately, please inspect the VCS history
at http://git.debian.org/?=collab-maint/foo.git

END

Save disk space by excluding useless files with dpkg

Most packages contain files that you don’t need: for example translations in languages that you don’t understand, or documentation that you don’t read. Wouldn’t it be nice if you could get rid of them and save a few megabytes? Good news: since dpkg 1.15.8 you can!

dpkg has two options, --path-include=glob-pattern and --path-exclude=glob-pattern, that control which files are installed. The patterns work the same way as the ones you’re used to in the shell (see the glob(7) manual page).

Passing those options on the command-line would be impractical, so the best way to use them is to put them in a file in /etc/dpkg/dpkg.cfg.d/. Beware, the order of the options does matter: when a file matches several options, the last one makes the decision.

A typical usage is to first exclude a directory and then to re-include parts of that directory that you want to keep. For example if you want to drop gettext translations and translated manual pages except French, you could put this in /etc/dpkg/dpkg.cfg.d/excludes:

# Drop locales except French
path-exclude=/usr/share/locale/*
path-include=/usr/share/locale/fr/*
path-include=/usr/share/locale/locale.alias

# Drop translated manual pages except French
path-exclude=/usr/share/man/*
path-include=/usr/share/man/man[1-9]/*
path-include=/usr/share/man/fr*/*
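
To convince yourself of how the ordering plays out, here is a small simulation of the “last matching option wins” rule applied to the manual page rules above. It is only a sketch: it uses the shell’s own glob matching as a stand-in for dpkg’s matcher, and path_is_installed is a hypothetical helper, not something dpkg provides.

```shell
# Sketch of "the last matching option wins" for a single path.
# "exclude"/"include" stand for path-exclude/path-include.
path_is_installed() {
    path=$1
    verdict=include                      # files matching no rule are installed
    while read -r kind pattern; do
        case "$path" in
            $pattern) verdict=$kind ;;   # a later match overrides an earlier one
        esac
    done <<'END'
exclude /usr/share/man/*
include /usr/share/man/man[1-9]/*
include /usr/share/man/fr*/*
END
    [ "$verdict" = include ]
}

for path in /usr/share/man/man1/ls.1.gz /usr/share/man/de/man1/ls.1.gz; do
    if path_is_installed "$path"; then
        echo "kept:    $path"
    else
        echo "dropped: $path"
    fi
done
```

The English page matches the exclusion first and is rescued by the later include, while the German one only ever matches the exclusion.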

Note that the files will vanish progressively every time that a package is upgraded. If you want to save space immediately, you have to reinstall the packages present in your system. aptitude reinstall or apt-get --reinstall install might help. In theory with aptitude you can even do aptitude reinstall ~i but it tends to not work because one package is not available (either because it was installed manually or because the installed version has been superseded by a newer version on the mirror).

5 reasons why a Debian package is more than a simple file archive

You’re probably manipulating Debian packages every day, but do you know what those files are? This article will show you their bowels… Surely they are more than file archives, otherwise we would just use TAR archives (you know, those files ending with .tar.gz). Let’s have a look!

1. It’s two TAR file archives in an AR file archive!

A .deb file is actually an archive using the AR format, so you can manipulate it with the ar command. This archive contains 3 files. You can check it yourself: download any .deb file and run “ar t” on it:

$ ar t gwibber_2.31.91-1_all.deb
debian-binary
control.tar.gz
data.tar.gz

debian-binary is a text file indicating the version of the .deb file format; the current version is “2.0”.

$ ar p gwibber_2.31.91-1_all.deb debian-binary
2.0

data.tar.gz contains the real files of the package; the content of that archive gets installed in your root directory when you run “dpkg --unpack”.

But the most interesting part, which truly makes .deb files more than a file archive, is the remaining file. control.tar.gz contains meta-information used by the package manager. What does it hold?

$ ar p gwibber_2.31.91-1_all.deb control.tar.gz | tar tzf -
./
./postinst
./prerm
./preinst
./postrm
./conffiles
./md5sums
./control

2. It contains meta-information defining the package and its relationships

The control file within the control.tar.gz archive is the most fundamental file. It contains basic information about the package like its name, its version, its description, the architecture it runs on, who is maintaining it and so on. It also contains dependency fields so that the package manager can ensure that everything needed by the package is installed before-hand. If you want to learn more about those fields, you can check Binary control files in the Debian Policy.

This information ends up in /var/lib/dpkg/status once the package is installed.
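
To see those fields for yourself, you can build a throwaway package and query it with dpkg-deb. This is only a sketch: the package name, the maintainer and the dependency are invented for the illustration, and the block does nothing on a system without dpkg-deb.

```shell
# Sketch: build a throwaway package and read its metadata back.
# Every name below (hello-demo, the maintainer address) is made up.
if command -v dpkg-deb >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    mkdir -p "$workdir/hello-demo/DEBIAN"
    chmod 755 "$workdir/hello-demo/DEBIAN"   # dpkg-deb rejects stricter modes
    cat >"$workdir/hello-demo/DEBIAN/control" <<'END'
Package: hello-demo
Version: 1.0-1
Architecture: all
Maintainer: Jane Doe <jane@example.org>
Depends: dpkg (>= 1.15.8)
Description: hypothetical package illustrating the control file
END
    deb="$workdir/hello-demo_1.0-1_all.deb"
    dpkg-deb --build "$workdir/hello-demo" "$deb" >/dev/null

    dpkg-deb --info "$deb"     # size of the control archive and the control file
    # With a single field name, only the value of that field is printed.
    demo_version=$(dpkg-deb --field "$deb" Version)
    demo_depends=$(dpkg-deb --field "$deb" Depends)
    echo "version=$demo_version depends=$demo_depends"
    rm -rf "$workdir"
fi
```

For a package that is already installed, dpkg -s followed by the package name shows the same fields as recorded in the status file.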

3. It contains maintainer scripts so that everything can just work out of the box

At various steps of the installation/upgrade/removal process, dpkg executes the maintainer scripts provided by the package:

  • postinst: after installation
  • preinst: before installation
  • postrm: after removal
  • prerm: before removal

Note that this description is largely simplified. In fact the scripts are executed on many other occasions with different parameters. There’s an entire chapter of the Debian Policy dedicated to this topic. But you might find this wiki page easier to grasp: http://wiki.debian.org/MaintainerScripts.

While this looks scary, it’s a very important feature. It’s required to cope with non-backwards compatible upgrades, to provide automatic configuration, to create system users on the fly, etc.
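
To give an idea of those parameters, here is a minimal sketch of the dispatch found in a typical postinst. The messages are placeholders of mine; the argument names (configure, abort-upgrade, abort-remove, abort-deconfigure) are the ones documented in the Debian Policy.

```shell
# Sketch of a postinst body. A real script would start with "#!/bin/sh" and
# "set -e", and end by calling: postinst "$@"
postinst() {
    case "$1" in
        configure)
            # $2 is the most recently configured version, empty on first install
            if [ -z "$2" ]; then
                echo "fresh installation"
            else
                echo "upgrade from $2"
            fi
            ;;
        abort-upgrade|abort-remove|abort-deconfigure)
            # Called while dpkg unwinds a failed operation: nothing to do here.
            ;;
        *)
            echo "postinst called with unknown argument '$1'" >&2
            return 1
            ;;
    esac
}
```

The second argument is what lets a script react differently to a first installation and to an upgrade from a specific older version.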

4. Configuration files are special files

Unpacking a file archive overwrites the previous version of the files. This is the desired behavior when you upgrade a package, except for configuration files. You prefer not to lose your customizations, don’t you?

That’s why packages can list configuration files in the conffiles file provided by control.tar.gz. That way dpkg will deal with them in a special way.

5. You can always add new meta-information

And in fact many tools already exploit the possibility to provide supplementary files in control.tar.gz:

  • debsums uses the md5sums file to ensure no files were accidentally modified
  • dpkg-shlibdeps uses shlibs and symbols files to generate dependencies on libraries
  • debconf uses config scripts to collect configuration information from the user

Once installed, those files are kept by dpkg in /var/lib/dpkg/info/package.* along with maintainer scripts.

Managing distribution-specific patches with a common source package

In the comments of the article explaining how to generate different dependencies on Debian and Ubuntu with a common source package, I got asked if it was possible to apply a patch only in some distribution. And indeed it is.

The source package format 3.0 (quilt) has a neat feature for this. Instead of unconditionally using debian/patches/series to look up patches, dpkg-source first tries to use debian/patches/vendor.series (where vendor is ubuntu, debian, etc.). Note that dpkg-source does not stack patches from multiple series files: it uses a single series file, the first one that exists.

So what’s the best way to use this? Debian should always provide debian/patches/series, since it is supposed to hold the default set of patches. Any derivative cooperating with Debian can maintain its own series file within the common VCS repository used for package maintenance. It can drop Debian-specific patches (say branding patches for example), and it can add its own on top of the remaining Debian patches.

It’s worth noting that it’s the job of the maintainers to keep both series files in sync when needed. dpkg-source offers no way to have stacked series files (or dependencies between them).

If you want to use quilt to edit an alternate series file, you can temporarily set the QUILT_SERIES environment variable to “vendor.series”. Just make sure to start from a clean state, i.e. no patches applied. Otherwise quilt will be confused by the sudden mismatch between the series file and its internal data (stored in the .pc directory).
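
The lookup order described in this article can be sketched in a few lines of shell. pick_series is a hypothetical helper written for the illustration (dpkg-source does this internally); the vendor name is passed explicitly so that the sketch works anywhere.

```shell
# Sketch of the lookup order: <vendor>.series first, plain series otherwise.
# $1 = vendor name, $2 = patch directory (defaults to debian/patches)
pick_series() {
    vendor=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    patchdir=${2:-debian/patches}
    if [ -f "$patchdir/$vendor.series" ]; then
        echo "$patchdir/$vendor.series"
    elif [ -f "$patchdir/series" ]; then
        echo "$patchdir/series"
    fi
}

# On a Debian-based system the current vendor can come from dpkg-vendor:
#   pick_series "$(dpkg-vendor --query Vendor)"
```

Only one file name is ever printed, which mirrors the fact that the series files are never stacked.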

Latest features of dpkg-dev: debian packaging tools

I’m attending the mini-Debconf Paris and I just gave a talk about the latest improvements of dpkg-dev, the package providing the basic tools used to build Debian packages. “Latest” is a bit of a stretch since it covers the last 2-3 years of development.

My talk covered the following topics:

  • Support of symbols files by dpkg-shlibdeps, dpkg-gensymbols
  • Support of new source formats by dpkg-source
  • Supplementary options for dpkg-source
  • Cross distribution collaboration with dpkg-vendor
  • Custom compilation flags with dpkg-buildflags
  • Miscellaneous improvements to other tools

The slides are relatively verbose so that you can understand them even if you did not attend the talk. Click here to get the slides.

What Debian & Ubuntu topics would you like to read about?

After having looked back at the first months of this blog, I also want to look forward and see how I can improve its content. If you’re a Debian/Ubuntu user and/or contributor, I want this blog to be a truly useful resource for you. What kind of articles would you like me to write?

I have lots of ideas but I can’t do everything. I’ll share some of them so that you can discuss them:

  • New in Debian testing: a regular column covering changes affecting testing users.
  • Short presentations of software available in Debian/Ubuntu (like debaday.debian.net used to do).
  • Articles covering wishlist bugs on developers-reference so that they can be easily reused to improve the documentation!
  • Interviews of Debian contributors.
  • Description of small tasks that one can do to start contributing.

Please discuss and share your ideas in the comments. Don’t limit yourself to the above list; you know better than me what you need. Tell me what kind of documentation was lacking in your daily usage of Debian/Ubuntu, or what could have been better explained while you tried to contribute to Debian/Ubuntu.

While I set no limits on Debian/Ubuntu topics that I accept to cover, my main focus is around documentation for end-users and/or contributors.

If you prefer you can also send your feedback with Identi.ca, Twitter or leave a comment in the entry for this article in my facebook page.

Secret figures of a Debian/Ubuntu blogger: what you liked most on raphaelhertzog.com

I launched raphaelhertzog.com this summer (taking over the English content of my former multi-lingual blog), when I decided that I would be more serious about blogging on Debian/Ubuntu related topics. In September, I decided to write 2 articles per week and so far I have managed to keep the schedule.

Two of my articles were published by Linux Weekly News; those are much more researched than the average blog article (they are tagged with [LWN] in the lists below).

The most popular articles

Most people read my blog through the RSS feed, which happens to be syndicated on Planet Debian and Planet Ubuntu. According to FeedBurner’s statistics, the top-5 articles are:

  1. 5 reasons why I still contribute to Debian after 12 years (32700 views)
  2. [LWN] Understanding Membership Structures in Debian and Ubuntu (31700 views)
  3. Social Micropayment Can Foster Free Software, Discover Flattr (30100 views)
  4. Everything you need to know about conffiles: configuration files managed by dpkg (29900 views)
  5. How to make 110.28 EUR in one month with free software and Flattr (29400 views)

But I also have occasional readers visiting my blog because my articles are announced on Identi.ca, Twitter and Facebook (and they circulate on social networks, thanks to those who are sharing them!). The top-5 articles according to the statistics of my website are:

  1. 5 reasons why I still contribute to Debian after 12 years (6000 views)
  2. [LWN] Can Debian offer a Constantly Usable Testing distribution? (5000 views)
  3. Understanding Debian’s release process (1500 views)
  4. Flattr FOSS (1400 views, not an article but I regularly blog about this project)
  5. Can Debian achieve world domination without being on Facebook? (1100 views)

The most flattered

Since I am using Flattr on my blog, it can be interesting to see the articles which generated lots of flattr micro-donations. The top-3 articles are my articles about Flattr (1, 2, 3). Excluding articles related to Flattr, the top-5 is:

  1. 5 reasons why I still contribute to Debian after 12 years (12 flattr)
  2. The secret plan behind the “3.0 (quilt)” Debian source package format (10 flattr)
  3. How to use multiple upstream tarballs in Debian source packages? (5 flattr)
  4. [LWN] Understanding Membership Structures in Debian and Ubuntu (4 flattr)
  5. Do You Want a Free Debian Book? Read on. (4 flattr)

Most articles get 2 to 3 flattr clicks.

The most commented

I usually get 4-5 comments on most articles but some generate much more feedback:

  1. [LWN] Can Debian offer a Constantly Usable Testing distribution? (40 comments)
  2. 5 reasons why I still contribute to Debian after 12 years (22 comments)
  3. Can Debian achieve world domination without being on Facebook? (15 comments)
  4. How to generate different dependencies on Debian and Ubuntu with a common source package (14 comments)
  5. [LWN] Understanding Membership Structures in Debian and Ubuntu (12 comments)

Factoids

Here are my conclusions based on the above figures:

  • Writing about your Debian/Ubuntu work and your long term involvement makes for highly popular content that spreads well.
  • In-depth and well researched articles (like those written for LWN) do not generate more flattr revenues than the average article even if they take 4 to 8 times as long to write.
  • People are more likely to flattr you for your free software contribution than for the value they get out of your article.
  • People care a lot about the Debian release process, and like to discuss the topic.

Correctly renaming a conffile in Debian package maintainer scripts

After having dealt with the removal of obsolete conffiles, I’ll now explain what you should do when a configuration file managed by dpkg must be renamed.

The problem

Let’s suppose that version 1.2 of the software stopped providing /etc/foo.conf. Instead it provides /etc/bar.conf because the configuration file got renamed. If you do nothing special, the new conffile will be installed with the default configuration, and the old one will stay around. Any customizations made by the administrator are lost in the process (strictly speaking they are not lost: they are still in foo.conf, but they are unused).

Of course, you could do mv /etc/foo.conf /etc/bar.conf in the pre-installation script. But that’s not satisfactory: it will generate a spurious conffile prompt that the end-user will not understand.

The solution

In the preinst script, you have to verify if the old conffile has been modified by the administrator. If yes, you want to keep the file around. Otherwise you know you will be able to ditch it once the upgrade is over, and you rename it to /etc/foo.conf.dpkg-remove to remember this fact.

In the postinst script, you remove /etc/foo.conf.dpkg-remove. If the old conffile (/etc/foo.conf) still exists, it’s because it was modified by the administrator. You make a backup of the new conffile in /etc/bar.conf.dpkg-dist and rename the old one into /etc/bar.conf.

In the postrm, when called to abort an upgrade, you move /etc/foo.conf.dpkg-remove back to its original name.
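
Written by hand, the preinst and postinst steps above could look like the sketch below. The function names are mine, the version check and the postrm rollback are left out to keep it short, and the checksum lookup assumes the “path checksum” lines that dpkg-query prints for the Conffiles field.

```shell
# Look up the checksum dpkg recorded for a conffile. The input is the output
# of: dpkg-query -W -f='${Conffiles}\n' <package>
recorded_md5() {
    awk -v file="$1" '$1 == file { print $2 }'
}

# preinst step (sketch): flag an unmodified old conffile for removal.
# $1 = old conffile, $2 = checksum recorded by dpkg
prepare_rename() {
    [ -e "$1" ] || return 0
    current=$(md5sum "$1" | cut -d' ' -f1)
    if [ "$current" = "$2" ]; then
        mv -f "$1" "$1.dpkg-remove"
    fi
}

# postinst step (sketch): drop the flagged file, or carry local changes over.
# $1 = old conffile, $2 = new conffile
finish_rename() {
    rm -f "$1.dpkg-remove"
    if [ -e "$1" ]; then
        # The old file survived the preinst, so it holds local changes.
        if [ -e "$2" ]; then
            mv -f "$2" "$2.dpkg-dist"
        fi
        mv -f "$1" "$2"
    fi
}
```

In a real preinst, the second argument of prepare_rename would come from piping the dpkg-query output into recorded_md5.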

In practice, use dpkg-maintscript-helper

dpkg-maintscript-helper can automate all those tasks. You just have to put the following snippet in the maintainer scripts (postinst, postrm, preinst):

if dpkg-maintscript-helper supports mv_conffile 2>/dev/null; then
    dpkg-maintscript-helper mv_conffile /etc/foo.conf /etc/bar.conf 1.1-3 -- "$@"
fi

In this example, I assumed that version 1.1-3 was the last version of the package that contained /etc/foo.conf (i.e. the last version released before 1.2-1 was packaged).

You can avoid the preliminary test if you pre-depend on “dpkg (>= 1.15.7.2)” or if enough time has passed to assume that everybody has a newer version anyway. You can learn all the details in dpkg-maintscript-helper’s manual page.

The right way to remove an obsolete conffile in a Debian package

A conffile is a configuration file managed by dpkg; I’m sure you remember the introductory article about conffiles. When your package stops providing a conffile, the file stays on disk and is recorded as obsolete by the package manager. It’s only removed during purge. If you want the file to go away, you have to remove it yourself within your package’s maintainer scripts. You will now learn how to do this right.

When is that needed?

dpkg errs on the side of safety by not removing the file until purge but in most cases it’s best to remove it sooner so as to not confuse the user. In some cases, it’s even required because keeping the file could break the software (for example if the file is in a .d configuration directory, and if it contains directives that are either no longer supported by the new version or in conflict with other new configuration files).

What’s complicated in “rm”?

So you want to remove the conffile. Adding an “rm” command in debian/postinst sounds easy. Except it’s not the right thing to do. The conffile might contain customizations made by the administrator and you don’t want to wipe those. Instead you want to keep the file around so that they can get their changes back and do whatever is required with them.

The correct action is thus to move the file away in the preinst, to ensure it doesn’t disturb the new version. At the same time, you need to verify whether the conffile has been modified by the administrator and remember it for later. In the postinst, you need to remove the file if it’s unmodified, or keep it under a different name that doesn’t interfere with the software. In many cases adding a simple .dpkg-bak suffix is enough. For instance, run-parts ignores files that contain a dot, and many other programs are configured to only include files with a certain extension, say *.conf. In the postrm, you have to remove the obsolete conffiles that were kept due to local changes, and you should also restore the original conffile in case the upgrade obsoleting the conffile is aborted.
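
Done by hand, those steps could be sketched as follows. The function names and the intermediate .dpkg-remove and .dpkg-backup suffixes are choices I made for this illustration, and the checksum recorded by dpkg is passed in as an argument rather than looked up.

```shell
# preinst step (sketch): move the obsolete conffile out of the way and
# remember, through the suffix, whether it carried local changes.
# $1 = conffile, $2 = checksum recorded by dpkg for the pristine file
prepare_removal() {
    [ -e "$1" ] || return 0
    current=$(md5sum "$1" | cut -d' ' -f1)
    if [ "$current" = "$2" ]; then
        mv -f "$1" "$1.dpkg-remove"     # pristine: safe to delete later
    else
        mv -f "$1" "$1.dpkg-backup"     # customized: keep for the administrator
    fi
}

# postinst step (sketch): delete pristine files, keep customized ones as .dpkg-bak.
finish_removal() {
    rm -f "$1.dpkg-remove"
    if [ -e "$1.dpkg-backup" ]; then
        mv -f "$1.dpkg-backup" "$1.dpkg-bak"
    fi
}

# postrm abort-upgrade step (sketch): put the conffile back under its name.
abort_removal() {
    for suffix in dpkg-remove dpkg-backup; do
        if [ -e "$1.$suffix" ]; then
            mv -f "$1.$suffix" "$1"
        fi
    done
}
```

The suffix chosen in the preinst is the only state carried over to the later scripts, which is why the modification check has to happen there.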

Automating everything with dpkg-maintscript-helper

Phewww… that’s a lot of things to do for a seemingly simple task. Fortunately everything can be automated with dpkg-maintscript-helper. Let’s assume you want to remove /etc/foo/conf.d/bar because it’s obsolete and you’re going to prepare a new version 1.2-1 with the appropriate code to remove the file on upgrade. You just have to put this snippet in the 3 relevant scripts (preinst, postinst, postrm):

if dpkg-maintscript-helper supports rm_conffile 2>/dev/null; then
    dpkg-maintscript-helper rm_conffile /etc/foo/conf.d/bar 1.2-1 -- "$@"
fi

You can avoid the preliminary test if you pre-depend on “dpkg (>= 1.15.7.2)” or if enough time has passed to assume that everybody has a newer version anyway. You can learn all the details in dpkg-maintscript-helper’s manual page.
