Debian Cleanup Tip #1: Get rid of useless configuration files

If you like to keep your place clean, you probably want to do the same with your computer. I’m going to show you a few tips over the next 4 weeks so that you can keep your Debian/Ubuntu system free of dust!

Over time the set of packages that is installed on your system changes, either because you install and remove stuff, or because the distribution evolved (and you upgraded your system to the latest version).

But the Debian packaging system is designed to keep configuration files when a package is removed. That way if you reinstall it, you won’t have to redo the configuration. That’s a nice feature but what if you will never reinstall those packages?

Then those configuration files become clutter that you would rather get rid of. In some cases, those files lying around might have unwanted side-effects (recent example: it can block the switch to a dependency-based boot sequence because obsolete init scripts without the required dependencies are still present).

The solution is to “purge” all packages which are in the “config-files” state. With aptitude you can do aptitude purge ~c (or aptitude purge ?config-files). Replace “purge” by “search” if you only want to see a list of the affected packages.

If you want a machine-friendly list of the packages in that state, you could use one of those commands (and then pass the result to apt-get if you don’t have aptitude available):

$ grep-status -n -sPackage -FStatus config-files
[...]
$ dpkg-query -f '${Package} ${Status}\n' -W | grep config-files$ | cut -d" " -f1
[...]

Note that grep-status is part of the dctrl-tools package.

Of course you can also use graphical package managers, like Synaptic. Click on the “Status” button on the bottom left, then on “Not installed (residual config)” and you have a list of packages that you can purge. You can select them all, right click and pick “Mark for Complete Removal”. See the screenshot below. The last step is to click on “Apply” to get the packages purged.

Synaptic purging residul config files

Do you want to read more tutorials like this one? Click here to subscribe to my free newsletter, you can opt to receive future articles by email.

3 ways to not clutter your Debian source package with autogenerated files

It’s quite common that the upstream build system generates/updates some files but does not clean them up properly when you call make clean. In that case, when you rebuild the package a second time in the same tree, the generated Debian source package will contain those changes.

You usually don’t want those changes. They make your package harder to review because they contain unneeded modifications (either directly in the .diff.gz with the old source format, or in a new patch in debian/patches/debian-changes-<ver> with the “3.0 (quilt)” source format).

I’ll show you 3 ways to avoid this problem. They are all workarounds, the proper fix would be to improve the upstream build system to really clean up the generated files. This is usually possible for files that are “created”, but it’s much more cumbersome for files that are “updated” (you would have to keep a backup of the original file so that you can restore it).

The traditional fix

Instead of relying on the upstream build system to do the work, we modify the clean target in debian/rules to remove the files that are left-over. Since “debian/rules clean” is always called before a source package is built, those generated files are not included as changes compared to what upstream provided.

A common work-around: always build from a clean state

As you have noted, the problem only happens when you build (source and binaries) twice in a row in the same tree. Some VCS-helper tools always build the Debian package in a temporary tree which is exported from the VCS. This is the case of svn-buildpackage by default and of git-buildpackage if you use its --git-export-dir option.

I don’t like this solution because it solves the problem only for the maintainer. Anyone else who is working on top of the package without using the same VCS-helper tool would be affected by the problem.

A new way to avoid the problem

Since it’s now possible to store dpkg-source options in the source package itself, we can conveniently have everybody use the --extend-diff-ignore option. It tells dpkg-source to ignore some files when checking whether we have made changes to upstream files.

For example if you want to ignore changes made on the files “config.sub”, “config.guess” and “Makefile” you could put this in debian/source/options:

# Don't store changes on autogenerated files
extend-diff-ignore = "(^|/)(config\.sub|config\.guess|Makefile)$"

You need to know a bit about Perl regular expressions since that’s what is used by dpkg-source to match the filenames to exclude.

Note that this approach always works, even when you can’t remove the file. So it saves you having to make a backup of the unmodified file just to be able to restore it before the next build.

Found it useful? Be sure to not miss other packaging tips (or lessons), click here to subscribe to my free newsletter and get new articles by email.

People behind Debian: Michael Vogt, synaptic and APT developer

Michael and his daughter Marie

Michael has been around for more than 10 years and has always contributed to the APT software family. He’s the author of the first real graphical interface to APT—synaptic. Since then he created “software-center” as part of his work for Ubuntu. Being the most experienced APT developer, he’s naturally the coordinator of the APT team. Check out what he has to say about APT’s possible evolutions.

My questions are in bold, the rest is by Michael.

Who are you?

My name is Michael Vogt, I’m married and have two little daughters. We live in Germany (near to Trier) and I work for Canonical as a software developer. I joined Debian as a developer in early 2000 and started to contribute to Ubuntu in 2004.

What’s your biggest achievement within Debian or Ubuntu?

I can not decide on a single one so I will just be a bit verbose.

From the very beginning I was interested in improving the package manager experience and the UI on top for our users. I’m proud of the work I did with synaptic. It was one of the earliest UIs on top of apt. Because of my work on synaptic I got into apt development as well and fixed bugs there and added new features. I still do most of the uploads here, but nowadays David Kalnischkies is the most active developer.

I also wrote a bunch of tools like gdebi, update-notifier, update-manager, unattended-upgrade and software-properties to make the update/install situation for the user easier to deal with. Most of the tools are written in python so I added a lot of improvements to python-apt along the way, including the initial high level “apt” interface and a bunch of missing low-level apt_pkg features. Julian Andres Klode made a big push in this area recently and thanks to his effort the bindings are fully complete now and have good documentation.

My most recent project is software-center. Its aim is to provide a UI strongly targeted for end-users. The goal of this project is to make finding and installing software easy and beautiful. We have a fantastic collection of software to offer and software-center tries to present it well (including screenshots, instant search results and soon ratings&reviews). This builds on great foundations like aptdaemon by Sebastian Heinlein, screenshots.debian.net by Christoph Haas, ddtp.debian.org by Michael Bramer, apt-xapian-index by Enrico Zini and many others (this is what I love about free software, it usually “adds”, rarely “takes away”).

What are your plans for Debian Wheezy?

For apt I would love to see a more plugable architecture for the acquire system. It would be nice to be able to make apt-get update (and the frontends that use this from libapt) be able to download additional data (like debtags or additional index file that contains more end-user targeted information). I also want to add some scripts so that apt (optionally) creates btrfs snapshots on upgrade and provide some easy way to rollback in case of problems.

There is also some interesting work going on around making the apt problem resolver a more plugable part. This way we should be able to do much faster development.

software-center will get ratings&reviews in the upstream branch, I really hope we can get that into Wheezy.

If you could spend all your time on Debian, what would you work on?

In that case I would start with a refactor of apt to make it more robust about ABI breaks. It would be possible to move much faster once this problem is solved (its not even hard, it just need to be done). Then I would add a more complete testsuite.

Another important problem to tackle is to make maintainer scripts more declarative. I triaged a lot of upgrade bug reports (mostly in ubuntu though) and a lot of them are caused by maintainer script failures. Worse is that depending on the error its really hard for the user to solve the problem. There is also a lot of code duplication. Having a central place that contains well tested code to do these jobs would be more robust. Triggers help us a lot here already, but I think there is still more room for improvement.

What’s the biggest problem of Debian?

That’s a hard question :) I mostly like Debian the way it is. What frustrated me in the past were flamewars that could have been avoided. To me being respectful to each other is important, I don’t like flames and insults because I like solving problems and fighting like this rarely helps that. The other attitude I don’t like is to blame people and complain instead of trying to help and be positive (the difference between “it sucks because it does not support $foo” instead of “it would be so helpful if we had $foo because it enables me to let me do $bar”).

For a long time, I had the feeling you were mostly alone working on APT and were just ensuring that it keeps working. Did you also had this feeling and are things better nowadays ?

I felt a bit alone sometimes :) That being said, there were great people like Eugene V. Lyubimkin and Otavio Salvador during my time who did do a lot of good work (especially at release crunch times) and helped me with the maintenance (but got interested in other area than apt later). And now we have the unstoppable David Kalnischkies and Julian Andres Klode.

Apt is too big for a single person, so I’m very happy that especially David is doing superb work on the day-to-day tasks and fixes (plus big project like multiarch and the important but not very thankful testsuite work). We talk about apt stuff almost daily, doing code reviews and discuss bugs. This makes the development process much more fun and healthy. Julian Andres Klode is doing interesting work around making the resolver more plugable and Christian Perrier is as tireless as always when it comes to the translations merging.

I did a quick grep over the bzr log output (including all branch merges) and count around ~4300 total commits (including all revisions of branches merged). Of that there ~950 commits from me plus an additional ~500 merges. It was more than just ensuring that it keeps working but I can see where this feeling comes from as I was never very verbose. Apt also was never my “only” project, I am involved in other upstream work like synaptic or update-manager or python-apt etc). This naturally reduced the time available to hack on apt and spend time doing the important day-to-day bug triage, response to mailing list messages etc.

One the python-apt side Julian Andres Klode did great work to improve the code and the documentation. It’s a really nice interface and if you need to do anything related to packages and love python I encourage you to try it. Its as simple as:

import apt
cache = apt.Cache()
cache["update-manager"].mark_install()
cache.commit()

Of course you can do much more with it (update-manager, software-center and lots of more tools use it). With “pydoc apt” you can get a good overview.

The apt team always welcomes contributors. We have a mailing list and a irc channel and it’s a great opportunity to solve real world problems. It does not matter if you want to help triage bugs or write documentation or write code, we welcome all contributors.

You’re also an Ubuntu developer employed by Canonical. Are you satisfied with the level of cooperation between both projects? What can we do to get Ubuntu to package new applications developed by Canonical directly in Debian?

Again a tricky question :) When it comes to cooperation there is always room for improvement. I think (with my Canonical hat on) we do a lot better than we did in the past. And it’s great to see the current DPL coming to Ubuntu events and talking about ways to improve the collaboration. One area that I feel that Debian would benefit is to be more positive about NMUs and shared source repositories (collab-maint and LowThresholdNmu are good steps here). The lower the cost is to push a patch/fix (e.g. via direct commit or upload) the more there will be.

When it comes to getting packages into Debian I think the best solution is to have a person in Debian as a point of contact to help with that. Usually the amount of work is pretty small as the software will have a debian/* dir already with useful stuff in it. But it helps me a lot to have someone doing the Debian uploads, responding to the bugmail etc (even if the bugmail is just forwarded as upstream bugreports :) IMO it is a great opportunity especially for new packagers as they will not have to do a lot of packaging work to get those apps into Debian. This model works very well for me for e.g. gdebi (where Luca Falavigna is really helpful on the Debian side).

Is there someone in Debian that you admire for his contributions?

There are many people I admire. Probably too many to mention them all. I always find it hard to single out individual people because the project as a whole can be so proud of their achievements.

The first name that comes to my mind is Jason Gunthorpe (the original apt author) who I’ve never met. The next is Daniel Burrows who I met and was inspired by. David Kalnischkies is doing great work on apt. From contributing his first (small) patch to being able to virtually fix any problem and adding big features like multiarch support in about a year. Sebastian Heinlein for aptdaemon.

Christian Perrier has always be one of my heroes because he cares so much about i18n. Christoph Haas for screenshots.debian.net, Michael Bramer for his work on debian translated package descriptions.


Thank you to Michael for the time spent answering my questions. I hope you enjoyed reading his answers as I did. Subscribe to my newsletter to get my monthly summary of the Debian/Ubuntu news and to not miss further interviews. You can also follow along on Identi.ca, Twitter and Facebook.

5 reasons why Debian Unstable does not deserve its name

Debian Unstable (also known as sid) is one of the 3 distributions that Debian provides (along with Stable and Testing).

It’s not conceived as a product for end-users, instead it’s the place where contributors are uploading newer packages. Daily. Yes that means that Unstable is a quickly moving target and it’s not for everybody. But you can use it and your computer won’t explode.

1. It contains mainly stable versions of the software

Yes, you read it right. Unstable is not full of development versions of the various software. It happens on some software but then it’s usually a conscious decision of the maintainer who believes that this specific version is already better than the previous one.

The packages in sid are supposed to migrate to testing, the place where the next Debian stable release is prepared. So maintainers are advised to only upload stuff that is of release quality, the rest should be uploaded to experimental instead.

2. It doesn’t break badly every other day

Breakages happen but they are not a big deal usually. It has been long time since I could not reboot my computer after an upgrade or since the graphical interface was no longer working. The kind of breakages that you have is that one software stops working, or triggers an annoying bug, or that a few packages are uninstallable.

In most cases, you can save yourself by downgrading to the version available in Testing. Or by finding a work-around in the bug tracking system. Or by not upgrading because you have apt-listbugs installed and you have been warned about the problem.

3. It’s the basis of other distributions

If Debian Unstable was really so bad, it would not be a good basis to build a derivative distribution, isn’t it? But Ubuntu and SiduxAptosid (to name only two) are based on Debian Sid.

4. It’s not inherently less secure than Stable or Testing

High impact security vulnerabilities will usually be quickly fixed in Stable and Unstable. The stable upload is done by the security team while the unstable one is made by the maintainer. Testing will usually get the fix through the package uploaded to Unstable, so testing users get security updates with a delay.

For less serious vulnerabilities, it’s entirely possible that stable does not get any update at all. In that case, unstable/testing users are better served since they will get the fix with the next upstream version anyway.

Of course, it happens that maintainers are busy or that something falls through the cracks, but there are other people watching RC bugs who will fix this if the maintainer doesn’t react at all.

5. I use it on my main computer

And many other people do the same. And you can do the same if you meet the criteria below:

  • you can work on the command-line (enough to downgrade a problematic package, to edit configuration files, etc.);
  • you know how to work with APT and multiple distributions in /etc/apt/sources.list;
  • you are able to read/write English so that you can read/file bug reports when needed;
  • you have another computer connected to the Internet that you can use to lookup documentation (or the bug tracking system, or the support mailing lists) when your usual computer is off-line for a reason that you don’t understand.

If you feel you are not ready for the jump, click here to subscribe to this blog (or here via the RSS feed), I’ll surely teach some of the required skills in future articles.

PS: All that said, if you have a working sid installation, do not upgrade it just before an important presentation, or before a trip. It will always break at the most annoying time. Unless you like to live dangerously, of course.

Avoid a newbie packager mistake: don’t build your Debian packages with dpkg -b

In the last years, I have seen many people try to use dpkg --build to create Debian packages. Indeed, if you look up dpkg’s and dpkg-deb‘s manual pages, this option seems to be what you have to use:

-b, --build directory [archive|directory]

Creates a debian archive from the filesystem tree stored in directory. directory must have a DEBIAN subdirectory, which contains the control information files such as the control file itself. This directory will not appear in the binary package’s filesystem archive, but instead the files in it will be put in the binary package’s control information area.

And indeed, dpkg-deb is what ultimately creates the .deb files (aka binary packages). But it’s a low-level tool that you should not call yourself. If you want to properly package a new software, you should rather create a Debian source package that will transform upstream source code into policy-compliant binary packages.

Creating a source package also involves preparing a directory tree (but with a “debian” sub-directory), it’s probably more complicated than calling dpkg -b on a manually crafted directory. But the result is much more versatile: the tools used bring value by dynamically analyzing/modifying the files within your package (for example, the dependencies on C libraries that your package needs are automatically inserted).

If this is news to you, you might want to check out the New Maintainers’ Guide and the Debian Policy.

Found it useful? Be sure to not miss other packaging tips (or lessons), click here to subscribe to my free newsletter and get new articles by email.

Howto to rebuild Debian packages

Being able to rebuild an existing Debian package is a very useful skill. It’s a prerequisite for many tasks that an admin might want to perform at some point: enable a feature that is disabled in the official Debian package, rebuild a source package for another suite (for example build a Debian Testing package for use on Debian Stable, we call that backporting), include a bug fix that upstream developers prepared, etc. Discover the 4 steps to rebuild a Debian package.

1. Download the source package

The preferred way to download source packages is to use APT. It can download them from the source repositories that you have configured in /etc/apt/sources.list, for example:

deb-src http://ftp.debian.org/debian unstable main contrib non-free
deb-src http://ftp.debian.org/debian testing main contrib non-free
deb-src http://ftp.debian.org/debian stable main contrib non-free

Note that the lines start with “deb-src” instead of the usual “deb”. This tells APT that we are interested in the source packages and not in the binary packages.

After an apt-get update you can use apt-get source publican to retrieve the latest version of the source package “publican”. You can also indicate the distribution where the source package must be fetched with the syntax “package/distribution“. apt-get source publican/testing will grab the source package publican in the testing distribution and extract it in the current directory (with dpkg-source -x, thus you need to have installed the dpkg-dev package).

$ apt-get source publican/testing
Reading package lists... Done
Building dependency tree       
Reading state information... Done
NOTICE: 'publican' packaging is maintained in the 'Git' version control system at:
git://git.debian.org/collab-maint/publican.git
Need to get 727 kB of source archives.
Get:1 http://nas/debian/ squeeze/main publican 2.1-2 (dsc) [2253 B]
Get:2 http://nas/debian/ squeeze/main publican 2.1-2 (tar) [720 kB]
Get:3 http://nas/debian/ squeeze/main publican 2.1-2 (diff) [4728 B]
Fetched 727 kB in 0s (2970 kB/s)  
dpkg-source: info: extracting publican in publican-2.1
dpkg-source: info: unpacking publican_2.1.orig.tar.gz
dpkg-source: info: unpacking publican_2.1-2.debian.tar.gz
$ ls -dF publican*
publican-2.1/                 publican_2.1-2.dsc
publican_2.1-2.debian.tar.gz  publican_2.1.orig.tar.gz

If you don’t want to use APT, or if the source package is not hosted in an APT source repository, you can download a complete source package with dget -u dsc-url where dsc-url is the URL of the .dsc file representing the source package. dget is provided by the devscripts package. Note that the -u option means that the origin of the source package is not verified before extraction.

2. Install the build-dependencies

Again APT can do the grunt work for you, you just have to use apt-get build-dep foo to install the build-dependencies for the last version of the source package foo. It supports the same syntactic sugar than apt-get source so that you can run apt-get build-dep publican/testing to install the build-dependencies required to build the testing version of the publican source package.

If you can’t use APT for this, enter the directory where the source package has been unpacked and run dpkg-checkbuilddeps. It will spit out a list of unmet build dependencies (if there are any, otherwise it will print nothing and you can go ahead safely). With a bit of copy and paste and a “apt-get install” invocation, you’ll install the required packages in a few seconds.

3. Do whatever changes you need

I won’t detail this step since it depends on your specific goal with the rebuild. You might have to edit debian/rules, or to apply a patch.

But one thing is sure, if you have made any change or have recompiled the package in a different environment, you should really change its version number. You can do this with “dch --local foo” (again from the devscripts package), replace “foo” by a short name identifying you as the supplier of the updated version. It will update debian/changelog and invite you to write a small entry documenting your change.

4. Build the package

The last step is also the simplest one now that everything is in place. You must be in the directory of the unpacked source package.
Now run either “debuild -us -uc” (recommended, requires the devscripts package) or directly “dpkg-buildpackage -us -uc”. The “-us -uc” options avoid the signature step in the build process that would generate a (harmless) failure at the end if you have no GPG key matching the name entered in the top entry of the Debian changelog.

$ cd publican-2.1
$ debuild -us -uc
 dpkg-buildpackage -rfakeroot -D -us -uc
dpkg-buildpackage: export CFLAGS from dpkg-buildflags (origin: vendor): -g -O2
dpkg-buildpackage: export CPPFLAGS from dpkg-buildflags (origin: vendor): 
dpkg-buildpackage: export CXXFLAGS from dpkg-buildflags (origin: vendor): -g -O2
dpkg-buildpackage: export FFLAGS from dpkg-buildflags (origin: vendor): -g -O2
dpkg-buildpackage: export LDFLAGS from dpkg-buildflags (origin: vendor): 
dpkg-buildpackage: source package publican
dpkg-buildpackage: source version 2.1-2rh1
dpkg-buildpackage: source changed by Raphaël Hertzog 
 dpkg-source --before-build publican-2.1
dpkg-buildpackage: host architecture i386
[...]
dpkg-deb: building package `publican' in `../publican_2.1-2rh1_all.deb'.
 dpkg-genchanges  >../publican_2.1-2rh1_i386.changes
dpkg-genchanges: not including original source code in upload
 dpkg-source --after-build publican-2.1
dpkg-buildpackage: binary and diff upload (original source NOT included)
Now running lintian...
Finished running lintian.

The build is over, the updated source and binary packages have been generated in the parent directory.

$ cd ..
$ ls -dF publican*
publican-2.1/                    publican_2.1-2rh1.dsc
publican_2.1-2.debian.tar.gz     publican_2.1-2rh1_i386.changes
publican_2.1-2.dsc               publican_2.1-2rh1_source.changes
publican_2.1-2rh1_all.deb        publican_2.1.orig.tar.gz
publican_2.1-2rh1.debian.tar.gz

Do you want to read more tutorials like this one? Click here to subscribe to my free newsletter, you can opt to receive future articles by email.

State of the Debian-Ubuntu relationship

Debian welcoming contributions from derivatives

The relationship between Debian and Ubuntu has been the subject of many vigorous debates over the years, ever since Ubuntu’s launch in 2004. Six years later, the situation has improved and both projects are communicating better. The Natty Narwhal Ubuntu Developer Summit (UDS) featured—like all UDS for more than 2 years—a Debian Health Check session where current cooperation issues and projects are discussed. A few days after that session, Lucas Nussbaum gave a talk during the mini-Debconf Paris detailing the relationship between both projects, both at the technical and social level. He also shared some concerns for Debian’s future and gave his point of view on how Debian should address them. Both events give valuable insights on the current state of the relationship.

Lucas Nussbaum’s Debian-Ubuntu talk

Lucas started by introducing himself. He’s an Ubuntu developer since 2006 and a Debian developer since 2007. He has worked to improve the collaboration between both projects, notably by extending the Debian infrastructure to show Ubuntu-related information. He attended conferences for both projects (Debconf, UDS) and has friends in both communities. For all of these reasons, he believes himself to be qualified to speak on this topic.

Collaboration at the technical level

He then quickly explained the task of a distribution: taking upstream software, integrating it in standardized ways, doing quality assurance on the whole, delivering the result to users, and assuring some support afterward. He pointed out that in the case of Ubuntu, the distribution has one special upstream: Debian.

Indeed Ubuntu gets most of its software from Debian (89%), and only 7% are new packages coming from other upstream projects (the remaining 4% are unknown, they are newer upstream releases of software available in Debian but he was not able to find out whether the Debian packaging had been reused or not). From all the packages imported from Debian, 17% have Ubuntu-specific changes. The reasons for those changes are varied: bugfixes, integration with Launchpad/Ubuntu One/etc., or toolchain changes. The above figures are based on Ubuntu Lucid (10.04) while excluding many Ubuntu-specific packages (language-pack-*, language-support-*, kde-l10n-*, *ubuntu*, *launchpad*).

The different agendas and the differences in philosophy (Debian often seeking perfect solutions to problems; Ubuntu accepting temporary suboptimal workarounds) also explain why so many packages are modified on the Ubuntu side. It’s simply not possible to always do the work in Debian first. But keeping changes in Ubuntu requires a lot of work since they merge with Debian unstable every 6 months. That’s why they have a strong incentive to push changes to upstream and/or to Debian.

There are 3 channels that Ubuntu uses to push changes to Debian: they file bug reports (between 250 to 400 during each Ubuntu release cycle), they interact directly with Debian maintainers (often the case when there’s a maintenance team), or they do nothing and hope that the Debian maintainer will pick up the patch directly from the Debian Package Tracking System (it relays information provided by patches.ubuntu.com).

Lucas pointed out that those changes are not the only thing that Debian should take back. Ubuntu has a huge user base resulting in lots of bug reports sitting in Launchpad, often without anyone taking care of them. Debian maintainers who already have enough bugs on their packages are obviously not interested in even more bugs, but those who are maintaining niche packages, with few reports, might be interested by the user feedback available in Launchpad. Even if some of the reports are Ubuntu-specific, many of them are advance warnings of problems that will affect Debian later on, when the toolchain catches up with Ubuntu’s aggressive updates. To make this easier for Debian maintainers, Lucas improved the Debian Package Tracking System so that they can easily get Ubuntu bug reports for their packages even without interacting with Launchpad.

Human feelings on both sides

Lucas witnessed a big evolution in the perception of Ubuntu on the Debian side. The initial climate was rather negative: there were feelings of its work being stolen, claims of giving back that did not match the observations of the Debian maintainers, and problems with specific Canonical employees that reflected badly on Ubuntu as a whole. These days most Debian developers find something positive in Ubuntu: it brings a lot of new users to Linux, it provides something that works for their friends and family, it brings new developers to Debian, and it serves as a technological playground for Debian.

On the Ubuntu side, the culture has changed as well. Debian is no longer so scary for Ubuntu contributors and contributing to Debian is The Right Thing to do. More and more Ubuntu developers are getting involved in Debian as well. But at the package level there’s not always much to contribute, as many bugfixes are only temporary workarounds. And while Ubuntu’s community follows this philosophy, Canonical is a for-profit company that contributes back mainly when it has compelling reasons to do so.

Consequences for Debian

In Lucas’s eyes, the success of Ubuntu creates new problems. For many new users Linux is a synonym for Ubuntu, and since much innovation happens in Ubuntu first, Debian is overshadowed by its most popular derivative. He goes as far as saying that because of that “Debian becomes less relevant”.

He went on to say that Debian needs to be relevant because the project defends important values that Ubuntu does not. And it needs to stay as an independent partner that filters what comes out of Ubuntu, ensuring that quality prevails in the long term.

Fixing this problem is difficult, and the answer should not be to undermine Ubuntu. On the contrary, more cooperation is needed. If Debian developers are involved sooner in Ubuntu’s projects, Debian will automatically get more credit. And if Ubuntu does more work in Debian, their work can be showcased sooner in the Debian context as well.

The other solution that Lucas proposed is that Debian needs to communicate on why it’s better than Ubuntu. Debian might not be better for everybody but there are many reasons why one could prefer Debian over Ubuntu. He listed some of them: “Debian has better values” since it’s a volunteer-based project where decisions are made publicly and it has advocated the free software philosophy since 1993. On the other hand, Ubuntu is under control of Canonical where some decisions are imposed, it advocates some proprietary web services (Ubuntu One), the installer recommends adding proprietary software, and copyright assignments are required to contribute to Canonical projects.

Debian is also better in terms of quality because every package has a maintainer who is often an expert in the field of the package. As a derivative, Ubuntu does not have the resources to do the same and instead most packages are maintained on a best effort basis by a limited set of developers who can’t know everything about all packages.

In conclusion, Lucas explained that Debian can neither ignore Ubuntu nor fight it. Instead it should consider Ubuntu as “a chance” and should “leverage it to get back in the center of the FLOSS ecosystem”.

The Debian health check UDS session

While this session has existed for some time, it’s only the second time that a Debian Project Leader was present at UDS to discuss collaboration issues. During UDS-M (the previous summit), this increased involvement from Debian was a nice surprise to many. Stefano Zacchiroli—the Debian leader—collected and shared the feedback of Debian developers and the session ended up being very productive. Six months later is a good time to look back and verify if decisions made during UDS-M (see blueprint) have been followed through.

Progress has been made

On the Debian side, Stefano set up a Derivatives Front Desk so that derivative distributions (not just Ubuntu) have a clear point of contact when they are trying to cooperate but don’t know where to start. It’s also a good place to share experiences among the various derivatives. In parallel, a #debian-ubuntu channel has been started on OFTC (the IRC network used by Debian). With more than 50 regulars coming from both distributions, it’s a good place for quick queries when you need advice on how to interact with the distribution that you’re not familiar with.

Ubuntu has updated its documentation to prominently feature how to cooperate with Debian. For example, the sponsorship process documentation explains how to forward patches both to the upstream developers and to Debian. It also recommends ensuring that the patch is not Ubuntu-specific and gives some explanation on how to do it (which includes checking against a list of common packaging changes made by Ubuntu). The Debian Derivative Front Desk is mentioned as a fallback when the Debian maintainer is unresponsive.

While organizing Ubuntu Developer Week, Ubuntu now reaches out to Debian developers and tries to have sessions on “working with Debian”. Launchpad has also been extended to provide a list of bugs with attached patches and that information has been integrated in the Debian Package Tracking system by Lucas Nussbaum.

Still some work to do

Some of the work items have not been completed yet: many Debian maintainers would like a simpler way to issue a sync request (a process used to inject a package from Debian into Ubuntu). There’s a requestsync command line tool provided by the ubuntu-dev-tools package (which is available in Debian) but it’s not yet usable because Launchpad doesn’t know the GPG keys of Debian maintainers.

Another issue concerns packages which are first introduced in Ubuntu. Most of them have no reason to be Ubuntu-specific and should also end up in Debian. It has thus been suggested that people packaging new software for Ubuntu also upload them to Debian. They could however immediately file a request for adoption (RFA) to find another Debian maintainer if they don’t plan to maintain it in the long term. If Ubuntu doesn’t make this effort, it can take a long time until someone decides to reintegrate the Ubuntu package into Debian just because nobody knows about it. This represents an important shift in the Ubuntu process and it’s not certain that it’s going to work out. As with any important policy change, it can take several years until people are used to it.

Both issues have been rescheduled for this release cycle, so they’re still on the agenda.

This time the UDS session was probably less interesting than the previous one. Stefano explained once more what Debian considers good collaboration practices: teams with members from both distributions, and forwarding of bugs if they have been well triaged and are known to apply to Debian. He also invited Ubuntu to discuss big changes with Debian before implementing them.

An interesting suggestion that came up was that some Ubuntu developers could participate in Debcamp (one week hack-together before Debconf) to work with some Debian developers, go through Ubuntu patches, and merge the interesting bits. This would nicely complement Ubuntu’s increased presence at Debconf: for the first time, community management team member Jorge Castro was at DebConf 10 giving a talk on collaboration between Debian and Ubuntu.

There was also some brainstorming on how to identify packages where the collaboration is failing. A growing number of Ubuntu revisions (identified for example by a version like 1.0-1ubuntu62) could indicate that no synchronization was made with Debian, but it would also identify packages which are badly maintained on the Debian side. If Ubuntu consistently has a newer upstream version compared to Debian, it can also indicate a problem: maybe the person maintaining the package for Ubuntu would be better off doing the same work in Debian directly since the maintainer is lagging or not doing their work. Unfortunately this doesn’t hold true for all packages since many Gnome packages are newer in Ubuntu but are actively maintained on both sides.

Few of those discussions led to concrete decisions. It seems most proponents are reasonably satisfied with the current situation. Of course, one can always do better and Jono Bacon is going to ensure that all Canonical teams working on Ubuntu are aware of how to properly cooperate with Debian. The goal is to avoid heavy package modifications without coordination.

Conclusion

The Debian-Ubuntu relationships used to be a hot topic, but that’s no longer the case thanks to regular efforts made on both sides. Conflicts between individuals still happen, but there are multiple places where they can be reported and discussed (#debian-ubuntu channel, Derivatives Front Desk at derivatives@debian.org on the Debian side or debian@ubuntu.com on the Ubuntu side). Documentation and infrastructure are in place to make it easier for volunteers to do the right thing.

Despite all those process improvements, the best results still come out when people build personal relationships by discussing what they are doing. It often leads to tight cooperation, up to commit rights to the source repositories. Regular contacts help build a real sense of cooperation that no automated process can ever hope to achieve.

This article was first published in Linux Weekly News. You can get my monthly summary of the Debian/Ubuntu news, all you have to do is to click here to subscribe to my free newsletter.

Go2Linux interviewed me: the biggest problem of Debian

Guillermo Garron of Go2Linux enjoys a lot the “People behind Debian” interviews that I make. That’s why he interviewed me (with somewhat similar questions) and published the result on his blog.

Click here to read the full interview. I speak of my Debian projects, of the Ubuntu-Debian relationship, and more.

The question that I would like to highlight is “What’s the biggest problem of Debian?”. I answered this:

Our project identity is somewhat minimalistic. It evolves around the social contract and the Debian Free Software Guidelines. Both documents answer the question of what we’re doing but we lack a clear answer to the question of how we’re supposed to work towards our goals. It would be great if Debian could agree on some principles concerning topics like goal setting, collaboration, team work, politeness, respect. We could then advertise those and build on them while recruiting volunteers.

PS: The interview also hit LinuxToday.

How to find the right Debian packages: high-level search interface

The Debian archive is known to be one of the largest software collections available in the free software world. With more than 16,000 source packages and 30,000 binary packages, users sometimes have trouble finding packages that are relevant to them. Debian developer Enrico Zini has been working on infrastructure to solve this problem. During the recent mini-debconf Paris, Enrico gave a talk presenting what he has been working on in the last few years, which “hasn’t gotten yet the attention it deserves”.

Enrico is known in the Debian community for the introduction of debtags, a system used to classify all packages using facets. Each facet describes a specific kind of property: type of user-interface, programming language it’s written in, type of document manipulated, purpose of the software, etc. His most recent work builds on that. It is available in Debian and Ubuntu in the apt-xapian-index package. Its purpose is to allow advanced queries over the database of available packages.

Users of apt-xapian-index

He started by presenting some early users of the infrastructure. The most widely know is Ubuntu’s software center. Its search feature provides results almost instantly thanks to apt-xapian-index. But it is a very simple interface that doesn’t exploit many of the advanced features provided by the apt-xapian-index.

Another early adopter, making use of some more advanced features, is GoPlay!. It’s a graphical user interface to find games. It makes use of debtags to classify games so that you can browse, for example, all 3D action/arcade games related to cars. GoPlay has even been extended to be a more generic debtags based package browser and the package now also provides GoLearn!, GoAdmin!, GoNet!, GoOffice!, GoSafe!, and GoWeb!.

Fuss-launcher is an application launcher and not a package browser, but by using apt-xapian-index, it’s able to reuse information provided at the package level to make it easier to find installed applications. Package descriptions tend to be more verbose than those embedded in .desktop files. Enrico also showed another nice feature to the audience: if you drag a document onto its window, it will show you a list of applications that can open it.

Last but not least, apt-xapian-index provides a command line search tool that is vastly superior to the traditional apt-cache search: it’s axi-cache search (axi stands for apt-xapian-index). Enrico compared the output of a search on the letter “r”. While apt-cache spits out an infinite list of packages containing this letter somewhere in the description, axi-cache only listed packages related to GNU R. He also demonstrated the contextual tab completion. It makes it easy to use debtags and to refine your search. Once you have typed a first keyword, the tab-completion for the second one only contains keywords or debtags that are actually able to provide more restrictive results. Advanced queries with logical operations (AND, OR, NOT, XOR) are also supported.

Features of the backend

Enrico then dived into the internals. Xapian’s search engine is at the root of this infrastructure. He likes it because it’s a simple library (i.e. no daemon) and it has nice Python bindings. While apt-xapian-index’s core work is to index the descriptions of all the packages, it actually stores much more and can be easily extended with plugins (written in Python).

For instance, the information stored encompasses:

  • words appearing in the description of the packages (including the translated descriptions if the user uses a non-English locale);
  • their origin;
  • their section;
  • their size and installed size;
  • the time they have been first seen;
  • icons, categories, descriptions from the .desktop files they contain (through app-install-data);
  • aliases for names of some popular applications that are not available on Linux (for instance “excel” maps to the debtag office::spreadsheet).

He already has plans to store more: adding popularity contest data (see wishlist bugs #602180 and #602182) will make it possible to sort query results in a useful way. The most widely used applications are good choices when it comes to community support, and they are likely of better quality due to the larger user base. Adding timestamps of the last installation/upgrade/removal, will make it easier to pin-point a regression to a specific package update.

The generated index is world-readable and can be used from any application provided it can use the Xapian library—which is written in C++ but has bindings for Perl, Python, PHP, Java, Tcl, C#, and Ruby.

Call for experimentation

Enrico believes that many useful applications have yet to be invented on top of apt-xapian-index’s features. He’s calling for experimentation and asking for new ideas. The only practical limit that he has encountered is the size of the index: currently it varies between 50 Mb (Debian unstable without translation) and 70 Mb (Debian stable/testing/unstable with one translation). He would like it to not grow over 100 Mb since it’s installed by default (due to aptitude recommending it) and he’s not comfortable with the idea of using more than 20% of the disk footprint of a basic install just for this service. That’s why the index was configured to not store the position of the terms: it’s thus not possible to find out packages whose description contains the word “statistical” immediately followed by the word “computing”. You can however find those which have both terms somewhere in their description.

Enrico wondered if apt-xapian-index offers too much freedom. That could explain why few people experimented with it despite his numerous blog posts with code samples and information on how to get started using it. But it’s not difficult to imagine use cases for this data. It could be used to extend tools like rc-alert or wnpp-alert, for example. They provide a long list of Debian packages that are looking for some help and are installed on the machine. With apt-xapian-index, it would be possible to restrict the results to the set of packages written in a specific programming language or for a particular desktop environment.

The more likely explanation is that too few people know about the tool. There are many more itches to scratch where apt-xapian-index’s features could be very useful, and my guess is that Enrico’s wishes will eventually come true.

This article was first published in Linux Weekly News. Click here to subscribe to my newsletter and keep up with the Debian/Ubuntu news thanks to my monthly digest.

People behind Debian: Colin Watson, the tireless man-db maintainer and a debian-installer developer

Colin Watson is not a high-profile Debian figure, you rarely see him on mailing lists but he cares a lot about Debian and you will see him on Debconf videos sharing many thoughtful comments. I have the pleasure to work with him on dpkg as he maintains the package in Ubuntu, but he does a lot of more interesting things. I also took the opportunity to ask some Ubuntu specific questions since he’s worked for Canonical since the start. Read on.

My questions are in bold, the rest is by Colin.

Who are you?

Hi. I’m 32 years old, grew up in Belfast in Northern Ireland, but have been living in Cambridge, England, since I was 18. I’m married with a stepson and a daughter.

I became interested in Debian due to the critical mass of Debian work happening in Cambridge at the time (and perhaps more immediately because my roommate was running Debian: “hey, what’s that?”), started doing random bits of development in 2000, and joined as a developer in 2001 (a really exciting time, with lots of new people joining who became integral parts of the project). I’d only really been intending to do QA work and various bits of packaging around the edges, and maybe some work on the BTS, but then Fabrizio Polacco died and I took over man-db from him, and it sort of snowballed from there.

I graduated from university shortly before becoming a Debian developer. I worked for a web server company (Zeus), then a hardware cryptography company (nCipher), before moving to work for Canonical in 2004, since when I’ve been working full-time on Ubuntu. By this point, I suspect that going back to work in an office every day would be pretty tough.

What’s your biggest achievement within Debian or Ubuntu?

One thing I should say: I rarely start projects. Firstly I don’t think I’m very good at it, and secondly I much prefer coming on to an existing project and worrying away at all the broken bits, often after other people have got bored and wandered off to the next new and shiny thing. That’s probably why I ended up in the GNU/Linux distribution world in the first place, rather than doing lots of upstream development from the start – I like being able to polish things into a finished product that we can give to end users.

So, I’ve had my fingers in a lot of pies over the years, doing ongoing maintenance and fixing lots of bugs. I think the single project I’m most proud of would have to be my work on the Debian installer. I joined that team in early 2004 (a few months before Canonical started up) partly because I was a release assistant at the time and it was an obvious hot-spot, and partly because I thought it’d be a good idea to make sure it worked well on the shiny new G4 PowerBook I’d just treated myself to. I ended up as one of the powerpc d-i port maintainers for a while (no longer, as that machine is dead), but I’ve done a lot of core work as well: much of the work to put progress bars in front of absolutely everything that used to have piles of text output, rescue mode, the current kernel selection framework, a good deal of udev support, several significant debconf extensions, lots of os-prober work, and I think I can claim to be one of the few people who understands the partitioner almost top to bottom. :-)

d-i is the very first thing many of our users see, and has a huge range of uses, from simple desktop installs to massive corporate deployments; it’s unspeakably important that it works well, and it’s a testament to its design that it’s been able to trundle along without actually very much serious refactoring for the best part of five years now.

I have a soft spot for man-db too. It was my first major project in Debian, starting out from an embarrassingly broken state, and is now nice and stable to the point where I recently had time to spawn a useful generic library out of it (libpipeline).

What are your plans for Debian Wheezy?

d-i has a lot of code to deal with disks and partitions. Of course a lot of it is in the partitioner, and for that we use libparted so we don’t have to worry very much about the minutiae of device naming. But there are several other cases where we do need to care about naming, mainly before the partitioner when detecting disks, and after the partitioner when installing the boot loader. Back in etch, we introduced ‘list-devices’, which abstracted away the disk naming assumptions involved in hardware detection. In wheezy, I would like to take all the messy, duplicated, and error-prone code that handles disk naming in the boot loader installers, and design a simple interface to cover all of them. This has only got more important following the addition of the kFreeBSD and Hurd d-i ports in squeeze, but it bites us every time we notice that, say, CCISS arrays aren’t handled consistently, and it’s a pain to test all that duplicated code.

I’d also like to spread the use of libpipeline through C programs in the archive, which I think has potential to eliminate a class of security vulnerabilities in a much simpler way than was previously available.

If you could spend all your time on Debian, what would you work on?

I would love to systematically reduce the need for the current mass of boot loaders. There’s a significant cost to having so much variation across architectures here: it’s work that needs to be done in N different places, the wildly differing configuration means that d-i has to have huge piles of code to manage them all differently, and there are a bunch of strange arbitrary limitations on what you can do.

The reason I’m working on GRUB 2 is that, in my view, it’s the project with the best chance of centralising all this duplicated work into a single place, and making it easier to bring up new hardware in future (in a way that doesn’t compromise software freedom, as many proprietary boot loaders of the kind often found on phones do). Of course, with flexibility tends to come complexity, and some people have a natural objection to that and prefer something simpler. The things I don’t quite have time to do here are to figure out a coherent way to address the specific over-complexity problems people have with the configuration framework while still keeping the flexibility we need, and to do enough QA and porting work to be able to roll out GRUB 2 at installation time to all the Debian architectures it theoretically supports.

What’s the biggest problem of Debian?

Backbiting, and too much playing the man rather than the ball. With one or two honourable exceptions, I’ve largely stopped reading most Debian mailing lists since it just never seems a productive way to spend time compared to writing code and fixing bugs; and yet I’m conscious that they’re one of the primary means of communication for the project and I’m derelict in not taking part in them.

I do find it a bit frustrating that people are seen primarily in terms of their affiliations. I suppose it’s natural for people to see me as an “Ubuntu guy”, but I don’t really see myself that way: I’ve been working on Debian for nearly twice as long as I’ve been working on Ubuntu, and, while I care a great deal about both projects, I’ve put far more of my own personal time into Debian and I try to make sure that a decent number of the things I’m involved with there aren’t to do with work. Work/life separation is a good thing, not that I’m very good at it. Generally speaking, when I’m working on Debian, I’m doing so as a Debian developer, because I want Debian to be better. When that’s not the case, if it matters, I try to indicate it explicitly.

You’re working for Canonical since Ubuntu’s inception. If you were Mark Shuttleworth, is there something that you would have done differently?

We had many good intentions when we founded Ubuntu. We also had a huge amount of work to deliver, to the point where it wasn’t at all clear whether it would be possible (the warty release was named based on the expectations of it, after all, and came out much more usable than we’d dared to hope). In hindsight, it might have helped to be quieter about our good intentions, so that we could exceed expectations rather than in some cases failing to meet them. That might have set a very different tone early on.

(Personally, I’m happy I’m not Mark. The decisions in my office are much easier to take.)

It seems to me that the community part of Ubuntu is much more eager to cooperate with Debian than the corporate part. It’s probably just that more and more Canonical employees are not former Debian contributors. Do you also have this feeling? Are there processes in place to ensure everybody at Canonical is trying to do the right thing towards Debian cooperation?

Just to be clear, I’m wearing my own hat here—which, ironically, is a fedora—rather than a company hat.

It makes sense for Canonical to be taking on more non-Debian folks; after all, we can’t simply hire from the Debian community forever, and a variety of backgrounds is healthy. As you say, it may well be natural that Ubuntu developers who don’t work for Canonical are more likely to have a Debianish background, as it tends to take something significant to get people to switch to a very different family of GNU/Linux distributions, and changing jobs is one of the most obvious of those things.

Certainly, there was a definite sense among the early developers that we were all part of the Debian family and cared about the success of Debian as well. As Ubuntu has developed its own identity, people involved in it now tend to care primarily about the success of Ubuntu. At the same time, pragmatically, it’s still true that getting code changes into Debian is one of the most economical ways to land them; changes made in Debian or upstream land once and tend to stay in place, while changes made only in Ubuntu incur an ongoing merge overhead, which is not at all trivial.

In many ways it’s human nature to try to fulfil your immediate goals in the most direct way possible. If your goal is to deliver changes to Ubuntu users, then it’s natural to concentrate on that rather than looking at the bigger picture (which takes experience). Debian developers often fail to send changes upstream for much the same reason, although there’s more variation there because they’re normally working on Debian of their own volition and thus tend to have wider goals; the economics are more or less parallel though.

Thus, I think the best way to improve things is to make it the path of least resistance for Ubuntu developers to send changes to Debian. We’re already seeing how this works with the Ubuntu MOTU group; if you send a patch for review, or work on merging a package from Debian, very often the response includes “have you sent these changes to Debian?”. We’re working on both streamlining our code review through a regular patch pilot programme and requiring more code review for changes in general, so I think this will be a good opportunity to ask more people to work with Debian when they propose changes to Ubuntu.

For myself, this may be obvious, but I notice that I’m much better at getting changes into Debian when I already have commit access to the Debian package in question. All the work on improving collaborative maintenance in Debian can only help, for Ubuntu as well as for everyone else. It doesn’t make so much difference for large changes that require extensive discussion, but there are lots of small changes too.

Canonical is upstream of many software projects (unity, indicators infrastructure, etc.). Why aren’t those software immediately packaged in Debian? Do you think we can get this to change?

I’m not sure what the right approach is here, particularly as I haven’t been involved with much of that on the Ubuntu side. I suspect it would be helpful to look at this in a similar way to Ubuntu changes in general. It’s understandable that those developers have getting changes into Ubuntu as their first goal. And yet, having code in Debian offers a wider, and often technically adept, audience, and most developers like having their code reach a wider audience even if it’s not their first priority, particularly if that audience is likely to be able to help with finding problems and fixing bugs. It should be seen as something beneficial to both distributions.

The hardest problems will be with things that aren’t merely optional add-ons (which should generally be fairly non-controversial in Debian, given the breadth of the archive in general – the existence of things like bzr and germinate as Debian packages was never a hard question), but which require changes in established packages. For example, gnome-power-manager in Ubuntu is built with application indicator support, and that’s an important part of having a good indicator-based panel: a lot of the point of indicators is consistency. Since I do very little desktop work myself, I don’t know exactly what would be involved in making it possible to choose this system based on a Debian desktop, but I think it’s probably a bit more complicated than just making sure all the new packages exist in Debian too. Obviously you have to start somewhere.

Is there someone in Debian that you admire for his contributions?

Christian Perrier is absolutely tireless and has done superb things for the state of translations in Debian. And Russ Allbery, even aside from his fine ongoing work on policy, Lintian, and Kerberos, is a constant voice of sanity and calmness.

Release management is incredibly hard work, as I know from my own experience, and anyone who can sustain involvement in it for a long period is somebody pretty special. Steve Langasek and I got involved at about the same time but he outlasted me by quite a few years. He deserves some kind of medal for everything he’s done there.


Thank you to Colin for the time spent answering my questions. I hope you enjoyed reading his answers as I did. Subscribe to my newsletter to get my monthly summary of the Debian/Ubuntu news and to not miss further interviews. You can also follow along on Identi.ca, Twitter and Facebook.