Debian Cleanup Tip #5: identify cruft that can be removed from your Debian system

Last week we learned how to identify and restore packages whose files have been corrupted. This time we’ll concentrate ourselves on the non-packaged files…

Non-packaged files

They are files which are not provided by a Debian package, or in other words, files where dpkg --search finds no associated package:

$ dpkg --search /srv/cvs
dpkg-query: no path found matching pattern /srv/cvs

You always have such files on your system, at least all your own files in /home. But many daemons also create files as part of their work (and they are usually stored in /var): internal files for a database server, mail spool for a mail server, etc. Those are normal and you want to leave them alone.

But you might have non-packaged files in /usr and that should not be the case if you install everything from packages. It would thus be useful to be able to list those files in order to detect a software that has been manually installed.

Manually installed software is not a good idea

Such an installation might cause troubles for example by taking precedence over the same software provided in a Debian package. Over time the local installation will not be upgraded while the packaged one will.

The other packages which depend on this software will believe they have the latest version since their dependency is satisfied but in fact they are using the older version since it takes precedence.

So you want to get rid of those? Let’s see how we can find them.

Use cruft to identify non-packaged files

As I explained above, there are many non-packaged files that are legitimate and that you don’t want to remove. That’s why cruft does something more elaborated than a scan of the filesystem and a check of dpkg’s database.

It provides a way for packages to say which files they might legitimately create during run-time and that cruft should not report. And it knows of many such files. But it’s far from exhaustive and definitely not up-to-date.

So you should always take its output with suspicion and consider twice where the file came from. Do not trust it blindly to remove the files… you have been warned.

How to use cruft

You should give it a list of directories to ignore to reduce the noise in the output, for example like this:

$ sudo cruft -d / -r report --ignore /home --ignore /var --ignore /tmp
$ less report
cruft report: mercredi 23 février 2011, 15:45:34 (UTC+0100)

---- missing: ALTERNATIVES ----
        /etc/alternatives/cli-csc.1.gz
        /usr/share/man/man1/cli-csc.1.gz
---- missing: dpkg ----
        /etc/xdg/autostart/gnome-power-manager.desktop
        /usr/lib/libpython2.6_d.so.1.0-gdb.py
        /usr/share/fonts/X11/100dpi
        /usr/share/fonts/X11/75dpi
---- unexplained: / ----
        /boot
        /dev
        /etc/.java
        /etc/.java/.systemPrefs
[...]
        /usr/lib/pymodules/python2.6
        /usr/lib/pymodules/python2.6/.path
        /usr/lib/pymodules/python2.6/Brlapi-0.5.5.egg-info
[...]

Note that it doesn’t traverse filesystems so if your /usr is on another partition than /, you will need to use the option -d "/ /usr" to have it scan both.

Analyze the report

Now you can quietly go through the report that has been generated and decide which files need to be removed or not. The report also contains missing files (files which should exist according to the dpkg database but which are not there) but the bulk of the listing will be in the “unexplained” section: files which are not part of any package (and whose presence is not explained by any other explain script that packages can ship).

Again take this with great suspicion, and you should rather not delete a file if you don’t know it got there in the first place. For instance, on my system it lists many files below /usr/lib/pymodules/ and those are legitimate: they come from Debian packages but they are copied there dynamically from /usr/{lib,share}/pyshared in order to support multiple python versions. If you remove those files, you effectively break your system.

You will also find many .pyc files created by python packages, they are a byte-compiled version of the corresponding .py file. Removing them breaks nothing but you loose a bit of performance.

On the opposite, most of the files below /usr/local/ are likely the result of some manual software installation and those should be safe to remove (if you know that you are not using the corresponding software).

Conclusion: useful but needs work

In summary, you can use cruft to identify non-packaged files and maybe learn a bit more about what got manually installed on the system, but it requires some patience to go through the report as many of the files reported are false positives.

Yes, cruft badly needs supplementary volunteers to cope with the many ways packages legitimately generate non-packaged files. It’s not even complicated work: the package is mostly in shell and in Perl, and /usr/share/doc/cruft/README.gz explains how it all works.

Do you want to read more tutorials like this one? Click here to subscribe to my free newsletter, you can opt to receive future articles by email.

February 2011 wrap up

February has been again a busy month for me. Here’s a quick summary of what I did:

Multi-Arch work

I have spent many days implementing and refining dpkg’s Multi-Arch support with Guillem Jover (dpkg co-maintainer) and Steve Langasek (beta-tester of my code ;-)). Early testers can try what’s in my latest pu/multiarch/snapshot/* branch in my personal git repository.

A Debian DVD shop

I’m always exploring new options to fund my Debian work (besides direct donations) and this month—with the Debian Squeeze release—I saw an opportunity in selling Debian DVD. Nobody provides DVD with included firmwares and quite a few people would like to avoid the SpaceFun theme. So I built unofficial Debian DVDs that integrate firmware and that install a system with the old theme (MoreBlue Orbit). Click here to learn more about my unofficial DVDs.

On my blog

In my “People behind Debian” series, I interviewed Mike Hommey (Iceweasel maintainer) and Maximiliam Attems (member of the kernel team).

I started a “Debian Cleanup Tip” series and already published 4 installments:

For contributors, I wrote two articles: the first gives a set of (suggested) best practices for sponsoring Debian packages and adapted my article as a patch for the Developers Reference. In the second article, I shared some personal advice for people who are considering participating on Debian mailing list: 7 mistakes to avoid when participating to Debian mailing lists.

Click here to subscribe to my free newsletter and get my monthly analysis on what’s going on in Debian and Ubuntu. Or just follow along via the RSS feed, Identi.ca, Twitter or Facebook.

7 mistakes to avoid when participating to Debian mailing lists

You’re eager to start contributing to Debian, your first action is to subscribe to some high-profile mailing lists (like debian-devel and debian-project) to get a feel of the community. You read the mails for a few days and then you find out that you could participate to the discussions, it’s a simple first step after all. True enough.

That said, it’s not as easy as it looks like. There are many mistakes that you should avoid:

  1. Don’t fall in the trap where your mailing list participation is your sole contribution to Debian. If you want people to give credit to your messages, you should already be doing something else for Debian.
  2. Don’t participate more than once a day to a given thread. There are many people subscribed, you should leave room for other people to express their point of view. You can always follow up one day after and reply to several messages at once if you believe you still have something new to add to the discussion.
  3. Don’t reply to off-topic threads. Someone asked a simple question and someone else pointed out that his message was off-topic. Don’t reply, or if you really need to, do it on the correct list or with a private response.
  4. Don’t ask questions unless it’s useful to bring the discussion forward. Development lists are not here to fill the gaps in your knowledge. We already have debian-mentors for this. Furthermore there’s no better way to learn than to find yourself the answers to your questions. :-)
  5. Don’t believe your opinion is so important. We’re all very opinionated and discussions that consist only of contradicting opinions tend to go nowhere. Thus don’t give your opinion unless you can back it up with new facts or another experience.
  6. Don’t participate to all threads. There are surely some topics where you are more knowledgeable than others, participate where you add the most value and leave the others threads to the other experts (and learn by reading them).
  7. Don’t hide your identity. In Debian we like to know each other. Use your real name and not some anonymous nickname. You need to be able to stand up behind your words, otherwise you’re not credible.

I have myself been guilty of several of those when I started… I invite you to follow my recommendations to ensure our mailing lists remain pleasant to read and an effective discussion place.

You should follow me on Identi.ca, Twitter and Facebook. Or subscribe to this blog by RSS or by email.

Discover my Debian DVD shop

After a private launch (with discounted prices) for my newsletter subscribers, it’s now time to open my Debian DVD shop to the public.

I did not want to become yet another DVD reseller, so my DVDs are different and better. Here’s why you want to get one (or more):

  1. it’s easier to install Debian with my DVDs since they provide all the (non-free) firmwares that have been stripped and that you’re supposed to provide on a USB key;
  2. the installed system features the former theme (MoreBlue Orbit) and not SpaceFun (although you can reactivate SpaceFun easily if you prefer it);
  3. 100% of the benefits are reinvested into Debian (90% to fund my Debian work, 10% given back to Debian to fund work meetings)
  4. they are provided in a beautiful DVD case and despite this they are not expensive (between $3.49 and $5.49)

Click here to learn more about my DVD offer.

PS: Click here and join my newsletter to not miss other opportunities.

Debian Cleanup Tip #4: find broken packages and reinstall them

Last week, we learned to get rid of third-party packages, now we’re going one step further: we’ll verify if the files of the installed packages are still exactly like they were when they got installed.

If you’re a tinkerer and hand-edit some files for some quick tests, or if you tend to re-install newer versions of some packages from the sources, you might have overwritten some packaged files and it would be good to be able to detect this (and remedy to the problem). debsums is the tool that makes it possible.

Use debsums to identify modified files

I often use debsums when I take over the maintenance of a Debian server because I want to verify which files have been modified by the former administrator.

Without any argument, debsums is very verbose, it will list every installed file (except configuration files) and tells whether it’s unmodified (“OK”) or not (“FAILED”).

$ sudo debsums
/usr/bin/a2ps                                               OK
[...]

With the --all option, it will verify all files including configuration files. With --config it will verify only the configuration files.

With the --changed option, debsums will only list modified files among those inspected. The following invocation will thus list all files which have been modified on the system and which are not configuration files.

$ sudo debsums --changed
/usr/lib/perl5/AptPkg/Config.pm
/usr/lib/perl5/AptPkg.pm
[...]

Find out the package affected and reinstall it

debsums told me that /usr/lib/perl5/AptPkg.pm was modified. Indeed I remember having manually installed a modified version of that perl module for a quick test.

I find out the affected package with dpkg --search /usr/lib/perl5/AptPkg.pm: it’s libapt-pkg-perl.

Now I just have to reinstall this package to overwrite the modified files with the original ones:

$ sudo aptitude reinstall libapt-pkg-perl
[...]
# Or with apt-get
$ sudo apt-get --reinstall install libapt-pkg-perl
[...]

You might have to repeat the process until debsums no longer reports any modified file.

Do you want to read more tutorials like this one? Click here to subscribe to my free newsletter, you can opt to receive future articles by email.

People behind Debian: Maximilian Attems, member of the kernel team

Maximilian, along with the other members of the Debian kernel team, has the overwhelming job of maintaining the Linux kernel in Debian. It’s one of the largest package and certainly one where dealing with bug reports is really difficult as most of them are hardware-specific, and thus difficult to reproduce.

He’s very enthusiastic and energetic, and does not fear criticizing when something doesn’t please him. You’ll see.

My questions are in bold, the rest is by Maximilian.

Who are you?

My name is Maximilian Attems. I am a theoretical physicist in my last year of PhD at the Technical University of Vienna. My main research area is the early phase of a Quark-Gluon Plasma as produced in heavy ion collisions at the LHC at CERN. I am developing simulations that take weeks on the Vienna Scientific Cluster (in the TOP 500 list). The rest of the lab is much less fancy and boils down to straight intel boxes without any binary blobs or external drivers (although lately we add radeon graphics for decent free 3D). Mathematica and Maple are the rare exceptions to the many dev tools of Debian (LaTeX, editors, git, IDE’s, Open MPI, ..) found at the institute, as those are unfortunately yet unmatched in Free Software for symbolic computations. The lab mostly runs a combination of Debian stable (testing starting from freeze) for desktops and oldstable/stable for servers. Debian is in use for more than 10 years. So people in the institute know some ups and downs of the project. Newcomers like my room neighbors are always surprised how functional a free Debian Desktop is. :)

What’s your biggest achievement within Debian?

Building lots and lots of kernels together with an growing uptake of the officially released linux images.

I joined the Debian kernel team shortly after Herbert Xu departed. I had been upstream Maintainer of the linux-2.6 janitor project for almost a year brewing hundreds of small cleanups with quilt in a tree named kjt for early linux-2.6. In Debian we had lots of fun in sorting out the troubles that the long 2.5 freeze had imposed: Meaning we were sitting on a huge diverging “monolithic” semi-good patchset. It was great fun to prepare 2.6.8 for Sarge with a huge team enthusiastic in shipping something real close to mainline (You have to imagine that back then you had no stable or longterm release nor any useful free tools like git. This involved passing patches around, hand editing them and seeing what the result does.)

From the Sarge install reports a common pattern emerged that the current Debian early userspace was causing lots of boot failures. This motivated me to develop an alternative using the new upstream initramfs features. So I got involved in early userspace. Thanks to large and active development team initramfs-tools got a nice ecosystem. It still tries to be as generic and flexible as possible and thus gains many nice features. Also H. Peter Anvin (hpa) gave me the official co-maintenance of klibc. klibc saw uptake and good patches from Google in the last 2 years. I am proud that the early userspace is working out fairly well these days, meaning you can shuffle discs around and see your box boot.

Later on we focused on 2.6.18 for Etch, which turned out to a be good release and picked up by several other distributions. Only very much later we would see such a “sync” again. With 2.6.26 for Lenny we got somehow unlucky as we just missed the new longterm release by one release. We also pushed for another update very late (during freeze) in the release cycle, which turned out to semi-work as too much things depend on linux-2.6.

For Squeeze 2.6.32 got picked thanks to discussions at Portland Linux Plumbers and it turned out to be a good release picked up by many distributions and external patchsets. The long-term support is going very well. Greg KH is doing a great job in collecting various needed fixes for it. Somehow we had hoped that the Squeeze freeze would start sooner and that the freeze duration would be shorter, since we were ready for a release starting from the actual freeze on. The only real big bastard on the cool 2.6.32 “sync” is Red Hat. Red Hat Enterprise 6.0 is shipping the linux-2.6 2.6.32 in obfuscated form. They released their linux-2.6 as one big tarball clashing with the spirit of the GPL. One can only mildly guess from the changelog which patches get applied. This is in sharp contrast to any previous Red Hat release and has not yet generated the sharp and snide comments in press it deserves. Red Hat should really step back and not make such stupid management moves. Next to them even the semi-maintained Oracle “Unbreakable” 2.6.32 branch looks better: It is git fetchable.

What are your plans and those of the kernel team for Debian Wheezy?

Since 2.6.32 many of the used patches landed upstream or are on the way (speakup, Kbuild Debian specific targets, ..). The proper vfs based unionfs is something we’d be looking forward. We haven’t yet picked the next upstream release we will base Wheezy on, so currently we can happily jump to the most recent ones. There are plans for better interaction with Debian Installer thanks to generating our udebs properly in linux-2.6 source itself. Also we are looking forward to using git as tool of maintenance. We’d hope that this will also allow for even better cross distribution collaboration.

Concerning early userspace I plan to release an initramfs-tools with more generic userspace for the default case and finally also a klibc only for embedded or tuning cases.

What do you like most in Debian?

For one thing I do like the 2 year release cycle. It is not too long to have completely outdated software and on the other hand it gives enough time to really see huge progress from release to release. Also at my institute the software is is recent enough without too much admin overhead. For servers the three years support are a bit short, but on the manageable side.

I do enjoy a lot the testing distribution. For my personal use it is very stable and thus I mainly run testing on my desktop and work boxes. (Occasionally mixing in things from sid for unbreaking transition or newer security fixes).

Debian is independent and not a commercial entity. I think this is its main force and even more important these days. I enjoy using the Debian platform a lot at work thus in return this motivates me to contribute to Debian itself. I also like the fact that we strive for technical correctness.

Is there some recurrent problem that hinders the progress of Debian?

The “New Maintainer process” is a strange way to discourage people to contribute to Debian. It is particularly bureaucratic and a huge waste of time both for the applicant and his manager. It should be completely thrown overboard.

One needs a more scalable approach for trust and credibility that also enhances the technical knowledge for coding and packaging of the applicant.

NM is currently set in stone as any outside critics is automatically rejected. Young and energetic people are crucial for Debian and the long-term viability of the project, this is the reason why I’d consider the “New Maintainer process” as Debian’s biggest problem.

Note from Raphaël Hertzog: I must say I do not share this point of view on the New Maintainer process, I have witnessed lots of improvements lately thanks to the addition of the Debian Maintainer status, and to the fact that a good history of contribution can easily subsume the annoying Tasks & Skills questionnaire.

Another thing I miss is professional graphics’ input both for the desktop theme and the website. I know that effort has been done there lately and it is good to see movement there, but the end result is still lacking.

Another trouble of Debian is its marketing capabilities. It should learn to better sell itself. It is the distribution users want to run and use—not the rebranded copies of itself with lock-in “sugar”. Debian is about choice and it offers plenty of it: it is a great default Desktop.

Linus Torvalds doesn’t find Debian (and/or Ubuntu) a good platform to hack on the kernel. Do you know why and what can we do about this?

The Fedora linux-2.6 receives contributions from several Red Hat employed upstream sub-Maintainers. Thus it typically carries huge patches which are not yet upstream. As a consequence eventual userland troubles get revealed quite quickly and are often seen there first. The cutting edge nature of Fedora rawhide is appealing for many developers.

The usual Debian package division of library development files and the library itself is traditionally an entry barrier for dev on Debian. Debian got pretty easily usable these days, although we could and should again improve a lot more in this sector. Personally I think that Linus hasn’t tried Debian for years.

I have the feeling that the implication of the Debian Kernel team in LKML has been on the rise. Is that true and how do you explain this?

Ben Hutchings is the Nr.1 contributor for 2.6.33. He also is top listed as author of patches on stable 2.6.32. Debian is not listed as organization as many send their linux-2.6 patches from their corporate or personal email address and thus it won’t be attributed to Debian.

There is currently no means to see how many patches get forwarded for the stable tree, but I certainly forwarded more then fifty patches. I was very happy when Greg KH personally thanked me in the 2.6.32.12 release.

In the Squeeze kernel, the firmwares have been stripped and moved into separate packages in the non-free section. What should a user do to ensure his system keeps working?

There is a debconf warning on linux-2.6 installation. It is quite clear that the free linux-2.6 can’t depend on the firmware of the non-free archive (also there is no strict dependency there technically).

On the terminal you’d also see warnings by update-initramfs on the initramfs generation for drivers included in the initramfs.

The debconf warning lists the filename(s) of the missing firmware(s). One can then “apt-cache search” for the firmware package name and install it via the non-free repository. The check runs against the current loaded modules. The match is not 100% accurate for special cases as the one where the device might be handled well by this driver without firmware, but is accurate enough to warrant the warning.

The set of virtualization technologies that the official Debian kernel supports seems to change regularly. Which of the currently available options would you recommend to users who want to build on something that will last?

KVM has been a smooth ride from day zero. It almost got included instantly upstream. The uptake it has is great as it sees both dev from Intel and AMD. Together with libvirt it’s management is easy. Also the performance of virtio is very good.

The linux containers are the thing we are looking forward for enhanced chroots in the Wheezy schedule. They are also manageable by libvirt.

Xen being the “bad outside boy” has an incredible shrinking patchset, thus is fair to expect to see it for Wheezy and beyond. For many it may come a bit late, but for old hardware without relevant CPU it is there.

Many tend to overstate the importance of the virtualization tech. I’d be much more looking forward to the better Desktop support in newer linux-2.6. The Desktop is important for linux and something that is in heavy use. The much better graphics support of the radeon and nouveau drivers:

  • For Nouveau in 32 with 33 drm we could only deliver a first taste meaning something better then rusty nv.
  • For Radeon the support for Evergreen and newer is only been shaping now.

The performance optimizations thanks to dcache scalability work and the neat automatic task-grouping for the CPU scheduler are very promising features for the usability of the linux desktop. Another nice to have feature is the online defrag of ext4 and its faster mkfs. Even cooler would be better scalability in ext4 (This side seems to have seen not enough effort lately).

Is there someone in Debian that you admire for their contributions?

Hans Peter Anvin and Ted Tso are a huge source of deep linux-2.6 knowledge and personal wisdom. I do enjoy all sorts of interactions with them.

Christoph Hellwig with Matthew Wilcox and also William Irwin for setting up the Debian kernel Team.

Several Debian leaders including the previous and the current one for their engagement, which very often happens behind the scene.

The Debian Gnome Team work is great, also the interactions have always been always easy and a pleasure.

Martin Michlmayr and previously Thiemo Seufer do an incredible job in porting Debian on funny and interesting ARM and MIPS boxes. Debian has a lot of upcoming potential on this area. I’m looking forward to other young enthusiastic people in that area.

Colin Watson is bridging Debian and Ubuntu, which is an immense task.

Michael Prokop bases on Debian an excellent recovery boot CD: http://www.grml.org. I’d be happy if any Debian Developer would work as carefully coding and working.


Thank you to Maximilian for the time spent answering my questions. I hope you enjoyed reading his answers as I did. Subscribe to my newsletter to get my monthly summary of the Debian/Ubuntu news and to not miss further interviews. You can also follow along on Identi.ca, Twitter and Facebook.

Debian Cleanup Tip #3: get rid of third-party packages

Last week, we learned how to get rid of obsolete packages. This time, we’re going to learn how to bring back your computer to a state close to a “pure” Ubuntu/Debian installation.

Thanks to the power of APT, it’s easy to add new external repositories and install supplementary software. Unfortunately some of those are not very well maintained. They might contain crappy packages or they might simply not be updated. An external package which was initially working well, can become a burden on system maintenance because it will be interfering with regular updates (for example by requiring a package that should be removed in newer versions of the system).

So my goal for today is to teach you how to identify the packages on your system that are not coming from Debian or Ubuntu. So that you can go through them from time to time and keep only those that you really need. Obsolete packages are a subset of those, but I’ll leave them alone. We took care of them last week.

Each (well-formed) APT repository comes with a “Release” file describing it (example). They provide some values that can be used by APT to identify packages contained in the repository. All official Debian repositories are documented with Origin=Debian (and Origin=Ubuntu for Ubuntu). You can verify the origin value associated to each repository (if any) in the output of apt-cache policy:

[...]
 500 http://ftp.debian.org/debian/ lenny/main i386 Packages
     release v=5.0.8,o=Debian,a=stable,n=lenny,l=Debian,c=main
     origin ftp.debian.org
[...]

From there on, we can simply ask aptitude to compute a list of packages which are both installed and not available in an official Debian repository:

$ aptitude search '?narrow(?installed, !?origin(Debian))!?obsolete'
or
$ aptitude search '~S ~i !~ODebian !~o'

You can replace “search” with “purge” or “remove” if you want to get rid of all the packages listed. But you’re more likely to want to remove only a subset of carefully chosen packages… you’re probably still using some of the software that you installed from external repositories.

With synaptic, you can also browse the content of each repository. Click on the “Origin” button and you have a list of repositories. You can go through the non-Debian repositories and look which packages are installed and up-to-date.

But you can do better, you can create a custom view. Click on the menu entry “Settings > Filter”. Click on “New” to create a new filter and name it “External packages”. Unselect everything in the “Status” tab and keep only “Installed”.

Go in the “Properties” tab and here add a new entry “Origin” “Excludes” “ftp.debian.org”. In fact you must replace “ftp.debian.org” with the hostname of your Debian/Ubuntu mirror. The one that appears on the “origin” line in the output of apt-cache policy (see the excerpt quoted above in this article).

Note that the term “Origin” is used to refer to two different things, a field in the release file but also the name of the host for an APT repository. It’s a bit confusing if you don’t pay attention.

Close the filters window with OK. You now have a new listing of “External packages” under the “Custom Filters” screen. You can see which packages are installed and up-to-date and decide whether you really want to keep it. If the package is also provided by Debian/Ubuntu and you want to go back to the version provided by your distribution, you can use the “Package > Force version…” menu entry.

Click here to subscribe to my free newsletter and get my monthly analysis on what’s going on in Debian and Ubuntu. Or just follow along via the RSS feed, Identi.ca, Twitter or Facebook.

Best practices when sponsoring Debian packages

Sponsoring a package means uploading a package for someone else (usually a new contributor starting out as package maintainer). This is an activity reserved to Debian Developer who are supposed to be knowledgeable about packaging. This article tries to document the process to ensure the sponsor is doing a reasonably good job according to Debian’s standards.

Sponsoring a package in the Debian archive is not a trivial matter. It means that you verified the packaging and that it is of the level of quality that Debian strives to have. Let’s have a look to what you can and should do when you’re sponsoring a package.

Sponsoring the initial upload

Sponsoring a brand new package into Debian requires a thorough review of the Debian packaging. Building the package and testing the software is definitely not enough! You should open every file in the debian directory and look out for potential problems. Here’s a checklist that you can use to perform the audit:

  • Verify that the upstream tarball provided is the same that has been distributed by the upstream author (when the sources are repackaged for Debian, generate the modified tarball yourself).
  • Run lintian. It will catch many common problems. Be sure to verify that any lintian overrides setup by the maintainer is fully justified.
  • Run licensecheck and verify that debian/copyright seems correct and complete. Look for license problems (like files with “All rights reserved” headers, or with a non-DFSG compliant license).
  • Build the package with pbuilder (or any similar tool) to ensure that the build-dependencies are complete.
  • Proofread debian/control: does it follow the best practices? are the dependencies complete?
  • Proofread debian/rules: does it follow the best practices? do you see some possible improvements?
  • Proofread the maintainer scripts (preinst, postinst, prerm, postrm, config): will the preinst/postrm work when the dependencies are not installed? are all the scripts idempotent (i.e. can you run them multiple times without consequences)?
  • Review any change to upstream files (either in .diff.gz, or in debian/patches/ or directly embedded in the debian tarball for binary files). Are they justified? Are they properly documented (with DEP-3 for patches)?
  • For every file, ask yourself why the file is there and whether it’s the right way to achieve the desired result. Is the maintainer following the best packaging practices described by the Developers Reference?
  • Build and install the packages, try the software. Ensure you can remove and purge the packages. Maybe test the packages with piuparts.

If the audit did not reveal any problem, you can upload the package. But remember that even if you’re not the maintainer, the sponsor is still responsible of what he uploaded to Debian. That’s why you’re encouraged to keep up with the package through the Package Tracking System.

Sponsoring an update of an existing package

You will usually assume that the package has already gone through a full review. So instead of doing it again, you will carefully analyze the difference between the current version and the new version prepared by the maintainer. If you have not done the initial review yourself, you might still want to have a more deeper look just in case the initial reviewer was sloppy.

To be able to analyze the difference you need both versions. Download the current version of the source package (with apt-get source) and rebuild it (or download the current binary packages with aptitude download). Download the source package to sponsor (usually with dget).

Read the new changelog entry, it should tell you what to expect during the review. The main tool you will use is debdiff, you can run it with two source packages (.dsc files), or two binary packages, or two .changes files (it will then compare all the binary packages listed in the .changes).

If you compare the source packages (excluding upstream files in the case of a new upstream version, for example by filtering the output of debdiff with filterdiff -i '*/debian/*'), you must understand all the changes you see and they should be properly documented in the Debian changelog.

If everything is fine, build the package and compare the binary packages to verify that the changes on the source package have no unexpected consequences (like some files dropped by mistake, missing dependencies, etc.).

You might want to check out the Package Tracking System to verify if the maintainer has not missed something important. Maybe there are translations updates sitting in the BTS that could have been integrated. Maybe the package has been NMUed and the maintainer forgot to integrate the changes from the NMU in his package. Maybe there’s a release critical bug that he has left unhandled and that’s blocking migration to testing. Whatever. If you find something that she could have done (better), it’s time to tell her so that she can improve for next time. And so that she has a better understanding of her responsibilities.

If you have found no problem, upload the new version. Otherwise ask the maintainer to provide you a fixed version.


This article will be repurposed to enhance the Debian Developers Reference, hopefully leading to a fix for the wishlist bug #453313. Click here and help me fix more of those.

You’re also welcome to suggest improvements in the comments. Are there other checks that you’re always doing? Do you have some handy tip to make it easier to review a package?

Debian Cleanup Tip #2: Get rid of obsolete packages

Last week, we learned to remove useless configuration files. This week, we’re going to take care of obsolete packages.

An obsolete package is a package who is no longer provided by any of the APT repositories listed in /etc/apt/source.lists (and /etc/apt/sources.list.d/). There can be multiple reasons why a package is no longer available in the repository (or at least not under the same name) :

  • the upstream author stopped maintaining the software a long time ago, nobody else took over and the Debian maintainer preferred to remove the package from Debian. Usually there are alternatives in the Debian archive.
  • the package was orphaned in Debian since a long time, nobody took over and it had very few users. The Debian QA team might have asked its removal.
  • the latest version of the software might have been packaged under a new package name. Either because the amount of changes was so important that it was preferred to not upgrade automatically to the latest version (it has been the case with request-tracker and nagios, they both embed a version number in their package names), or simply because the maintainer wants to let the user install several versions at the same time (that’s the case for example with the Linux kernel, the python interpreter and many libraries).
  • the software has been renamed, the maintainer renamed the packages and kept transitional packages under the old name for one release. Then the transitional packages have been removed.

In any case, it’s never a good idea to keep obsolete packages around: they do not benefit from security updates and they might cause problems during upgrades if they depend on other packages that should be removed to complete the upgrade.

You could blindly remove them with aptitude purge ~o (or aptitude purge ?obsolete) but you might want to first verify what those package are. There might be some packages that you have manually installed, that are not part of any current APT repository, and that you want to keep around nevertheless (I have skype, dropbox and a few personal packages for example). You can get the list with aptitude search ?obsolete

With the graphical package manager (Synaptic), you can find the list of obsolete packages by clicking on the “Status” button and selecting “Installed (local or obsolete)”. You can then go through the list and decide for each package whether you want to keep it or not.

Follow me on Identi.ca, Twitter and Facebook. Or subscribe to this blog by RSS or by email.

Debian 6.0 is out, Wheezy kicks off

As you probably already know, Debian released Squeeze aka Debian 6.0 this week-end. It was a really great week-end.

I saw quite a few release already, but none with so much online social activity. Alexander and Meike Schmehl live commented the release on the Debian identi.ca account and Joey Hess held his Debian Party Line.

On top of this, the team in charge of the website rolled out a new design on quite a few online services during the week-end including www.debian.org, wiki.debian.org, planet.debian.org and more.

Congratulations to all the people who made this happen (and the release team in particular). It’s great to see Debian achieve all of this.

Do you know that it’s the third time in a row that we manage to release in the 18-24 months timeframe? 3.1: June 2005 → 4.0: April 2007 → 5.0: February 2009 → 6.0: February 2011.

Yes, despite our size and the fact that we are all volunteers, we have managed to stick to a reasonable schedule for a stable distribution that is deployed on a large scale.

Wheezy kicks off

The best part of the release for us — the developers — is that wheezy is now open for development and we can work on new features for the next release. ;-)

And it started quickly: according to UDD, wheezy already features 488 new source packages that are not in squeeze, 1713 updated source packages and among those 1246 are new upstream versions.

I really look forward to the upcoming projects and related discussions.

Click here to subscribe to my free newsletter and get my monthly analysis on what’s going on in Debian and Ubuntu. Or just follow along via the RSS feed, Identi.ca, Twitter or Facebook.


For the curious, here are the UDD queries I used:

# Updated packages in wheezy
select count(source) from sources_uniq as su where (select version from sources_uniq where release='squeeze' and distribution='debian' and source = su.source) < su.version and release='wheezy' and distribution='debian';
# Updated with a new upstream version
select count(source) from sources_uniq as su where debversion(regexp_replace((select version from sources_uniq where release='squeeze' and distribution='debian' and source = su.source), '-.*', '')) < debversion(regexp_replace(su.version, '-.*', '')) and release='wheezy' and distribution='debian';
# New package in wheezy
select count(source) from sources_uniq as su where source not in (select source from sources_uniq where release='squeeze' and distribution='debian' and source = su.source) and release='wheezy' and distribution='debian';