Deciphering one of dpkg’s weirdest errors: short read on buffer copy

As a Debian/Ubuntu user, you’re likely to be exposed at some point to an error reported by dpkg. In a series of articles, I’ll explain some of the errors that you might encounter.

Some error messages can be confusing at times. Most of the error strings do not appear very often and developers thus tend to use very terse description of the underlying problem. In other cases the architecture of the software makes it difficult to pin-point the real problem because the part that displays the error is several layers above the one that generated the initial error.

This is for example the case with this error of dpkg:

Unpacking replacement xulrunner-1.9.2 ...
dpkg-deb (subprocess): data: internal gzip read error: '<fd:0>: too many length or distance symbols'
dpkg-deb: error: subprocess <decompress> returned error exit status 2
dpkg: error processing /var/cache/apt/archives/xulrunner-1.9.2_1.9.2.17+build3+nobinonly-0ubuntu1_amd64.deb (--unpack):
 short read on buffer copy for backend dpkg-deb during `./usr/lib/xulrunner-1.9.2.17/components/libdbusservice.so'

First, the decompression layer discovers something unexpected in the data read in the .deb file and dpkg-deb outputs the error message coming from zlib (“too many length or distance symbols”). This causes the premature end of dpkg-deb --fsys-tarfile that dpkg had executed to extract the .data.tar archive from the deb file. In turn, dpkg informs us that dpkg-deb did not send all the data that were announced (and hence the “short read” in the error message) and that were meant to be part of the file ‘/usr/lib/xulrunner-1.9.2.17/components/libdbusservice.so’.

That’s all nice but it doesn’t help you much in general. What you must understand from the above is that the .deb file is corrupted (sometimes just truncated). In theory it should not happen since APT verifies the checksums of files when they are downloaded. But computers are not infallible and even if the downloaded data was good, it can have been corrupted when stored on disk (for example cheap SSD disks are known to not last very well).

Try removing the file (usually with apt-get clean since it’s stored in APT’s cache) and let APT download it again. Chances are that it will work on the second try. Otherwise consider doing a memory and HDD check as something is probably broken in your computer.

Join my free newsletter and learn more tips for users. Or click here to support my work on dpkg with Flattr, consider subscribing for a few months.

apt-get, aptitude, … pick the right Debian package manager for you

This is a frequently asked question: “What package manager shall I use?”. And my answer is “the one that suits your needs”. In my case, I even use different package managers depending on what I’m trying to do.

APT vs dpkg, which one is the package manager?

In the Debian world, we’re usually thinking of APT-based software when we’re referring to a “package manager”. But in truth, the real package manager is dpkg. It’s the low-level tool that takes a .deb file and extracts its content on the disk, or that takes the name of a package to remove the associated files, etc.

APT is better known because it’s the part of the packaging infrastructure that matters to the user. APT makes collection of software available to the user and does the dirty work of downloading all the required packages and installing them by calling dpkg in the correct order to respect the dependencies.

But APT is not a simple program, it’s a library and several different APT frontends have been developed on top of that library. The most widely known is apt-get since it’s the oldest one, and it’s provided by APT itself.

Graphical APT front-ends

update-manager is a simple frontend useful to install security updates and other trivial daily upgrades (if you’re using testing or sid). It’s the one that you get when you click in the desktop notification that tells you that updates are available. In cases, where the upgrade is too complicated for update-manager, it will suggest to run synaptic which is full featured package manager. You can browse the list on installed/available packages in numerous ways, you can mark packages for installation/upgrade/removal/purge and then run in one go all the recorded actions.

software-center aims to be an easy to use application installer, it will hide most of the packaging details and will only present installed/available applications (as defined by a .desktop file). It’s very user friendly and has been developed by Ubuntu.

Of the graphical front-ends, I use mainly synaptic and only when I’m reviewing what I have installed to trim the system down.

Console-based GUI APT front-ends

In this category, I’ll cite only aptitude. Run without parameter, it will start a powerful console-based GUI. Much like synaptic, you can have multiple views of the installed/available packages and mark packages for installation/upgrade/removal/purge before executing everything at once.

Command-line based package managers and APT front-ends

This is where the well known apt-get fits, but there are several other alternatives: aptitude, cupt, wajig. Wajig and cupt are special cases as they don’t use libapt: the former wraps several tools including apt-get, and the latter is a (partial) APT reimplementation (versions 1.x were in Perl, 2.x are now is C++).

You’re welcome to try them out and find out which one you prefer, but I have never felt the need to use something else than apt-get and aptitude.

apt-get or aptitude?

First I want to make it clear that you can use both and mix them without problems. It used to be annoying when apt-get did not track which packages were automatically installed while aptitude did, but now that both packages share this list, there’s no reason to avoid switching back and forth.

I would recommend apt-get for the big upgrades (i.e. dist-upgrade from one stable to the next) because it will always find quickly a relatively good solution while aptitude can find several convoluted solutions (or none) and it’s difficult to decide which one should be used.

On the opposite for regular upgrades in unstable (or testing), I would recommend “aptitude safe-upgrade“. It does a better job than apt-get at keeping on hold packages which are temporarily broken due to some not yet finished changes while still installing new packages when required. With aptitude it’s also possible to tweak dynamically the suggested operations while apt-get doesn’t allow this. And aptitude’s command line is probably more consistent: with apt-get you have to switch between apt-get and apt-cache depending on the operation that you want to do, aptitude on the other hand does everything by itself.

Take some time to read their respective documentation and to try them.

Click here to subscribe to my free newsletter and get my monthly analysis on what’s going on in Debian and Ubuntu. Or just follow along via the RSS feed, Identi.ca, Twitter or Facebook.

My Debian activities in May 2011

This is my monthly summary of my Debian related activities. If you’re among the people who made a donation to support my work, then you can learn how I spent your money. Otherwise it’s just an interesting status update on my various projects.

I have been…

Doing some work towards Debian Rolling

At the start of the month, the discussions about Debian rolling were still very active on debian-devel. Declaring that testing would be rolling did not make it (as I hoped), the argument that some RC bugs last for far too long in that distribution carried the discussion and thus the most consensual proposition ended up being the one of Josselin Mouette were rolling would be testing plus a few selected cherry-picked packages from unstable.

I believe it’s a workable solution if we only care about a subset of architectures. Otherwise the same reasons that keep the fixed packages out of testing would probably also apply for rolling.

Given this, I did setup britney (the software that controls testing) on my laptop to investigate how we can create rolling. It turns out britney is a very specialized software with very few configuration knobs.

At the same time Joachim Breitner made a proposition that immediately grabbed my attention. He suggests to use SAT solvers to find out the set of packages that should migrate from unstable to testing. I thought that rolling would be a good testbed for this new implementation of britney (which he calls SAT-britney) so I jumped right in this project.

I was not at all familiar with this science field, so I looked up quite some documentation: I learned that all SAT solvers expect the problem to be presented in CNF form, and that DIMACS was the file format of choice to represent those boolean constraints. Several SAT solvers are available in Debian and picosat appears to be one of the best.

Then I started some early coding/prototyping to play with the concept. You can find the result in this git repository, you can grab a copy with git clone git://git.debian.org/~hertzog/sat-britney.git.

There’s not much yet, except some Python code to generate a SAT problem that can be fed to a SAT solver. But I really look forward to this project.

Representing Debian during Solutions Linux

During the second week, I spent 3 days in Paris to help manage the Debian booth at Solutions Linux.

We have responded to lots of queries but most visitors already knew Debian, and many of them use it at work and/or at home. We tried to recruit those people as new members for Debian France, the local association. We also sold all our remaining goodies.

The Ubuntu people were interviewed by France 3 (an important TV channel) and we took this opportunity (with the consent of the Ubuntu guys) to show our Debian t-shirts in the background: you can watch the video here (in French), you can see me with Carl Chenet at 1:21.

We have also been interviewed by Intelli’n TV: here and here (both in French). I’m not very good at this exercise. :-)

Improving dpkg triggers

The third week was a vacation week, in theory I should have stayed away from my computer but I really wanted to take this opportunity to improve the state of dpkg triggers in Debian.

I already covered my work in another article: Trying to make dpkg triggers more useful and less painful.

The result is not merged yet, I just asked a question to all package maintainers who are using triggers to be able to decide whether I’ll merge it as is, or if I can make the new behavior the default one.

Supporting users after Alioth’s migration

When I came back from my vacation, many services provided by Alioth.debian.org were non-functional after a migration to a new setup that involves two machines instead of one. Given that I used to be an Alioth admin, I know that in those periods you tend to be get bogged down on many user support requests. So I re-joined #alioth on IRC and tried to help a bit.

I did investigate some of the reported problems and prepared fixes (updated scripts, configuration files, etc.) for some of the issues. I also created a list of remaining issues that should have lasted only a few days but that’s still active because there are still regressions left.

The most important things still missing are:

  • proper support for delegation of rights. We used ACL setup by the admins in the past. With the new FusionForge, each project admin should be able to delegate rights to external “roles”. There’s a Debian Developer role already but trying to grant him right fails…
  • access to the Ultimate Debian Database. Many tools rely on this database to work.
  • anonymous FTP access to download project files.
  • clear guidelines on how we’re supposed to deal with websites that are updated by VCS hooks.
  • clear guidelines on how we’re supposed to deal with personal git repositories

Improving the “3.0 (quilt)” source format

I have made some proposals to change the way the new source format would work. The goals are to be less painful for packagers who are using a VCS, and to avoid unexpected changes slipping through a new patch generated by dpkg-source.

It seems that the proposals are relatively consensual so I’ll implement them at some point.

Missing in action on my blog

I did a lots of stuff for Debian between travel and vacation, and in the remaining time, I did not manage to write many articles for my blog.

In fact, besides the article on my triggers work mentioned above I only published one interview: People behind Debian: Steve Langasek, release wizard.

I’ll try to do better this month!

Thanks

Many thanks to the people who gave me 151.61 € in May.

See you next month for a new summary of my activities.

Trying to make dpkg triggers more useful and less painful

Lately I have been working on the triggers feature of dpkg. I would like to share my plan and what I have done so far. I’ll first explain what triggers are, the current problems, and the work I did to try to improve the situation.

Introduction

Dpkg triggers are a neat feature of dpkg that package can use to send/receive notifications to/from other installed packages. Those notifications take the form a simple string.

This feature is heavily used to track changes of packaged files in a list of predefined directories, and to update other files based on this. For instance, man-db is watching the directories containing manual pages so that it can update its cache (in /var/cache/man/). install-info is updating the index of info pages when there have been changes in /usr/share/info. gnome-menus is updating its own copy of the menu hierarchy (with entries from /etc/gnome/menus.blacklist blacklisted) every time that a .desktop file is installed/updated/removed.

From a user’s perspective

You see triggers in action very often during upgrades (in fact too often as we’ll see it later):

Preparing to replace zim 0.52-1 (using .../archives/zim_0.52-1_all.deb) ...
Unpacking replacement zim ...
Processing triggers for shared-mime-info ...
Processing triggers for menu ...
Processing triggers for desktop-file-utils ...
Processing triggers for man-db ...
Processing triggers for hicolor-icon-theme ...
Processing triggers for python-support ...
Processing triggers for gnome-menus ...
Setting up zim (0.52-1) ...
Processing triggers for python-support ...
Processing triggers for menu ...

As you guessed it, those “Processing triggers” lines correspond to the packages which received (one or more) trigger notifications and which are doing the corresponding task.

By default the triggers are processed at the end of the dpkg --unpack invocation which is often too soon because APT will often call dpkg --unpack repeatedly during important upgrades. There are some options to ask APT to use dpkg’s --no-triggers option in order to defer the trigger processing at the end of the APT run. You can put this in /etc/apt/apt.conf.d/triggers:

// Trigger deferred
DPkg::NoTriggers "true";
PackageManager::Configure "smart";
DPkg::ConfigurePending "true";
DPkg::TriggersPending "true";

I have now asked APT maintainers to use those options by default, I filed bug #626599 to track this. At the same time I fixed bug #526774 reported by APT maintainers. This bug forced them to put a work-around in APT which resulted in running triggers sooner than expected.

(And while writing this article I filed bug #628564 and #628574 because it was clearly not normal that the menu triggers was executed twice for the installation of a single package)

From a packager’s perspective

The implementation of triggers has several consequences on the status that packages can have.

Let’s assume that the package A installs a file in a directory that is watched by package B (and that B is currently in the “installed” state). When A is unpacked, dpkg adds B to its “Triggers-Awaited” field and lists the activated trigger in B’s “Triggers-Pending” field. Package A is in “unpacked” state, but B has been changed to “triggers-pending”.

When A is configured, instead of going to the “installed” state, it will go to the “triggers-awaited” state. In that state the package is assumed to NOT fulfill dependencies. However, B—which is still in “triggers-pending” state—does fulfill dependencies.

A and B will switch to “installed” at the same time when the trigger has been processed.

The fact that the triggers-awaited status does not fulfill dependencies means that some common triggers like man-db have to be processed regularly just to be able to ensure dependencies are satisfied before running the postinst of other installed packages.

But a package which ships a manual page can certainly be considered as configured when its postinst has been run even if man-db has not yet updated its cache to know about the new/updated manual page.

When you activate a trigger with the dpkg-trigger command you have an option --no-await to avoid awaiting the trigger processing (and thus to go directly to installed state after the postinst has been run). But with file triggers or activate trigger directives, you do not have this option.

My proposal to improve the situation

This is the problem that I tried to solve during my last vacation. But before changing the inner working of triggers, I wrote a non-regression test suite for that feature (commit here) so I could hack with some confidence that I did not break everything.

The result has been presented on the debian-dpkg mailling list: see the discussion here. I added new directives that can be used in triggers files that work exactly like the current triggers except that they do not put triggering packages in trigger-awaited status.

I believe the code to be mostly ready, but in its current form the patch brings zero benefits until all packages have been converted to use the trigger variants that do not require awaiting trigger processing (and the change requires a pre-dependency on dpkg to ensure we have the required dpkg that understands the new kind of trigger directives).

Remaining question

Thus I wonder if I should not change the default semantic of triggers. The packages which really provide crucial functionality to awaiting packages through triggers would then have to be updated to switch to the new directives.

If you’re a packager using triggers, you can thus help me by answering this question: do you know some triggers where it’s important that the awaiting packages are not considered as configured before the trigger processing? In most of the cases I checked, it’s important for the triggered package rather than for the triggering package.

In truth, a package in triggers-awaited status is usually in a good enough shape to be able to satisfy dependencies (i.e. requirements that other packages can have), but it would still be worth to record the fact that it’s not entirely configured yet because it might be true from the user’s point of view: for example if the menu trigger has not yet been processed, the software might not yet be visible in the application menu.

If you appreciate this kind of groundwork that benefits to the whole Debian ecosystem, please consider supporting my work. Click here and give it a look, there are many ways to contribute and to make a difference for me.

March 2011 wrap up

Since I’m soliciting donations to support my Debian work, the least I can do is explain what I do. You can thus expect to see an article like this one every month.

Multi-Arch work

I updated the code to use another layout for the control files stored in /var/lib/dpkg/info/. Instead of using a sub-directory per architecture (arch/package.type), we decided to use package:arch.type but only for packages which are Multi-Arch: same. dpkg is taking care to rename the files the first time it is executed with write rights and then updates /var/lib/dpkg/info/format to remember that the upgrade has been done and that we can rely on the new structure.

I filed a few bugs on packages that are improperly accessing those internal files instead of using the appropriate dpkg-query interface. I sent a heads-up mail on -devel to make other people aware of those problems in the hope to discover most of them as early as possible.

After that, the work stalled because Guillem went away for 2 weeks and thus stopped his review of my work. I hope he will quickly resume the review and that we will get something final this month.

With the arrival of dpkg 1.16.0, it’s now possible to start converting libraries to multi-arch even if full multi-arch support has not yet landed in dpkg proper. See http://wiki.debian.org/Multiarch/Bootstrapping for the detailed plan.

If you’re curious about Multi-Arch, you might want to read this article of Steve Langasek as well.

Bug triage for dpkg in launchpad

At the start of the month, there was close to 500 bugs reported against the dpkg package in Launchpad. Unfortunately most of it is noise… many of the reported bugs are misfiled, they show an upgrade problem of a random package and that upgrade problem confuses update-manager which tries to configure an already configured package. This generates a second error that apport attributes to dpkg and the resulting bug report is thus filed on dpkg. There are literally hundreds of those that have to be reclassified.

Michael Vogt and Brian Murray did some triaging, and I also spend quite some hours on this task. It’s a bit frustrating as I tend to mark many reports “Incomplete” because there’s no way they can be acted upon and many of them are so old that the reporter is unlikely to be able to provide supplementary information.

But in the middle of this noise, there are some useful bug reports, like LP#739179 which enabled me to fix a regression even before it reached Debian Unstable (because Ubuntu runs a snapshot of dpkg with multiarch support).

I subscribed to the Launchpad bugs for dpkg via the Debian Package Tracking System (thanks to the derivatives-bugs keyword) and will try to keep up with the incoming reports.

Misc dpkg work

The ftpmasters came up with a request for a new field (see 619131) in source packages. After a quick discussion and a round of review on debian-policy@l.d.o, I implemented the new Package-List field. This should allow the ftpmasters to save some time in NEW processing, but we deferred the change for the next dpkg version (1.16.1) to ponder a bit more on the design of the field.

I also fixed a bunch of bugs (#619541, #605719, #598922, #616096) and merged a patch of Mark Hymers to recognize the new Built-Using field.

Developers-reference work

The review process for changes to the developers-reference is not working as it should. And I suffered from it while trying to integrate the patch I wrote for the “Developer duties” chapter (see #548867).

We purposely changed the maintainer field from debian-doc to debian-policy in the hope to have more reviews of suggested changes and to seek some sort of consensus before committing anything. But we don’t get more reviews… and deciding to commit a patch is now even harder than it was (except for trivial stuff where personal opinions can’t interfere).

In my case, I only got the feedback of Charles Plessy which was very mixed to say the least. I tried to improve my patch based on what he expressed but I also clearly disagreed with some of his assertions and was convinced that my wording was in line with the dominant point of view within Debian.

We tried to involve the release team in the discussion because most of what I documented was about helping making stable release happen, but nobody of the team answered.

Instead of letting the situation (and my patch) rot, I solicited feedback from the DPL and from another developers-reference editor to see whether my patch was an improvement or not. After some more time, I went ahead and committed it.

It was not pleasant for anyone.

I don’t know how we can improve this. Contrary to the policy, the developers-reference is a document that is not normative, I believe the result is better when we put some “soul” into it. But it’s a real challenge when you seek a consensus and that the interest in reviewing changes is so low.

DVD shop listed on debian.org

In February, I launched a DVD shop whose benefits are used to fund my Debian work. Shortly after the launch I used the official form to be added to the official listing of Debian CD vendors and offered a few suggestions to deal with vendors who are selling unofficial images (with firmware in my case).

A few weeks later, I got no answers: neither for my request nor for my suggestions, I mailed the cdvendors@debian.org team directly asking for a status update and quickly got an answer suggesting that Simon Paillard usually does the work and can’t process the backlog due to some injury. At this point no concerns had been raised about adding me to the list. To save some time and some work for the team, I added myself to the list since I had commit rights and I informed them that I did it, so that they can review it.

Shortly after I did that, Martin Zobel Helas objected to my addition. I cleared some misunderstandings but the discussion also lead to some changes to please everybody: the listing now indicates that some images are unofficial and I have prepared a special landing page for people coming from the Debian website through this listing.

Debian column on OMG! Ubuntu

I have always been a firm believer that it’s important for Debian to reach out to the widest public with its message of freedom. Thus when Benjamin Humphrey contacted the debian-publicity team to find volunteers to write a Debian column on OMG! Ubuntu, I immediately jumped in.

I wrote 4 articles over there. The tone is very different from my articles on my blog and I like that duality. Check out Debian is dying! Oh my word!, Debian or Ubuntu, which is the best place to contribute?, Are you contributing your share? and Ubuntu’s CTO reveals DEX: an effort to close the gap with Debian.

It’s a great win-win situation, OMG! Ubuntu benefits from my articles, Debian’s values are relayed further, and OMG! Ubuntu’s large audience also helps me develop my own blog.

Work on my book

I had lots of paperwork to do this month (annual accounting stuff for my company) and I did not have as much time as I hoped for my book. Still I have a updated a few more chapters of my French book and I certainly hope to complete the update during April.

This means that the work on the English translation could start in may.

Work on my blog

Just like for my book, it has been relatively difficult for me to cope with my policy of two articles every week. But I still managed to get quite some good stuff out.

I interviewed Christian Perrier (Debian’s translation coordinator) and also Bdale Garbee (chair of Debian’s technical committee).

I finished my series of “Debian Cleanup Tips” with 2 supplementary articles:

The removal of firmware is causing troubles to quite some users so I wrote an article explaining how to deal with the problem. A regular reader also asked me to write an article about Jigdo, I executed myself because it was a good idea and that he has been very nice with me: Download ISO images of Debian CD/DVD at light speed with Jigdo.

Last but not least, I shared my package maintainer pledge which inspired my developers-reference patch (see discussion above).

Thanks

Many thanks to all the people who showed their appreciation of my work. The 324.37 EUR that you gave me in February represented 2 days and a half of my time that I have spent working on the above projects.

See you next month for a new summary of my activities.

February 2011 wrap up

February has been again a busy month for me. Here’s a quick summary of what I did:

Multi-Arch work

I have spent many days implementing and refining dpkg’s Multi-Arch support with Guillem Jover (dpkg co-maintainer) and Steve Langasek (beta-tester of my code ;-)). Early testers can try what’s in my latest pu/multiarch/snapshot/* branch in my personal git repository.

A Debian DVD shop

I’m always exploring new options to fund my Debian work (besides direct donations) and this month—with the Debian Squeeze release—I saw an opportunity in selling Debian DVD. Nobody provides DVD with included firmwares and quite a few people would like to avoid the SpaceFun theme. So I built unofficial Debian DVDs that integrate firmware and that install a system with the old theme (MoreBlue Orbit). Click here to learn more about my unofficial DVDs.

On my blog

In my “People behind Debian” series, I interviewed Mike Hommey (Iceweasel maintainer) and Maximiliam Attems (member of the kernel team).

I started a “Debian Cleanup Tip” series and already published 4 installments:

For contributors, I wrote two articles: the first gives a set of (suggested) best practices for sponsoring Debian packages and adapted my article as a patch for the Developers Reference. In the second article, I shared some personal advice for people who are considering participating on Debian mailing list: 7 mistakes to avoid when participating to Debian mailing lists.

Click here to subscribe to my free newsletter and get my monthly analysis on what’s going on in Debian and Ubuntu. Or just follow along via the RSS feed, Identi.ca, Twitter or Facebook.

Review of my Debian related goals for 2010

Last year I shared my “Debian related goals for 2010”. I announced that I would not be able to complete them all and indeed I have not, but I have still done more than what I expected. Let’s have a look.

Translate my Debian book into English and get it published: NOT DONE

Or rather not yet done. It’s still an important project of mine and I will do it this year. When I wrote this last year, I expected to find a publisher that would take care of everything but we failed to find a suitable one so we’re going to do it ourselves.

Cleanup the dpkg perl API and create libdpkg-perl: DONE

libdpkg-perl has been introduced with dpkg 1.15.6.

Create dpkg-buildflags: DONE

dpkg-buildflags has been introduced with dpkg 1.15.7. It’s not widely used yet and it won’t really be until debhelper 7 supports it (see #544844). It would be nice to see progress on this front this year.

Ensure the new source formats continue to gain acceptance: DONE

The adoption rate has been steady, it clearly slowed down since the freeze though. I have implemented quite a few features to satisfy the needs of users, like the possibility to unapply the patches after the build, or the possibility to fail in case of unwanted upstream changes.

Design a generic vcs-buildpackage infrastructure to be integrated in dpkg-dev: NOT DONE

I believe it’s something important on the long term but it never made it to my short-term TODO list and it’s unlikely to change any time soon.

Continue fixing dpkg bugs faster than they are reported: PARTLY DONE

We have dealt with many bugs over the year, but we still have 20 more bugs than at the start of last year (370 vs 350). We’re not doing bad compared to many other Debian teams but we can still benefit from some help. Start here if you’re interested.

Enhance our infrastructure to ease interaction between contributors and to have a better view of how each package is maintained: NOT DONE

I am convinced that we need something to have a clearer idea of the commitments made by each contributor. I don’t put the same amount of care in maintaining smarty-gettext that I do on dpkg. If we had a database of the stuff that we know we don’t do well enough, it’s easier to point new contributors towards those.

Anyway, this project is still unlikely to come to the top of my priorities any time soon.

Work on the developers-reference: NOT DONE

We have switched the Maintainer field to debian-policy@lists.debian.org to have more review of the changes suggested through the bug tracking system but that has not changed much on the global situation.

I still hope to become more active on it sometimes this year. Maybe by trying to make it more fun and creating the text for some of the wishlist bugs as blog articles first.

Rewrite in C the last Perl scripts provided by the dpkg binary package: DONE

Dpkg 1.15.8 was the first version working without Perl, I announced it in July.

Integrate the 3-way merge tool for Debian changelogs in dpkg-dev: DONE

dpkg-mergechangelogs is part of dpkg-dev since 1.15.7.

I enjoy it regularly. Unfortunately it doesn’t work well for cherry-picks. Would be nice to see this fixed, anyone up to the task? :-)

Click here to subscribe to my free newsletter and get my monthly analysis on what’s going on in Debian and Ubuntu. Or just follow along via the RSS feed, Identi.ca, Twitter or Facebook.

Avoid a newbie packager mistake: don’t build your Debian packages with dpkg -b

In the last years, I have seen many people try to use dpkg --build to create Debian packages. Indeed, if you look up dpkg’s and dpkg-deb‘s manual pages, this option seems to be what you have to use:

-b, --build directory [archive|directory]

Creates a debian archive from the filesystem tree stored in directory. directory must have a DEBIAN subdirectory, which contains the control information files such as the control file itself. This directory will not appear in the binary package’s filesystem archive, but instead the files in it will be put in the binary package’s control information area.

And indeed, dpkg-deb is what ultimately creates the .deb files (aka binary packages). But it’s a low-level tool that you should not call yourself. If you want to properly package a new software, you should rather create a Debian source package that will transform upstream source code into policy-compliant binary packages.

Creating a source package also involves preparing a directory tree (but with a “debian” sub-directory), it’s probably more complicated than calling dpkg -b on a manually crafted directory. But the result is much more versatile: the tools used bring value by dynamically analyzing/modifying the files within your package (for example, the dependencies on C libraries that your package needs are automatically inserted).

If this is news to you, you might want to check out the New Maintainers’ Guide and the Debian Policy.

Found it useful? Be sure to not miss other packaging tips (or lessons), click here to subscribe to my free newsletter and get new articles by email.

Save disk space by excluding useless files with dpkg

Most packages contain files that you don’t need: for example translations in languages that you don’t understand, or documentation that you don’t read. Wouldn’t it be nice if you could get rid of them and save a few megabytes? Good news: since dpkg 1.15.8 you can!

dpkg has two options --path-include=glob-pattern and --path-exclude=glob-pattern that control what files are installed or not. The pattern work the same than what you’re used to on the shell (see the glob(7) manual page).

Passing those options on the command-line would be impractical, so the best way to use them is to put them in a file in /etc/dpkg/dpkg.cfg.d/. Beware, the order of the options does matter: when a file matches several options, the last one makes the decision.

A typical usage is to first exclude a directory and then to re-include parts of that directory that you want to keep. For example if you want to drop gettext translations and translated manual pages except French, you could put this in /etc/dpkg/dpkg.cfg.d/excludes:

# Drop locales except French
path-exclude=/usr/share/locale/*
path-include=/usr/share/locale/fr/*
path-include=/usr/share/locale/locale.alias

# Drop translated manual pages except French
path-exclude=/usr/share/man/*
path-include=/usr/share/man/man[1-9]/*
path-include=/usr/share/man/fr*/*

Note that the files will vanish progressively every time that a package is upgraded. If you want to save space immediately, you have to reinstall the packages present in your system. aptitude reinstall or apt-get --reinstall install might help. In theory with aptitude you can even do aptitude reinstall ~i but it tends to not work because one package is not available (either because it was installed manually or because the installed version has been superseded by a newer version on the mirror).

Found it useful? Click here to see how you can encourage me to provide more articles like this one.

Latest features of dpkg-dev: debian packaging tools

I’m attending the mini-Debconf Paris and I just gave a talk about the latest improvement of dpkg-dev—the package providing the basic tools used to build Debian packages. Latest is a bit stretched since it embraces the last 2-3 years of development.

My talk covered the following topics:

  • Support of symbols files by dpkg-shlibdeps, dpkg-gensymbols
  • Support of new source formats by dpkg-source
  • Supplementary options for dpkg-source
  • Cross distribution collaboration with dpkg-vendor
  • Custom compilation flags with dpkg-buildflags
  • Miscellaneous improvements to other tools

The slides are relatively verbose so that you can understand them even if you did not attend the talk. Click here to get the slides.

Related links

This section points to various articles that cover more extensively some of the features mentioned in my talk.

Concerning dpkg-source:

Concerning dpkg-maintscript-helper:

Concerning dpkg-vendor: