Looking back at 16 years of dpkg history with some figures

With Debian’s 19th anniversary approaching, I thought it would be nice to look back at dpkg’s history. After all, it’s one of the key components of any Debian system.

The figures in this article are all based on dpkg’s git repository (as of today, commit 9a06920). While the git repository doesn’t have all the history, we tried to integrate as much as possible when we created it in 2007. We have data going back to April 1996…

In this period between April 1996 and August 2012:

  • 146 persons contributed to dpkg (result of git log --pretty='%aN'|sort -u|wc -l)
  • 6948 commits have been made (result of git log --oneline | wc -l)
  • 3133612 lines have been written/modified (result of git log --stat|perl -ne 'END { print $c } $c += $1 if /(\d+) insertions/;')

Currently the dpkg source tree contains 28303 lines of C, 14956 lines of Perl and 6984 lines of shell (figures generated by David A. Wheeler’s ‘SLOCCount’) and is translated in 40 languages (but very few languages managed to translate everything, with all the manual pages there are 3997 strings to translate).

The top 5 contributors of all times (in number of commits) is the following (result of git log --pretty='%aN'|sort| uniq -c|sort -k1 -n -r|head -n 5):

  1. Guillem Jover with 2663 commits
  2. Raphaël Hertzog with 993 commits
  3. Wichert Akkerman with 682 commits
  4. Christian Perrier with 368 commits
  5. Adam Heath with 342 commits

I would like to point out that those statistics are not entirely representative as people like Ian Jackson (the original author of dpkg’s C reimplementation) or Scott James Remnant were important contributors in parts of the history that were recreated by importing tarballs. Each tarball counts for a single commit but usually bundles much more than one change. Also each contributor has its own habits in terms of crafting a work in multiple commits.

Last but not least, I have generated this 3 minutes gource visualization of dpkg git’s history (I used Planet’s head pictures for dpkg maintainers where I could find it).

Watching this video made me realize that I have been contributing to dpkg for 5 years already. I’m looking forward to the next 5 years 🙂

And what about you? You could be the 147th contributor… see this wiki page to learn more about the team and to start contributing.

My Debian Activities in June 2012

This is my monthly summary of my Debian related activities. If you’re among the people who made a donation to support my work (168.12 €, thanks everybody!), then you can learn how I spent your money. Otherwise it’s just an interesting status update on my various projects.

Dpkg

This month, I resumed my work on dpkg. I concentrated my efforts on some “polishing” of the “3.0 (quilt)” format. With the latest version (1.16.6 — which was uploaded to unstable shortly before the freeze), dpkg-source restores the source tree in a clean state after a failed patch application (#652970), doesn’t overwrite the patch header from the pre-existing automatic patch, updates automatically debian/source/include-binaries during dpkg-source –commit, and supports a new –no-unapply-patches option for those who dislike the auto-unapplication at the end of the process when the patches were not applied at the start.

I wanted to go further and offer a new feature that could insert the automatic patch at the bottom of the quilt series but I have been short on time to complete this feature. I just managed to factorize all the quilt handling in a dedicated Perl module (Dpkg::Source::Quilt) to have cleaner code in the module handling the source format (Dpkg::Source::Package::V3::quilt).

For those who wonder, this feature is meant primarily for the X Strike Force team which maintains packages in Git and are doings lots of upstream cherry-picks (to fix regressions, etc.). But they also use quilt on top of that tree to keep some lasting Debian specific changes. With the 1.0 format, the “automatic diff” is a bit messy but at least it gets smaller automatically when a new upstream release gets out, there’s nothing to clean out. I’d like them to be able to use “3.0 (quilt)” while keeping their workflow. I’m leaning towards allowing “--auto-commit=first:cherry-picks” that would name the automatic patch “cherry-picks” and put it in the first position in the quilt series. (Opinions welcome on that feature, BTW)

Packaging

There’s been quite some packaging in this last month before the freeze:

  • I packaged CppUTest (a test framework for C/C++), and I wrote an article about it.
  • I prepared a stable update of Publican to fix a missing dependency. I also updated the unstable version to include a backport of a fix that some user requested me to include.
  • I updated dh-linktree to improve its documentation (following a discussion that happened on debian-devel) and to deal properly with trailing slashes in its input (#673408).
  • I sponsored dblatex 0.3.4-1 and ledgersmb 1.3.18-1.
  • I updated gnome-shell-timer to a new upstream snapshot that was tagged as compatible with GNOME 3.4 (#6776516).
  • I packaged wordpress 3.4 and spent a whole day triaging the old bugs that accumulated. A few days later I developed a new infrastructure to properly manage plugins/themes/language files. The canonical directory where the user is expected to drop his custom plugins/themes is now in /var/lib/wordpress/wp-content/ and the official plugins/themes are “installed” there with symlinks pointing back to /usr/share/wordpress/wp-content/ where they actually reside.
  • I wanted to commit 2 patches for the developers-reference but then I noticed that some translations were complete and were waiting for an upload. So I cleaned the packaging (switch to dh) and I uploaded version 3.4.8 before committing the patches for #678710 and #678712.

While doing all this packaging work, I found 2 possible improvements that I filed as bug reports:

  • #676606: debcommit should be able to identify alone that a new release is prepared (when the distribution field of the changelog changes from UNRELEASED to something else).
  • #679132: lintian outputs false positives for the tag package-uses-local-diversion when neither –local nor –package is given on the dpkg-divert command line.

Debian France Booth at Solutions Linux

From June 19th to June 21th, I manned the Debian France booth at Solutions Linux together with Carl Chenet, Tanguy Ortolo and other members of the association. We answered lots of questions, sold all t-shirts and umbrellas that Carl imported from Germany and Switzerland (we really need to get our own merchandising stuff produced in France!), got people to join the association. We also presented a printed copy of the Debian Administrator’s Handbook and of the corresponding French book.

You can see Carl, me and Tanguy on this picture (click on it to see a bigger picture, thanks to Sébastien Dubois of Evolix for this one!):

I know lots of people are preparing for Debconf but I decided to not attend this year, the price of the air plane ticket was a bit too hefty for me and it was also in partial conflict with our family vacations. I thought about attending the Libre Software Meeting instead but alas I won’t go there either (but Roland Mas will be there!), I have too much work to complete before my own vacation in 2 weeks.

Thanks

See you next month for a new summary of my activities.

My Debian Activities in March 2012

This is my monthly summary of my Debian related activities. If you’re among the people who made a donation to support my work (227.83 €, thanks everybody!), then you can learn how I spent your money. Otherwise it’s just an interesting status update on my various projects.

Dpkg

Thanks to Guillem, dpkg with multiarch support is now available in Debian sid. The road has been bumpy, and it has again been delayed multiple times even after Guillem announced it on debian-devel-announce. Finally, the upload happened on March 19th.

I did not appreciate his announce because it was not coordinated at all, and had I been involved from the start, we could have drafted it in a way that sounded less scary for people. In the end, I provided a script so that people can verify whether they were affected by one of the potential problems that Guillem pointed out. While real, most of them are rather unlikely for typical multiarch usage.

Bernhard R. Link submitted a patch to add a new –status command to dpkg-buildflags. This command would print all the information required to understand which flags are activated and why. It would typically be called during the build process by debian/rules to keep a trace of the build flags configuration. The goal is to help debugging and also to make it possible to extract that information automatically from build logs. I reviewed his patch and we made several iterations, it’s mostly ready to be merged but there’s one detail where Bernhard and I disagree and I solicited Guillem’s opinion to try to take a decision. Unfortunately neither Guillem nor anyone else chimed in.

On request of Alexander Wirt, I uploaded a new backport of dpkg where I dropped the DEB_HOST_MULTIARCH variable from dpkg-architecture to ensure multi-arch is never accidentally enabled in other backports.

One last thing that I did not mention publicly at all yet, is that I contacted Lennart Poettering to suggest an improvement to the /etc/os-release file that he’s trying to standardize across distributions. It occurred to me that this file could also replace our /etc/dpkg/origins/default file (and not only /etc/debian_version) provided that it could store ancestry information. After some discussions, he documented new official fields for that file (ID_LIKE, HOME_URL, SUPPORT_URL, BUG_REPORT_URL). Next step for me is to improve dpkg-vendor to support this file (as a fallback or as default, I don’t know yet).

Packaging

I packaged quilt 0.60 (we’re now down to 9 Debian-specific patches, from a whopping 26 in version 0.48!) and zim 0.55.

In prevision of the next upstream version of Publican, I asked the Perl team to package a few Perl modules that Publican now requires. Less than two weeks after, all of them were in Debian Unstable. Congrats and many thanks to the Perl team (and Salvatore Bonaccorso in particular, which I happen to know because we were on the same plane during last Debconf!).

On a side note, being the maintainer of nautilus-dropbox became progressively less fun over the last months, in particular because the upstream authors tried to override some of the (IMO correct) packaging decisions that I made and got in touch with Ubuntu community managers to try to have their way. Last but not least, I keep getting duplicates of a bug that is not in my package but in the official package and that Dropbox did not respond to my query.

Book update

The translation is finished and we’re now reviewing the whole book. It takes a bit more time than expected because we’re trying to harmonize the style and because it’s difficult to coordinate the work of several volunteer reviewers.

The book cover is now almost finalized (click on it to view it in higher definitions):

We also made some progress on the interior design for the paperback. Unfortunately, I have nothing to show you yet. But it will be very nice… and made with just a LaTeX stylesheet tailored for use with dblatex.

The liberation fundraising slowed down with only 41 new supporters this month but it made a nice bump anyway thanks to a generous donation of 1000 EUR by Offensive security, the company behind Backtrack Linux. They will soon communicate on this, hopefully it will boost the operation. It would be really nice if we managed to raise the remaining 3000 EUR in the few weeks left until the official release of the book!

The work on my book dominated the month and explains my relative inactivity on other fronts. I worked much more than usual, and my wife keeps telling me that I look tired and that I should go in bed earlier… but I see the end of the tunnel: if everything goes well, the book should be released in a few weeks and I will be able to switch back to a saner lifestyle.

Thanks

See you next month for a new summary of my activities.

My Debian Activities in February 2012

This is my monthly summary of my Debian related activities. If you’re among the people who made a donation to support my work (384.14 €, thanks everybody!), then you can learn how I spent your money. Otherwise it’s just an interesting status update on my various projects.

Dpkg and multiarch

The month started with a decision of the technical committee which allowed me to proceed with an upload of a multiarch dpkg even if Guillem had not yet finished his review (and related changes). Given this decision, Guillem made the experimental upload himself.

I announced the availability of this test version and invited people to test it. This lead to new discussions on debian-devel.

We learned in those discussions that Guillem changed his mind about the possibility of sharing (identical) files between multiple Multi-Arch: same packages, and that he dropped that feature. But if this point of the multiarch design had been reverted, it would mean that we had to update again all library packages which had already been updated for multi-arch. The discussions mostly stalled at this point with a final note of Guillem explaining that there was a tension between convenience and doing the right things every time that we discuss far-reaching changes.

After a few weeks (and a helpful summary from Russ Allbery), Guillem said that he remained unconvinced but that he put back the feature. He also announced that he’s close to having completed the work and that he would push the remaining parts of the multiarch branch to master this week (with the 1.16.2 upload planned next week).

That’s it for the summary. Obviously I participated in the discussions but I didn’t do much besides this… I have a “mandate” to upload a multiarch dpkg to sid but I did not want to make use of it while those discussions remained pretty unconclusive. Also Guillem made it pretty clear that the multiarch implementation was “buggy”, “not right” and “not finished” and that he had reworked code fixing at least some of the issues… since he never shared that work in progress, I also had no way to help even just by reviewing what he’s doing.

We also got a few multiarch bug reports, but I couldn’t care to get them fixed since Guillem clearly held a lock on the codebase having done many private changes… it’s not quite like this that I expect to collaborate on a free software project but life is full of surprises!

I’ll be relieved once this story is over. In the mean time, I have added one new thing on my TODO list since I made a proposal to handle bin-nmu changelogs and it’s something that could also fix #440094.

Misc dpkg stuff

After a discussion with Guillem, we agreed that copyright notices should only appear in the sources and not in manual pages or --version output, both of which are translated and cause useless work to translators when updated. Guillem already had some code to do it for --version strings, and I took care of the changes for the manual pages.

I merged some minor documentation updates, fixed a bug with a missing manpage. Later I discovered that some recent changes lead to the loss of all the translated manual pages. I suggested an improvement to dh_installman to fix this (and even prepared a patch). In the end, Guillem opted for another way of installing translated manual pages.

Triggered by a discussion on debian-devel, I added a new entry to my TODO list: implementing dpkg-maintscript-helper rm_conffile_if_owner to deal with the case where a conffile is taken over by another package which might (or might not) be installed.

Misc packaging

At the start of the month, I packaged quilt 0.51. The number of Debian specific patches is slowly getting down. With version 0.51, we dropped 5 patches and introduced a new one. Later in the month I submitted 4 supplementary patches upstream which have been accepted for version 0.60.

This new version (just released, I will package it soon) is an important milestone since it’s the first version without any C code (Debian had this for a long time but we were carrying an intrusive patch for this). Upstream developer Jean Delvare worked on this and based his work on our patch, but he went further to make it much more efficient.

Besides quilt, I also uploaded dh-linktree 0.2 (minor doc update), sql-ledger 2.8.36 (new upstream version), logidee-tools 1.2.12 (minor fixes) and publican 2.8-2 (to fix release critical bug #660795).

Debian Consultants

The Debian Project Leader is working on federating Debian Companies. As the owner of Freexian SARL, I was highly interested in it since Freexian “contributes to Debian, offers support for Debian and has a strategic interest in Debian”. There’s only one problem, you need to have at least 2 Debian developers on staff but I have no employees (it’s me only). I tried to argue that I have already worked with multiple Debian developers (as contractors) when projects were too big for me alone (or when I did not have enough time). Alas this argument was not accepted.

Instead, and since our fearless leader is never afraid to propose compromises, he suggested me (and MJ Ray who argued something similar than me) to try to bring life to the Debian Consultants list which (in his mind) would be more appropriate for one-man companies like mine. I accepted to help “animate” the list, and on his side, he’s going to promote both the “Debian Companies” and the “Debian Consultants” lists.

In any case, the list has seen some traffic lately and you’re encouraged to join if you’re a freelancer offering services around Debian. The most promising thing is that James Bromberger offered to implement a real database of consultants instead of the current static page.

Book update

We made quite some progress this month. There’s only one chapter left to translate. I thus decided to start with proofreading. I made a call for volunteers and I submitted one (different) chapter to 5 proofreaders.

The liberation campaign made a nice leap forwards thanks to good coverage on barrapunto.com. We have reached 80% while we were only at 72% at the start of the month (thanks to the 113 new supporters!). There’s thus less than 5000 EUR to raise before the book gets published under a free license.

Looking at the progression in the past months, this is unlikely to be completed on time for the release of the book in April. It would be nice though… so please share the news around you.

Speaking of the book’s release, I’m slowly preparing it. Translating docbook files is not enough, I must be able to generate HTML, ePub and PDF versions of the book. I’m using Publican for most formats, but for the PDF version Publican is moving away of fop and the replacement (webkit-based) is far from being satisfactory to generate a book ready for print. So I plan to use dblatex and get Publican to support dblatex as a backend.

I have hired Benoît Guillon, the upstream author of dblatex, to fix some annoying bugs and to improve it to suit my needs for the book (some results are already in the upstream CVS repository). I’m also working with a professional book designer to get a nice design.

I have also started to look for a Python Django developer to build the website that I will use to commercialize the book. The website will have a larger goal than just this though (“helping to fund free software developers”) but in free software it’s always good to start with your own case. 🙂

Hopefully everything will be ready in April. I’m working hard to meet that deadline (you might have noticed that my blog has been relatively quiet in the last month…).

Thanks

See you next month for a new summary of my activities.