If you’re not into packaging and you have asked how you could help Debian, someone probably suggested that you help translate it.
It’s true that translating Debian is essential if we want to make Debian available to everybody in the world. Many people are stuck as soon as they get a message in English, so it’s important to aim for 100% coverage in terms of localization.
Some vocabulary: localization vs internationalization
Internationalization (i18n) is the work that makes it possible to translate messages in a given application.
Localization (l10n) is the work of translating messages of said application. So as a translator, you’ll be doing “localization” but some knowledge of “internationalization” is still useful… because it will define how you’re supposed to provide the translations. We’ll come back to that later.
Join your localization team
Usually the translation work is shared among multiple translators within a localization team. Check out the Debian International page on www.debian.org to find the instructions for translators of each language.
Many teams have a debian-l10n-*@lists.debian.org mailing list used for coordination. Feel free to ask questions on those lists when you start (but make sure that you have read the relevant documentation first).
Each team has its own workflow, so observe for a while to get used to what’s happening before asking your first questions.
What is there to translate?
The translation of most of the software provided by Debian is not handled by Debian. The Debian translation teams “only” handle the translation of:
- the software that is specific to Debian (debian-installer, dpkg, APT, etc.) (*);
- the Debconf prompts in all Debian packages (*);
- the Debian documentation (*);
- the Debian website;
- the Debian wiki;
- the descriptions of packages.
Now, before you contribute your first translation, I have to come back to internationalization to teach you a few things. In the above list, the projects marked with “(*)” use PO files for their translation, and the next sections explain how to work with those files.
Introduction to Gettext
The free software community has mostly standardized on a single internationalization infrastructure known as Gettext. With this tool, you are given a “POT file” which contains all the translatable strings. It looks like this:
    # SOME DESCRIPTIVE TITLE.
    # Copyright (C) YEAR Software in the Public Interest, Inc.
    # This file is distributed under the same license as the PACKAGE package.
    # FIRST AUTHOR, YEAR.
    #
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: dpkg 1.16.1\n"
    "Report-Msgid-Bugs-To: debian-dpkg@lists.debian.org\n"
    "POT-Creation-Date: 2011-09-23 03:37+0200\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME \n"
    "Language-Team: LANGUAGE \n"
    "Language: \n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"

    #: lib/dpkg/ar.c:66
    #, c-format
    msgid "invalid character '%c' in archive '%.250s' member '%.16s' size"
    msgstr ""

    #: lib/dpkg/ar.c:81 lib/dpkg/ar.c:97 lib/dpkg/ar.c:108 lib/dpkg/ar.c:112
    #: lib/dpkg/ar.c:134 utils/update-alternatives.c:1154
    #, c-format
    msgid "unable to write file '%s'"
    msgstr ""

    […]
The lines starting with “#:” are comments that indicate the source files where the (English) string is used. This can be useful if you want to check the source to learn more about how the string is used.
The lines starting with “#,” contain flags that can be important. If the “fuzzy” flag is set, the translated string is not used because it must be updated (or at least verified) since the original string evolved. The “c-format” flag indicates that the string must be a C format string. This has some implications for what’s allowed in the string (in particular when it embeds conversion specifiers for arguments passed to printf-like functions).
Another thing to note is that the translation of the empty string is used to store some meta-information about the translation itself.
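To make the structure described above more concrete, here is a minimal Python sketch that reads a small excerpt of the POT file and extracts the references, the flags and the meta-information stored in the translation of the empty string. The names (SAMPLE, unquote, parse_entry) are made up for this illustration, and the parser deliberately ignores plural forms and other Gettext features; real tools handle many more cases.

```python
import re

# A shortened excerpt of the POT file shown above (raw string, so that
# the "\n" sequences stay as the two characters found in the real file).
SAMPLE = r'''msgid ""
msgstr ""
"Project-Id-Version: dpkg 1.16.1\n"
"Content-Type: text/plain; charset=UTF-8\n"

#: lib/dpkg/ar.c:81 lib/dpkg/ar.c:97
#: lib/dpkg/ar.c:134 utils/update-alternatives.c:1154
#, c-format
msgid "unable to write file '%s'"
msgstr ""
'''

ESCAPES = {"n": "\n", "t": "\t"}


def unquote(token):
    # Strip the surrounding double quotes and decode backslash escapes.
    # Only \n and \t are mapped; any other escaped character is kept as-is.
    body = token.strip()[1:-1]
    return re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)), body)


def parse_entry(block):
    # Simplified on purpose: no msgid_plural, no msgctxt, no obsolete entries.
    entry = {"references": [], "flags": [], "msgid": "", "msgstr": ""}
    current = None
    for line in block.splitlines():
        if line.startswith("#:"):
            entry["references"] += line[2:].split()
        elif line.startswith("#,"):
            entry["flags"] += [f.strip() for f in line[2:].split(",") if f.strip()]
        elif line.startswith("msgid "):
            current = "msgid"
            entry[current] = unquote(line[len("msgid "):])
        elif line.startswith("msgstr "):
            current = "msgstr"
            entry[current] = unquote(line[len("msgstr "):])
        elif line.startswith('"') and current:
            # Continuation line: append to the field being read.
            entry[current] += unquote(line)
    return entry


# Entries are separated by blank lines; the first one is the header.
header, first = [parse_entry(b) for b in SAMPLE.strip().split("\n\n")]

# The header's msgstr is a list of "Name: value" lines.
meta = dict(line.split(": ", 1) for line in header["msgstr"].splitlines())

print(meta["Project-Id-Version"])
print(first["flags"], first["references"])
```

The header entry is recognizable by its empty msgid: its msgstr is not a translation but a small block of “Name: value” lines.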
Contributing a translation as a PO file
When you start a new translation, you copy that POT file to create a “PO file” for your own language (e.g. fr.po for the French language). You replace the template values (identified by the upper-case words in the POT file) and you replace all the empty strings on “msgstr” lines with the translation of the string that appears on the preceding “msgid” line.
The result could be something like this:
    # translation of fr.po to French
    # Messages français pour dpkg (Linux-GNU Debian).
    msgid ""
    msgstr ""
    "Project-Id-Version: fr\n"
    "Report-Msgid-Bugs-To: debian-dpkg@lists.debian.org\n"
    "POT-Creation-Date: 2011-09-23 03:37+0200\n"
    "PO-Revision-Date: 2012-01-16 07:57+0100\n"
    "Last-Translator: Christian Perrier\n"
    "Language-Team: French \n"
    "Language: fr\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Plural-Forms: nplurals=2; plural=n>1;\n"
    "X-Generator: Lokalize 1.2\n"

    #: lib/dpkg/ar.c:66
    #, c-format
    msgid "invalid character '%c' in archive '%.250s' member '%.16s' size"
    msgstr "caractère invalide « %1$c » dans la taille du membre « %3$.16s » de l'archive « %2$.250s »"

    #: lib/dpkg/ar.c:81 lib/dpkg/ar.c:97 lib/dpkg/ar.c:108 lib/dpkg/ar.c:112
    #: lib/dpkg/ar.c:134 utils/update-alternatives.c:1154
    #, c-format
    msgid "unable to write file '%s'"
    msgstr "impossible d'écrire le fichier « %s »"

    […]
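Since both strings above carry the “c-format” flag, the translation has to consume the same arguments as the original. The following Python sketch shows one way to check that; it is only an illustration under simplifying assumptions (the names SPEC, arguments and same_arguments are mine, and “*” widths are not handled), not the check that Gettext itself performs.

```python
import re

# Either a literal "%%", or a conversion made of: an optional "N$"
# argument position, flags, width, precision, length modifier and the
# conversion letter. "*" widths and precisions are not supported here.
SPEC = re.compile(
    r"%%|%(?:(\d+)\$)?"
    r"([-+ #0]*(?:\d+)?(?:\.\d+)?(?:hh|h|ll|l|j|z|t|L)?[diouxXeEfFgGaAcspn])"
)

EN = "invalid character '%c' in archive '%.250s' member '%.16s' size"
FR = ("caractère invalide « %1$c » dans la taille du membre « %3$.16s » "
      "de l'archive « %2$.250s »")


def arguments(text):
    """Map each argument number to the conversion applied to it."""
    found = {}
    next_position = 1
    for match in SPEC.finditer(text):
        if match.group(0) == "%%":
            continue  # a literal percent sign consumes no argument
        position, spec = match.group(1), match.group(2)
        if position is None:
            # No explicit "N$": arguments are taken in order of appearance.
            position = next_position
            next_position += 1
        found[int(position)] = spec
    return found


def same_arguments(msgid, msgstr):
    """True when both strings apply the same conversions to the same arguments."""
    return arguments(msgid) == arguments(msgstr)


print(arguments(EN))
print(arguments(FR))
print(same_arguments(EN, FR))
```

A translation that drops a conversion, such as "impossible d'écrire le fichier" for "unable to write file '%s'", fails this check and would be worth a second look before submission.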
If there’s already a “PO file” for your language, there might still be work to do: there might be strings that have not yet been translated, and there might be “fuzzy” strings which have to be updated (strings which were already translated but whose original string has since been modified).
Several programs can help you edit PO files: poedit, virtaal, lokalize, gtranslator. There are also special extensions for vim (packaged in vim-scripts) and for Emacs.
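To see how much work is left in an existing PO file, you can count the entries in each state. Here is a rough Python sketch of that idea. The sample contains a made-up fuzzy entry, the helper names (field, po_status) are hypothetical, and plural forms are ignored; the msgfmt tool shipped with Gettext has a --statistics option that reports this kind of summary on real files.

```python
# Raw string so that "\n" stays as the two characters found in a PO file.
SAMPLE = r'''msgid ""
msgstr ""
"Language: fr\n"

#: lib/dpkg/ar.c:81
#, c-format
msgid "unable to write file '%s'"
msgstr "impossible d'écrire le fichier « %s »"

#, fuzzy, c-format
msgid "unable to read file '%s'"
msgstr "impossible d'écrire le fichier « %s »"

#: lib/dpkg/ar.c:66
#, c-format
msgid "invalid character '%c' in archive '%.250s' member '%.16s' size"
msgstr ""
'''


def field(lines, keyword):
    """Concatenate the raw quoted chunks that belong to `keyword`."""
    text, active = "", False
    for line in lines:
        if line.startswith(keyword + " "):
            active = True
            line = line[len(keyword) + 1:]
        elif not line.startswith('"'):
            active = False  # another keyword or a comment ends the field
        if active:
            text += line.strip()[1:-1]  # drop the surrounding quotes
    return text


def po_status(po_text):
    """Count translated, fuzzy and untranslated entries (no plural support)."""
    counts = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    for block in po_text.strip().split("\n\n"):
        lines = block.splitlines()
        if field(lines, "msgid") == "":
            continue  # header entry, or a block without an active msgid
        if field(lines, "msgstr") == "":
            counts["untranslated"] += 1
        elif any(l.startswith("#,") and "fuzzy" in l for l in lines):
            counts["fuzzy"] += 1
        else:
            counts["translated"] += 1
    return counts


print(po_status(SAMPLE))
```

In the sample, the fuzzy entry shows the typical situation: the English string changed from “write” to “read”, and the old French translation was kept as a suggestion that a translator must review.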
Submit the translation for inclusion
Once you have a complete PO file, you should submit it for inclusion. Sometimes you will have been granted commit rights to the source code repository so that you can include your translation yourself. Otherwise, you should submit your translation with a bug report tagged “l10n” and someone else will include your work in the next release.
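A bug report in the Debian bug tracking system is an email sent to submit@bugs.debian.org whose body starts with pseudo-headers such as “Package:” and “Tags:”. As an illustration only, the Python sketch below builds such a message with the PO file attached. The function name, the sender address, the subject wording, the “wishlist” severity and the extra “patch” tag are my assumptions; check what your team expects. Nothing is sent, the message is only constructed.

```python
from email.message import EmailMessage


def build_l10n_report(package, language, po_name, po_bytes, sender):
    """Build (but do not send) a bug report carrying a translation."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "submit@bugs.debian.org"
    msg["Subject"] = f"{package}: {language} translation update"
    # The pseudo-headers must come first in the body, followed by a
    # blank line and then the free-form description.
    msg.set_content(
        f"Package: {package}\n"
        "Severity: wishlist\n"
        "Tags: l10n patch\n"
        "\n"
        f"Please find attached the {language} translation ({po_name}).\n"
    )
    # Attach the PO file itself as a generic binary attachment.
    msg.add_attachment(
        po_bytes,
        maintype="application",
        subtype="octet-stream",
        filename=po_name,
    )
    return msg


# Hypothetical usage with a placeholder PO file and address.
report = build_l10n_report(
    package="dpkg",
    language="French",
    po_name="fr.po",
    po_bytes=b'msgid ""\nmsgstr ""\n',
    sender="translator@example.org",
)
print(report["Subject"])
```

In practice most people let a tool such as reportbug prepare this message, but seeing the raw form makes it clear where the “l10n” tag goes.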
Depending on the team, the workflow might require a review before the submission. In that case, you usually have to send a call for review on the coordination mailing list.
Go ahead!
Hopefully those explanations will be enough to get you started. There are many other things to learn¹ but it’s good to learn while practicing…
¹ For example, can you find out why the French translation above changed “%c” into “%1$c”?