I18n of Bourne Shell Scripts Using „gettext.sh“

This article shows how to make a Bourne Shell script translatable with the „gettext“ toolchain, and how the script then presents its natural language strings in the user’s language.

Prerequisites

To follow the examples in this article, the following software (in terms of Debian packages) should be installed:

  • gettext
  • dialog

A Sample Script

Consider the following Bourne Shell script, myscript, which asks the user a question using the dialog executable:

#!/bin/sh

foo="bar"

if ! dialog --yesno "WARNING:

foo=$foo

Do you want to continue?" 15 60 ; then
    echo 'aborted (user request)' >&2
    exit 1
fi

As you can see, the dialog command is invoked with a complex message that spans multiple lines and contains a variable substitution of $foo.

If dialog returns a false status, a message is written to the standard error stream and the script exits.

Involving gettext

Sourcing the system-wide helper script gettext.sh (which should be in $PATH and is usually located in /usr/bin) provides some useful shell functions that are described later on.

The $TEXTDOMAIN variable should be set to tell „gettext“ which translation file to use. „Gettext“ will then read translations from a compiled message catalog (an .mo file) at

/usr/share/locale/$user_locale/LC_MESSAGES/$TEXTDOMAIN.mo

where $user_locale could be en, en_US (which falls back to en if a directory en_US is not found), de_DE (which falls back to de), de_DE.UTF-8 (which falls back to de_DE, and, if that is not found, to de) and so forth.

If, after searching all candidate locale directories, no such .mo file is found, the original untranslated string is used.
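The fallback order described above can be sketched in plain shell. The function catalog_candidates is a hypothetical helper invented for this illustration; it only mimics the simplified rule given in the text and is not how „gettext“ itself is implemented (the real lookup also deals with modifiers such as @euro and with codeset aliases).

```sh
#!/bin/sh
# catalog_candidates LOCALE DOMAIN [BASEDIR]
# Hypothetical helper: prints the .mo paths in the simplified fallback
# order described in the text. The function body runs in a subshell so
# that its variables do not leak into the caller.
catalog_candidates() (
    locale=$1
    domain=$2
    basedir=${3:-/usr/share/locale}

    while :; do
        printf '%s/%s/LC_MESSAGES/%s.mo\n' "$basedir" "$locale" "$domain"
        case $locale in
            *.*) locale=${locale%.*} ;;   # de_DE.UTF-8 -> de_DE
            *_*) locale=${locale%_*} ;;   # de_DE -> de
            *)   break ;;                 # nothing left to strip
        esac
    done
)

catalog_candidates de_DE.UTF-8 myscript
```

For de_DE.UTF-8 this prints the three candidate paths for de_DE.UTF-8, de_DE and de, in that order.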

So I rewrite the beginning of the script to:

#!/bin/sh

export TEXTDOMAIN=myscript

. gettext.sh

foo="bar"
# ...etc

Preparing The Script For Translation

Overview

The essential activity when preparing a script for translation is to indicate to „gettext“ what strings I want to have translated.

I do this by replacing single- and double-quoted strings that contain natural language expressions with an invocation of either the gettext executable or the eval_gettext shell function (provided by gettext.sh).

Note: gettext.sh provides more functions than are demonstrated in this article. Please see [GNU] for a complete reference.

Simple Literal Strings

A simple echo command can be substituted by gettext (but note that gettext does not append a linebreak):

echo 'aborted (user request)'

becomes

gettext 'aborted (user request)' ; echo

I prefer the following version (which renders the same output):

echo $(gettext 'aborted (user request)')

This way I do not have to maintain separate echo commands (which are easily forgotten), and I can redirect the entire output to the standard error stream in one go:

echo $(gettext 'aborted (user request)') >&2

Note: I originally planned to have gettext aliased to something more compact, such as

alias _=gettext # wrong!

but that does more harm than good, because the command xgettext recognizes translatable strings by the fact that they are preceded by gettext (or eval_gettext). With an alias, xgettext would not extract any strings.

Quoting of Strings

In the case of the dialog command, which requires the text parameter of the --yesno option to be quoted (so that the message text appears to dialog as one single argument), I write that as follows:

"WARNING blahblah"

becomes

"$(gettext "WARNING blahblah")"

Note the double quotes around the $(...) command substitution. They are required so that dialog sees the result as a single argument rather than as multiple words (which would cause a dialog usage error).
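The effect of the outer quotes can be observed without dialog or „gettext“ installed. In the following sketch, count_args and message are hypothetical stand-ins: count_args plays the role of dialog and simply reports how many arguments it received, and message plays the role of gettext by printing a fixed string.

```sh
#!/bin/sh
# Hypothetical stand-ins, not part of gettext or dialog:
# count_args reports its number of arguments, message prints a fixed text.
count_args() { echo "$#"; }
message() { printf '%s' 'WARNING blahblah'; }

count_args "$(message)"   # quoted: the text arrives as one argument
count_args $(message)     # unquoted: the shell splits it at the space
```

The first call prints 1, the second prints 2, which is exactly the difference between a working dialog invocation and a usage error.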

Variable Substitution

There is one more thing to take care of: variable substitution, in this example the occurrence of $foo in the dialog message string. „Gettext“ cannot handle this directly, because message IDs must be fixed strings. Inside double quotes the shell expands $foo before gettext is called, so the string that is looked up would change with the value of the variable, and xgettext (see below) would not extract such a string at all.

The shell function eval_gettext solves this. If the dollar sign $ is preceded by a backslash character \, the shell passes the literal text $foo on to eval_gettext, which first looks up the translation and only then replaces the occurrence of $foo with the current value of the variable (this is why it is called eval_gettext).

In short, I have to use eval_gettext and replace $foo with \$foo in my message string:

echo "foo=$foo"

becomes

echo $(eval_gettext "foo=\$foo")

Another option is to use single quotes, which also prevent the shell from substituting $foo:

echo $(eval_gettext 'foo=$foo')
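To see what actually reaches the translation function in each quoting style, the following sketch uses show_arg, a hypothetical stand-in that just prints its first argument. It does not call „gettext“ at all; it only makes visible which string would be used as the message ID.

```sh
#!/bin/sh
# show_arg is a hypothetical stand-in that prints the single argument it
# receives, i.e. what gettext or eval_gettext would see as the message ID.
show_arg() { printf '%s\n' "$1"; }

foo="bar"

show_arg "foo=$foo"     # the shell expands first: the ID would be foo=bar
show_arg "foo=\$foo"    # the backslash keeps the reference: foo=$foo
show_arg 'foo=$foo'     # single quotes have the same effect: foo=$foo
```

Only the last two forms hand over a fixed string, which is what both xgettext and the run-time lookup need.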

Complete Example

Finally, in context:

#!/bin/sh

export TEXTDOMAIN=myscript

. gettext.sh

foo="bar"

if ! dialog --yesno "$(eval_gettext 'WARNING:

foo=$foo

Do you want to continue?')" 15 60 ; then
    echo $(gettext 'aborted (user request)') >&2
    exit 1
fi

This shell script is now prepared for translation.

Performing The Translation

Step #1: Use the command xgettext to extract all translatable strings from the script and store them in a .pot file. From the command line, in the directory where myscript is located, execute:

~$ xgettext -L Shell -o myscript.pot ./myscript

This results in a myscript.pot file.

Checking the result of Step #1: Opening the file in a text editor, after a header of comments, one can see that xgettext has successfully extracted the natural language strings from the shell script, including the multiline message text for the dialog option --yesno:

...

#: myscript:8
#, sh-format
msgid ""
"WARNING:\n"
"\n"
"foo=$foo\n"
"\n"
"Do you want to continue?"
msgstr ""

#: myscript:13
#, sh-format
msgid "aborted (user request)"
msgstr ""

Step #2: I can now proceed to use a copy of this .pot file for a German translation:

~$ mkdir -p locale/de/LC_MESSAGES
~$ cp myscript.pot locale/de/LC_MESSAGES/myscript.po

Step #3: Opening the file locale/de/LC_MESSAGES/myscript.po in a text editor, I fill in the required header fields. Most importantly, I replace the CHARSET placeholder with the character encoding I actually want to use for my German special characters:

"Content-Type: text/plain; charset=UTF-8\n"

Now I actually translate, for example, the dialog message string as follows:

#: myscript:8
#, sh-format
msgid ""
"WARNING:\n"
"\n"
"foo=$foo\n"
"\n"
"Do you want to continue?"
msgstr ""
"WARNUNG:\n"
"\n"
"foo=$foo\n"
"\n"
"Trotzdem fortsetzen?"

Step #4: When I am finished with the translations, I compile the .po file into an .mo file:

~$ msgfmt -o locale/de/LC_MESSAGES/myscript.mo locale/de/LC_MESSAGES/myscript.po

Testing A Translation

Now that the translations are compiled into the machine format, I can test the result.

„Gettext“ recognizes an environment variable $TEXTDOMAINDIR that specifies a directory other than /usr/share/locale for looking up .mo files. This allows me to test the translation without having to copy the message catalog into a system directory:

~$ export TEXTDOMAINDIR=$PWD/locale
~$ export LC_ALL=de_DE.UTF-8
~$ sh myscript

The interaction is now translated.
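The compile and lookup steps can also be tried in isolation, without touching the project directory. The following is a minimal sketch under several assumptions: it works in a scratch directory created with mktemp, the German msgstr is an illustrative translation made up for this example, and the compiled variable exists only to record whether msgfmt succeeded. The translated text is shown only if the de_DE.UTF-8 locale is installed on the system; otherwise gettext prints the original string.

```sh
#!/bin/sh
# Smoke test in a scratch directory. All paths and the German msgstr are
# illustrative assumptions, not part of the article's project.
workdir=$(mktemp -d)
catdir=$workdir/locale/de/LC_MESSAGES
mkdir -p "$catdir"

# A minimal catalog: a header declaring the charset, plus one message.
cat > "$catdir/myscript.po" <<'EOF'
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "aborted (user request)"
msgstr "abgebrochen (Benutzerabbruch)"
EOF

compiled=no
if command -v msgfmt >/dev/null 2>&1 && command -v gettext >/dev/null 2>&1; then
    # Compile the .po file into the binary .mo format.
    msgfmt -o "$catdir/myscript.mo" "$catdir/myscript.po" && compiled=yes
    # Look up one message directly, with the environment set for this call only.
    TEXTDOMAINDIR=$workdir/locale LC_ALL=de_DE.UTF-8 \
        gettext -d myscript 'aborted (user request)'
    echo    # gettext does not append a line break
else
    echo 'gettext tools not found; skipping' >&2
fi

rm -rf "$workdir"
```

Setting the variables on the command line of a single gettext call, instead of exporting them, keeps the rest of the shell session in its usual locale.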

Links