Internationalization and Pluralization with T

The object T is the language translator. It constitutes a single global instance of the web2py class gluon.language.translator. All string constants (and only string constants) should be marked by T, for example:

  a = T("hello world")

Strings that are marked with T are identified by web2py as needing language translation and they will be translated when the code (in the model, controller, or view) is executed. If the string to be translated is not a constant but a variable, it will be added to the translation file at runtime (except on GAE) to be translated later.

Strings passed to T can also contain interpolated variables, and several equivalent syntaxes are supported:

  a = T("hello %s", ('Tim', ))
  a = T("hello %(name)s", dict(name='Tim'))
  a = T("hello %s") % ('Tim', )
  a = T("hello %(name)s") % dict(name='Tim')

The latter syntax is recommended because it makes translation easier. The first string is translated according to the requested language file and the name variable is replaced independently of the language.

You can concatenate translated strings and normal strings:

  T("blah ") + name + T(" blah")

The following code is also allowed and often preferable:

  T("blah %(name)s blah", dict(name='Tim'))

or the alternative syntax

  T("blah %(name)s blah") % dict(name='Tim')

In both cases the translation occurs before the variable name is substituted in the “%(name)s” slot. The following alternative should NOT BE USED:

  T("blah %(name)s blah" % dict(name='Tim'))

because translation would occur after substitution.
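The difference in ordering can be illustrated without web2py. The following is a minimal sketch that assumes a toy lookup table in place of a real language file; toy_translate and the table contents are hypothetical and only mimic the idea of an exact-key lookup:

```python
# Toy translation table standing in for a language file (hypothetical content).
TRANSLATIONS = {"blah %(name)s blah": "bla %(name)s bla"}


def toy_translate(message):
    """Return the translation if the exact key exists, else the original."""
    return TRANSLATIONS.get(message, message)


# Correct order: look up the constant template, then substitute the variable.
good = toy_translate("blah %(name)s blah") % dict(name="Tim")

# Wrong order: substitution happens first, so the lookup key becomes
# "blah Tim blah", which is not in the table and stays untranslated.
bad = toy_translate("blah %(name)s blah" % dict(name="Tim"))

print(good)  # bla Tim bla
print(bad)   # blah Tim blah
```

With substitution done first, every distinct name would need its own entry in the translation file, which is why the constant template should be the lookup key.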

Determining the language

The requested language is determined by the “Accept-Language” field in the HTTP header, but this selection can be overridden programmatically by requesting a specific file, for example:

  T.force('it-it')

which reads the “languages/it-it.py” language file. Language files can be created and edited via the administrative interface.

You can also force a per-string language:

  T("Hello World", language="it-it")

If multiple languages are requested, for example “it-it, fr-fr”, web2py tries to locate the “it-it.py” and “fr-fr.py” translation files. If none of the requested files is present, it tries to fall back on “it.py” and “fr.py”. If these files are not present, it defaults to “default.py”. If this is not present either, it defaults to no translation. The more general rule is that web2py tries “xx-xy-yy.py”, “xx-xy.py”, “xx.py”, and “default.py” for each of the “xx-xy-yy” accepted languages, trying to find the closest match to the visitor’s preferences.
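The fallback order described above can be sketched in plain Python. This is only an illustration of the rule as stated in this section, not web2py's actual implementation; the function names fallback_chain and pick_language_file are hypothetical, and quality values such as “;q=0.8” are simply dropped:

```python
def fallback_chain(tag):
    """'xx-xy-yy' -> ['xx-xy-yy.py', 'xx-xy.py', 'xx.py'] (most specific first)."""
    parts = tag.strip().lower().split("-")
    return ["-".join(parts[:i]) + ".py" for i in range(len(parts), 0, -1)]


def pick_language_file(accept_language, available):
    """Pick a file name from `available` for an Accept-Language style string.

    Returns None when nothing matches, meaning "no translation".
    """
    # Drop quality values and empty entries.
    tags = [t.split(";")[0] for t in accept_language.split(",") if t.strip()]
    chains = [fallback_chain(t) for t in tags]
    depth = max((len(chain) for chain in chains), default=0)
    # Try the most specific file of every requested language first,
    # then progressively shorter names.
    for level in range(depth):
        for chain in chains:
            if level < len(chain) and chain[level] in available:
                return chain[level]
    return "default.py" if "default.py" in available else None


print(pick_language_file("it-it, fr-fr", {"fr-fr.py", "it.py"}))
print(pick_language_file("it-it, fr-fr", {"it.py", "default.py"}))
print(pick_language_file("it-it", {"default.py"}))
```

In the first call “fr-fr.py” wins over “it.py” because exact matches are tried before shortened names, matching the “it-it, fr-fr” example in the text.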

You can turn off translations completely via

  T.force(None)

Normally, string translation is evaluated lazily when the view is rendered; hence, the translator force method should not be called inside a view.

It is possible to disable lazy evaluation via

  T.lazy = False

In this way, strings are translated immediately by the T operator based on the currently accepted or forced language.

It is also possible to disable lazy evaluation for individual strings:

  T("Hello World", lazy=False)
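To see why the timing matters, here is a minimal sketch of lazy versus immediate translation. ToyTranslator, LazyMessage, and the Italian string are hypothetical stand-ins written for this illustration; they imitate the behavior described above and are not web2py classes:

```python
# Toy language table (hypothetical content).
TRANSLATIONS = {"it-it": {"Hello World": "Salve Mondo"}}


class LazyMessage:
    """Defers the lookup until the message is converted to a string."""

    def __init__(self, translator, message):
        self.translator = translator
        self.message = message

    def __str__(self):
        return self.translator.translate(self.message)


class ToyTranslator:
    def __init__(self):
        self.language = None  # None means "no translation"
        self.lazy = True

    def force(self, language):
        self.language = language

    def translate(self, message):
        return TRANSLATIONS.get(self.language, {}).get(message, message)

    def __call__(self, message, lazy=None):
        use_lazy = self.lazy if lazy is None else lazy
        return LazyMessage(self, message) if use_lazy else self.translate(message)


T = ToyTranslator()
lazy_msg = T("Hello World")               # lookup deferred
eager_msg = T("Hello World", lazy=False)  # lookup happens now
T.force("it-it")                          # language chosen afterwards

print(str(lazy_msg))  # Salve Mondo
print(eager_msg)      # Hello World
```

The lazy message picks up the language forced after it was created, while the eager one was fixed at call time. The same mechanism explains why forcing a language while rendering is already under way can affect some strings and not others.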

A common issue is the following. The original application is in English. Suppose that there is a translation file (for example Italian, “it-it.py”) and the HTTP client declares that it accepts both English (en) and Italian (it-it) in that order. The following unwanted situation occurs: web2py does not know the default is written in English (en). Therefore, it prefers translating everything into Italian (it-it) because it only found the Italian translation file. If it had not found the “it-it.py” file, it would have used the default language strings (English).

There are two solutions to this problem: create a translation file for English, which would be redundant and unnecessary, or, better, tell web2py which languages should use the default language strings (the strings coded into the application). This can be done with:

  T.set_current_languages('en', 'en-en')

It stores in T.current_languages a list of languages that do not require translation and forces a reload of the language files.
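The effect of declaring current languages can be sketched as follows. This is a simplified model of the scenario above, with exact matches only; choose_file is a hypothetical helper and does not reflect how web2py resolves languages internally:

```python
def choose_file(accepted, available, current_languages=()):
    """Return the language file to use, or None for "use the source strings".

    `accepted` is the client's language list in order of preference.
    """
    for tag in accepted:
        if tag in current_languages:
            # The application is already written in this language.
            return None
        if tag + ".py" in available:
            return tag + ".py"
    return None


accepted = ["en", "it-it"]
available = {"it-it.py"}

# Without the declaration, English is skipped and Italian is chosen.
print(choose_file(accepted, available))                 # it-it.py
# With it, the English source strings are used as they are.
print(choose_file(accepted, available, ("en", "en-en")))  # None
```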

Notice that “it” and “it-it” are different languages from the point of view of web2py. To support both of them, one would need two translation files, always lower case. The same is true for all other languages.

The currently accepted language is stored in

  T.accepted_language

Translating variables

T translates not only string constants but also values stored in variables:

  >>> a = "test"
  >>> print(T(a))

In this case the word “test” is translated. If no translation is found and the filesystem is writable, web2py adds the word to the list of strings to be translated in the language file.

Notice that this can result in a lot of file I/O, and you may want to disable it:

  T.is_writable = False

This prevents T from dynamically updating language files.

Comments and multiple translations

It is possible that the same string appears in different contexts in the application and needs different translations based on context. In order to do this, one can add comments to the original string. The comments will not be rendered but will be used by web2py to determine the most appropriate translation. For example:

  T("hello world ## first occurrence")
  T("hello world ## second occurrence")

The text following the ##, including the ## itself, is a comment.
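The idea can be sketched with a toy lookup in which the full string, comment included, is the key, and the comment is dropped when no translation exists. toy_translate and the Italian strings are hypothetical, and the exact whitespace handling in web2py may differ from this sketch:

```python
# Toy language file: the comment is part of the key, so the same English
# text can map to two different translations (hypothetical content).
TRANSLATIONS = {
    "hello world ## first occurrence": "ciao mondo",
    "hello world ## second occurrence": "salve mondo",
}


def toy_translate(message):
    if message in TRANSLATIONS:
        return TRANSLATIONS[message]
    # No translation: strip the comment so it is never rendered.
    return message.split("##")[0].rstrip()


print(toy_translate("hello world ## first occurrence"))   # ciao mondo
print(toy_translate("hello world ## second occurrence"))  # salve mondo
print(toy_translate("hello world ## third occurrence"))   # hello world
```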

Pluralization engine

Since version 2.0, web2py includes a powerful pluralization system (PS). This means that when text marked for translation depends on a numeric variable, it may be translated differently based on the numeric value. For example in English we may render:

  x book(s)

with

  a book (x==1)
  5 books (x==5)

English has one singular form and one plural form. The plural form is constructed by adding a “-s” or “-es” or using an exceptional form. web2py provides a way to define pluralization rules for each language, as well as exceptions to the default rules. In fact web2py already knows pluralization rules for many languages. It knows, for example, that Slovenian has one singular form and 3 plural forms (for x==2, x==3 or x==4 and x>4). These rules are encoded in “gluon/contrib/plural_rules/*.py” files and new files can be created. Explicit pluralizations for words are created by editing pluralization files using the administrative interface.

By default the PS is not activated. It is triggered by the symbols argument of the T function. For example:

  T("You have %s %%{book}", symbols=10)

or, less verbosely:

  T("You have %s %%{book}", 10)

Now the PS is activated for the word “book” and for the number 10. The result in English will be: “You have 10 books”. Notice that “book” has been pluralized into “books”.

The PS consists of 3 parts:

  • placeholders %%{} that mark words in the T input
  • rules that decide which word form to use (“gluon/contrib/plural_rules/*.py”)
  • a dictionary of plural word forms (“applications/app/languages/plural-*.py”)
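The three parts can be imitated in a few lines of plain Python. This is a toy model for English only, handling just the bare %%{word} form; render, get_plural_id, construct_plural_form, and PLURAL_DICT are names chosen for this sketch and should not be read as web2py's implementation:

```python
import re

# Part 1: placeholder syntax (only the bare %%{word} form is handled here).
PLACEHOLDER = re.compile(r"%%\{(\w+)\}")

# Part 3: dictionary of exceptional plural forms (hypothetical content).
PLURAL_DICT = {"this": "these", "is": "are"}


# Part 2: rules deciding which form to use and how to build a default plural.
def get_plural_id(n):
    """English: form 0 (singular) for 1, form 1 (plural) otherwise."""
    return 0 if n == 1 else 1


def construct_plural_form(word):
    """Default English plural: add -es after s, x, o, sh, ch; otherwise -s."""
    return word + ("es" if word.endswith(("s", "x", "o", "sh", "ch")) else "s")


def render(template, n):
    def replace(match):
        word = match.group(1)
        if get_plural_id(n) == 0:
            return word
        return PLURAL_DICT.get(word) or construct_plural_form(word)

    # Resolve the placeholders first, then interpolate the number.
    return PLACEHOLDER.sub(replace, template) % (n,)


print(render("You have %s %%{book}", 10))        # You have 10 books
print(render("You have %s %%{book}", 1))         # You have 1 book
print(render("%%{this} %%{is} %s %%{box}", 3))   # these are 3 boxes
```

Keeping the rule and the dictionary separate mirrors the split between the per-language rule files and the per-application plural files: the rule supplies a default form, and the dictionary only needs to list exceptions.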

The value of symbols can be a single variable, a list/tuple of variables, or a dictionary.

A placeholder %%{} may consist of 3 parts:

  %%{[<modifier>]<word>[<parameter>]}

where:

  <modifier> ::= ! | !! | !!!
  <word> ::= any word or phrase in singular in lower case (!)
  <parameter> ::= [index] | (key) | (number)

Here are some possibilities:

  • %%{word} is equivalent to %%{word[0]} (if no modifiers are used).
  • %%{word[index]} is used when symbols is a tuple. symbols[index] gives us a number used to make a decision on which word form to choose. For example in:

    T("blabla %s %s %%{word[1]}", (var1, var2))

    the PS is applied to “word” using the value of var2.

  • %%{word(key)} is used to get the number parameter from symbols[key] when symbols is a dictionary.
  • %%{word(number)} lets you pass a number directly (e.g. %%{word(%i)}).

Other placeholders that also let you pass a number directly are:

  • %%{?word?number} returns “word” if number == 1 and the number otherwise.
  • %%{?number} or %%{??number} returns the number if number != 1 and nothing otherwise.
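The two “?” placeholders amount to simple conditionals on the number. A minimal sketch of the rules as stated above, with hypothetical function names and no parsing of the placeholder syntax itself:

```python
def word_or_number(word, number):
    """Mimics %%{?word?number}: the word for 1, the number otherwise."""
    return word if number == 1 else str(number)


def number_unless_one(number):
    """Mimics %%{?number} and %%{??number}: nothing for 1, else the number."""
    return "" if number == 1 else str(number)


print(word_or_number("a", 1))   # a
print(word_or_number("a", 5))   # 5
print(repr(number_unless_one(1)))  # ''
print(number_unless_one(5))     # 5
```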

You can use several %%{} placeholders with one index:

  T("%%{this} %%{is} %s %%{book}", var)

that generates:

  var  output
  ------------------
  1    this is 1 book
  2    these are 2 books
  3    these are 3 books
  ...

Similarly you can pass a dictionary to symbols:

  T("blabla %(var1)s %(wordcnt)s %%{word(wordcnt)}",
    dict(var1="tututu", wordcnt=20))

which produces:

  blabla tututu 20 words

You can replace “1” with any word you wish by using the placeholder %%{?word?number}. For example:

  T("%%{this} %%{is} %%{?a?%s} %%{book}", var)

produces:

  var  output
  ------------------
  1    this is a book
  2    these are 2 books
  3    these are 3 books
  ...

Inside %%{...} you can also use the following modifiers:

  • ! to capitalize the text (equivalent to string.capitalize)
  • !! to capitalize every word (equivalent to string.title)
  • !!! to capitalize every character (equivalent to string.upper)
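The three modifiers map onto standard Python string methods. A small sketch, where apply_modifier is a hypothetical helper that receives the modifier and the already chosen word form:

```python
def apply_modifier(modifier, text):
    """Apply the !, !!, or !!! modifier to an already pluralized text."""
    if modifier == "!":
        return text.capitalize()  # first character upper case, rest lower case
    if modifier == "!!":
        return text.title()       # every word capitalized
    if modifier == "!!!":
        return text.upper()       # every character upper case
    return text                   # no modifier


for modifier in ("", "!", "!!", "!!!"):
    print(repr(modifier), apply_modifier(modifier, "hello world"))
```

Note that capitalize also lowercases the remaining characters, so mixed-case input such as “hello World” becomes “Hello world” under the “!” modifier in this sketch.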

Notice you can use \ to escape ! and ?.

Translations, pluralization, and MARKMIN

You can also use the powerful MARKMIN syntax inside translation strings by replacing

  T("hello world")

with

  T.M("hello world")

Now the string accepts MARKMIN markup as described in Chapter 5.