------
List: Swedish GNU/LI List
Sender: Ulrich Drepper <drepper@ipd.info.uni-karlsruhe.de>
Subject: gettext-0.9.1
Date: Wed, 09 Aug 1995 16:17:48 +0200
------
Hi folks,
This is not the announcement of yet another new official release, but
I have implemented something I would like to get more comments on. We
discussed this change a lot in a small group, but now I would like to
know what the real world thinks about it.
The changes eliminate a restriction many people feel uncomfortable
with. It shows up in many ways.
1. No way to cleanly have two or more dialects of the same language.
E.g. Norwegian users might want to have both of their official
dialects available, but the official name
no_NO.ISO-8859-1
leaves no room for the distinction.
2. Related to the above. Often the catalog of a dialect (or
subclass) of a language differs in only a few messages. Most
are shared. So it does not make sense to have the translations
repeated in the more specific catalog.
(This is what I call message inheritance.)
3. Again related. When there are no differences in the message catalogs
for, say de_CH.ISO-8859-1 and de_DE.ISO-8859-1, why should I have two
catalogs installed? Both should default to a catalog with the simple
name `de' or `de.ISO-8859-1'.
4. The default language is always English. What I mean here is that
when the catalog for the needed domain is not available no translations
will be made and the program speaks English to you (at least inside
the GNU project this will always be the case).
I was told that many people in Sweden speak German better than English
and so would like to have a specification like this:
If the catalog is available in Swedish, I'll use it. If not,
try to find a German catalog. If even this is not available,
fall back on the default.
5. The cryptic names like de_CH.ISO-8859-1 are not user friendly.
Instead we just want to say something like `german' or `french'
to select the right language.
Of course all these extensions must not conflict with the POSIX standards.
The solution we chose is as follows (and it is implemented in
gettext-0.9.1):
We have a new environment variable `LANGUAGE' which contains a colon
separated list of locale specifications of the form
language[_territory[.codeset]][@modifier]
This syntax comes from the X/Open Portability Guide, volume 3. The
POSIX standards do not say anything about the form of the values for
LC_ALL, LC_MESSAGES, and LANG so we can use this form for these
variables, too.
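A minimal sketch of how such a value could be taken apart, written in
Python purely for illustration (gettext itself is not implemented this
way). The names parse_locale and parse_language_list are hypothetical.
One assumption goes beyond the syntax line above: a codeset is accepted
even without a territory, because names such as de.ISO-8859-1 are used
elsewhere in this message.

```python
import re

# language[_territory[.codeset]][@modifier]
# Assumption: a codeset may also follow the language directly
# (e.g. "de.ISO-8859-1"), which the bracketed syntax alone would not allow.
LOCALE_RE = re.compile(
    r"^(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def parse_locale(spec):
    """Split one locale specification into its parts (None where absent)."""
    match = LOCALE_RE.match(spec)
    if match is None:
        raise ValueError("not a locale specification: %r" % (spec,))
    return match.groupdict()


def parse_language_list(value):
    """Split a colon separated LANGUAGE value; empty entries are skipped."""
    return [parse_locale(entry) for entry in value.split(":") if entry]
```

For example, parse_language_list("sv_SE.ISO-8859-1:de_DE.ISO-8859-1")
gives two entries whose language parts are "sv" and "de".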
To determine the locale, the following ordered list of possibilities
applies:
Prio    GNU extension    POSIX          value type
-------------------------------------------------------
 ^      LANGUAGE                        list
 |
 |      LC_ALL           LC_ALL         single
 |
 |      LC_MESSAGES      LC_MESSAGES    single
 |
 |      LANG             LANG           single
As you see, the full POSIX behaviour is preserved. The new behaviour
is selected only when the environment variable LANGUAGE is
defined.
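The priority order can be sketched as follows. This is an illustrative
Python fragment, not the gettext code. The function name
locale_candidates is hypothetical, and treating an empty variable like
an unset one is an assumption of this sketch.

```python
def locale_candidates(env):
    """Return the ordered locale names to try for message catalogs.

    `env` is any mapping of environment variables, e.g. os.environ.
    Assumption: an empty value counts the same as an unset variable.
    """
    # GNU extension: LANGUAGE has the highest priority and holds a list.
    language = env.get("LANGUAGE")
    if language:
        return [entry for entry in language.split(":") if entry]
    # POSIX behaviour: the first variable that is set gives a single value.
    for name in ("LC_ALL", "LC_MESSAGES", "LANG"):
        value = env.get(name)
        if value:
            return [value]
    return []
```

Passing the environment as a mapping keeps the sketch easy to try out
with made-up values instead of the real process environment.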
XPG3 does not say what the modifier is used for (it only gives a vague
example) so we are free to use it here. In my proposal I use it to
overcome problem #1 above. In Norway you can use
no_NO.ISO-8859-1@bokmal
or
no_NO.ISO-8859-1@nynorsk
(please forgive me if this is not written correctly).
The second and third problem can be overcome using the structure of
the locale names. The name no_NO.ISO-8859-1@bokmal can be exploded
into four parts (somewhat like X.400 :-):
language = no
territory = NO
codeset = ISO-8859-1
modifier = bokmal
If we now look for the message catalog for the locale
no_NO.ISO-8859-1@bokmal
and this is not found, we go on by examining whether any of
no_NO@bokmal
no.ISO-8859-1@bokmal
no@bokmal
no_NO.ISO-8859-1
no_NO
no.ISO-8859-1
no
is found (in this order). If even the last catalog is not found we go
on by examining the next entry in the value of LANGUAGE. Remember:
this is a list of colon separated entries. Now the Swedish user
mentioned above could have LANGUAGE set to the value
sv_SE.ISO-8859-1:de_DE.ISO-8859-1
and he would get what the informal specification above describes.
Please note that this process works on a per-language basis. It does
not seem reasonable to switch the language when a single message is
not contained in a catalog. Once the language is chosen (according to
the first catalog found in the list) it remains in use.
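The search order described above can be sketched like this. It is an
illustration in Python under stated assumptions, not the actual
implementation. The names split_locale, expand_locale and
choose_catalog are hypothetical, and the set `available` merely stands
in for "the catalog files that exist on disk".

```python
def split_locale(spec):
    """Split language[_territory][.codeset][@modifier] into four parts."""
    rest, _, modifier = spec.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    return language, territory or None, codeset or None, modifier or None


def expand_locale(language, territory=None, codeset=None, modifier=None):
    """List catalog names from most to least specific, in the order above."""
    names = []
    # First all variants with the modifier, then the same ones without it.
    for mod in ([modifier, None] if modifier else [None]):
        for terr, cset in ((territory, codeset), (territory, None),
                           (None, codeset), (None, None)):
            name = language
            if terr:
                name += "_" + terr
            if cset:
                name += "." + cset
            if mod:
                name += "@" + mod
            if name not in names:  # parts that are absent create duplicates
                names.append(name)
    return names


def choose_catalog(language_value, available):
    """Return the first catalog name found for a LANGUAGE value, or None."""
    for entry in language_value.split(":"):
        if not entry:
            continue
        for name in expand_locale(*split_locale(entry)):
            if name in available:
                return name
    return None
```

With only a catalog named "de" available,
choose_catalog("sv_SE.ISO-8859-1:de_DE.ISO-8859-1", {"de"}) returns
"de": every Swedish variant is tried first, then the German ones.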
But point #3 above asks for some inheritance among catalogs of the same
language. This is also implemented but, as said, only the less specific
variants of the catalog currently in use are examined. Example:
A message is not translated in the catalog for locale
de_DE.ISO-8859-1
Now instead of immediately returning the untranslated message the
function tries to locate the catalogs for
de_DE
de.ISO-8859-1
de
in this order and examines whether one of these contains the string in
question. Remember the example mentioned above: Most strings have a
common translation (possibly located in de.ISO-8859-1). But some are
specific to the Swiss locale
de_CH.ISO-8859-1
Using this mechanism only the messages in question have to be
contained in the latter catalog.
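Message inheritance can be sketched as a lookup that walks from the
selected catalog to its less specific variants. Again this is only an
illustration in Python. The names catalog_chain and translate are
hypothetical, catalogs are modelled as plain dictionaries, and the
sample translations are made up for the example.

```python
def catalog_chain(locale):
    """Return `locale` followed by its less specific variants, in order."""
    rest, _, modifier = locale.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    chain = []
    for mod in ([modifier, ""] if modifier else [""]):
        for terr, cset in ((territory, codeset), (territory, ""),
                           ("", codeset), ("", "")):
            name = language
            name += "_" + terr if terr else ""
            name += "." + cset if cset else ""
            name += "@" + mod if mod else ""
            if name not in chain:
                chain.append(name)
    return chain


def translate(msgid, locale, catalogs):
    """Look up msgid, falling back to less specific catalogs of one language."""
    for name in catalog_chain(locale):
        catalog = catalogs.get(name)
        if catalog is not None and msgid in catalog:
            return catalog[msgid]
    return msgid  # untranslated: the program "speaks English"


# Made-up sample data: shared strings live in de.ISO-8859-1, the Swiss
# catalog holds only the message that differs.
SAMPLE_CATALOGS = {
    "de.ISO-8859-1": {"file": "Datei", "bicycle": "Fahrrad"},
    "de_CH.ISO-8859-1": {"bicycle": "Velo"},
}
```

Here translate("file", "de_CH.ISO-8859-1", SAMPLE_CATALOGS) finds the
shared entry, while "bicycle" is taken from the Swiss catalog. The
lookup never leaves the chosen language, as described above.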
Now to point #5. This problem was already solved in the X Window
System and so I reused the method. A simple "data base" maps locale
names to locale names. (Commonly this file is found as
/usr/lib/X11/locale/locale.alias
on systems using X). When this file contains a line like
french fr_FR.ISO-8859-1
we can set LANGUAGE to `french'.
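A small sketch of such an alias lookup, in Python for illustration
only. The names parse_aliases and resolve_alias are hypothetical. The
sketch assumes one "alias name" pair per line separated by whitespace,
as in the example line above, and it assumes that `#' starts a comment;
the real file format may differ in details.

```python
def parse_aliases(text):
    """Build an alias table from the contents of an alias file.

    Assumptions: whitespace separates alias and locale name, "#" starts
    a comment, and the first definition of an alias wins.
    """
    aliases = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            aliases.setdefault(fields[0], fields[1])
    return aliases


def resolve_alias(name, aliases):
    """Map an alias to its locale name; unknown names pass through."""
    return aliases.get(name, name)
```

With the line shown above in the table, resolve_alias("french", table)
yields fr_FR.ISO-8859-1, while a name that is not an alias is returned
unchanged and can be handled as an ordinary locale specification.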
****
As said, all this is implemented in gettext-0.9.1. You can find it
on the alpha server of the GNU project (those who know this know
where to look) or else on
i44ftp.info.uni-karlsruhe.de:/pub/gnu
It is not necessary to report warnings for this version because this
is an alpha version, not very much tested or cleaned up. Of course I
would like to hear about compilation errors.
The path for the alias file is for now simply hardcoded in the Makefile.
Please change it for your X installation. I'm also looking forward
to proposals for how to make this portable.
And now to my final wish. Please let me know what you think. I need
some facts when I have to go into the final discussion with the GNU
representatives about these things. Even saying `I like this' or `What
in hell should this be good for' could help. A comment is even better...
If there is some interest in discussing these things we could move
to the gnu newsgroup.
Thanks for reading,
-- Uli
________---------------------------------------------------------------
\ / Ulrich Drepper / Univ. at Karlsruhe, Germany / CS Dept. / IPD
L\inux/ email: drepper@gnu.ai.mit.edu smail: Rubensstr. 5
\ / drepper@ipd.info.uni-karlsruhe.de 76149 Karlsruhe
\/1.3.16 ------------------------------------------ Germany --------