Re: Internationalizing Linux

Horst von Brand (vonbrand@sleipnir.valparaiso.cl)
Sun, 06 Dec 1998 14:03:04 -0400


"John R. Lenton" <lenton@famaf.fis.uncor.edu> said:
> I'll try and summarize (and add a few extra thoughts):

> + Nobody wants to have to understand yet another [dozen]
> language[s] just to help Linux.
>
> This works both ways: an English programmer shouldn't
> have to try and decipher a bug report in Spanish, and
> a programmer from the Maldives shouldn't have to learn
> English to contribute. This has been the case so far,
> but it isn't really the way to go. Or is it? That is to
> say, either we have the status quo and people who already
> know English will use Linux and people who don't won't, or
> we change stuff around and gain a wider target. This is

Wrong. People who know enough to be able to work on Linux understand
English, if only because the relevant literature (manuals, papers, books,
even the source itself; not to mention the newsgroups and mailing lists
where the development takes place) is written in English. As long as this
doesn't change radically, internationalization is pretty irrelevant. Sorry,
but true. Translations are hard to come by: only textbooks appear
translated into Spanish, some three to four years after the original
edition, and that only if the original book is a huge success. Most
translations suck, a few suck very hard. A tiny fraction is up to snuff,
in my experience. Not to mention that many technical terms are translated
differently by different people.

Then think the other way around: I've learned CSc from books and papers
written in English (mostly because there is not much else), I stare at code
written in English, read the manual in English, and have to translate what
I see there into Spanish and back to understand error messages.

> + For Linux to be widely accepted beyond the US and EU,
> there should be a way to translate the
> different messages into something intelligible
> by the targeted end user.
>
> There are three ways of doing this: on the fly using
> LANG= and some trick on klog, a plain translated klog,
> or in the kernel. The first is the most flexible (per
> user) but also the slowest and most difficult. The
> first two don't offer translation of boot messages
> nor panics. The last two don't offer customization per
> user but only per machine.

The first one (on the fly translation in userland) is the only real
solution, IMHO. And, as you say, it is extremely hard to do. With the added
burden that just a handful of people will be able to understand messages
in German or Spanish, so there has to be a _guaranteed_ easy way to
translate back, forever. (Just because I'm running RedHat-7.5 in the distant
future, I should not be unable to translate back messages in Swahili
generated by an ancient kernel on Debian-2.3.)
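
To make the "translate back" requirement concrete, here is a rough sketch of
a userland filter, under loud assumptions: the catalog, its Spanish wording,
and every function name below are invented for illustration, and nothing like
this exists in klogd. It picks a catalog from LANG, translates a log line by
matching it against printf-style templates, and translates back by inverting
the same catalog.

```python
import os
import re

# Hypothetical catalogs: English template -> translated template.
# The message and its Spanish wording are made up for this sketch.
CATALOGS = {
    "es": {
        "Disk %d is acting up, replace %s as soon as possible":
            "El disco %d está fallando, reemplace %s lo antes posible",
    },
}

# The only printf-style conversions this sketch understands.
CONVERSIONS = {"%d": r"(-?\d+)", "%s": r"(\S+)"}


def catalog_for(env=None):
    """Pick a catalog from LANG (e.g. "es_CL" -> "es"); empty if unknown."""
    env = os.environ if env is None else env
    return CATALOGS.get(env.get("LANG", "C")[:2], {})


def template_to_regex(template):
    """Turn a template into a regex that captures its variable parts."""
    parts = re.split(r"(%[ds])", template)
    pattern = "".join(CONVERSIONS.get(p, re.escape(p)) for p in parts)
    return re.compile("^" + pattern + "$")


def fill(template, values):
    """Put captured values back into a template, in the same order."""
    it = iter(values)
    return re.sub(r"%[ds]", lambda m: next(it), template)


def translate(line, catalog):
    """Translate one log line; unknown messages pass through untouched."""
    for src, dst in catalog.items():
        m = template_to_regex(src).match(line)
        if m:
            return fill(dst, m.groups())
    return line


def invert(catalog):
    """Build the reverse catalog needed to translate back."""
    return {dst: src for src, dst in catalog.items()}


es = catalog_for({"LANG": "es_CL"})
english = "Disk 3 is acting up, replace hda as soon as possible"
spanish = translate(english, es)
print(spanish)
print(translate(spanish, invert(es)))
```

Note what the sketch quietly assumes: the variable parts appear in the same
order in both languages, and the exact catalog that produced the Spanish line
is still around. Lose that catalog and the line stays in Spanish, which is
precisely the "forever" problem.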

> If we do things right, said end user will be able to
> explain to his sysadmin what's going on (using a report
> form for the system or however), the sysadmin will then
> take the *code* of the error message and look it up in
> English, and report that error. This could even be done
> by the end user if (s)he is the sysadmin at the same
> time. This is

What do you do with messages that include variable parts? A bare
%%BROKEN_DISK that translates to "Disk %d is acting up, replace %s as soon
as possible" is rather useless without the values that go with it...
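
For a code-based scheme to survive variable parts, the record would have to
carry the arguments along with the code. A minimal sketch of that idea,
assuming an invented tab-separated record format and invented catalogs (none
of this is an existing kernel or klogd interface):

```python
# Hypothetical catalogs keyed by message code; codes and wording are invented.
CATALOGS = {
    "en": {
        "BROKEN_DISK":
            "Disk {disk} is acting up, replace {device} as soon as possible",
    },
    "es": {
        "BROKEN_DISK":
            "El disco {disk} está fallando, reemplace {device} lo antes posible",
    },
}


def encode(code, **args):
    """Serialize a code plus its named arguments into one log record."""
    fields = [code] + ["%s=%s" % (k, v) for k, v in sorted(args.items())]
    return "\t".join(fields)


def decode(record):
    """Split a record back into its code and a dict of string arguments."""
    code, *pairs = record.split("\t")
    return code, dict(p.split("=", 1) for p in pairs)


def render(record, lang):
    """Render a record in the given language, or show it raw if unknown."""
    code, args = decode(record)
    template = CATALOGS.get(lang, {}).get(code)
    if template is None:
        return record  # no catalog entry: show the raw record, don't guess
    return template.format(**args)


record = encode("BROKEN_DISK", disk=3, device="hda")
print(render(record, "en"))
print(render(record, "es"))
```

Named arguments let a translation reorder the variable parts, but every
message now needs a code, an argument list, and a catalog entry per language
that all stay in sync.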

> o There needs to be some standard message identifier if
> this is going to go at all.

Exactly! It's called "Standard phrase in English" right now. Works fine.
(If you look, most people who are concerned about this are from the USA; in
other parts of the world an (even limited) knowledge of English is part of
everybody's secondary schooling, at least.)
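
Using the English phrase itself as the identifier is, incidentally, the
convention GNU gettext follows: the original string is the lookup key, and a
message with no translation comes out in English. A toy sketch of that
lookup, with an invented German catalog and an invented helper name:

```python
# Hypothetical German catalog keyed by the English phrase itself.
CATALOG_DE = {
    "Out of memory": "Kein Speicher mehr frei",
}


def tr(message, catalog):
    """Look up the English phrase; fall back to it if nothing is found."""
    return catalog.get(message, message)


print(tr("Out of memory", CATALOG_DE))
print(tr("Disk is acting up", CATALOG_DE))  # untranslated: stays in English
```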

[...]

> However, there is one big disadvantage:
>
> - Many maintainers have said they will not be bothered
> with this. "It'll be a maintainer's nightmare"
> seems to be the main reaction.

Exactly.

Case closed.

Please, this is discussed to death each year; the conclusion each time is
that the enormous effort isn't going to help any, it will just hinder
development for nothing. Note that all this "standard message identifier"
stuff _has_ been tried, over and over (IBM, DEC's VMS, ...) and it hasn't
worked. Not even for English alone.

> It needn't be. Because the maintainer or programmer or
> developer who actually writes the important stuff that
> gets the job done would just write the messages as a
> unique code (based on...?), and include in a separate
> file the messages in his own language.

And nobody will bother (or dare) to translate them, so I get to see messages
in English (written by Alan Cox), in Swedish (by Linus), some in Italian
(by Andrea), in German (by Bill Hawes), and in Russian (by Alexey)... boy,
will _that_ be fun.

Or worse: I get nicely formatted messages in German, that some kind soul
translated. Perhaps without a clue of what was going on inside the code,
so she perhaps completely misunderstood the intent...

> Be that whatever,
> there will always be a guy/girl who knows that language
> and English and Linux, and who can translate it into
> English, and then the code can be internationalized.
> Or into Spanish, then English. This isn't literature so
> we won't be losing cadenzas on each translation :).

Worse. In literature a lost cadenza is lost beauty, in technical literature
a lost cadenza is lost content, maybe critical content. Besides, just as in
literature a great writer is required to do a first rate translation, a
first rate hacker will be required to translate technical content
faithfully. I for one would prefer her to work on the kernel.

-- 
Horst von Brand                             vonbrand@sleipnir.valparaiso.cl
Casilla 9G, Viña del Mar, Chile                               +56 32 672616