Re: Internationalizing Linux

Riley Williams
Mon, 7 Dec 1998 18:06:17 +0000 (GMT)

Hi Alan.

>> I think having messages appear in the sysadmin's native language
>> has merit. The front line for Linux support is moving away from
>> this list.

>> Once messages permeate back to this list, they can be converted
>> back to English just as the original message was converted
>> originally to the native language.

> Im beginning to agree. Firstly someone says they have the tools to
> do this cleanly and automatically as well as intending to do it.
> Secondly providing they put numbering in you can look messages up.
> But they do need to provide a nice tool for this. If for non
> original messages you generate a constant numbering - and that bit
> is non trivial then there needs to be a way to do

> cat bugreport | toenglish | more

> that wil find all the numbered errors (eg format them $ID$ - text)
> and replace them with their "original" format.

Although English is my native tongue, and the only one I'm anywhere
near fluent in, I'd like to see this as well. As well as the bonus of
users seeing messages in their own language, there is the bonus that
the kernel will probably shrink due to all the messages moving out of
it, if it's done correctly.

Personally, I'd see this as the ideal job for a userland daemon, but
there is one slight problem with that: How to handle the messages
before the said daemon starts. Any suggestions, Alan?

However, if this is to be of any use, it would need to be distributed
with the kernel and supported by all of the subsystems...

> To my mind the constant numbering and also correct handling of
> positional data are the killer issues.

Neither should be a problem if done correctly. Here's how I would see
it implemented in pseudo-C:

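Q> char Buffer[256];
Q> int p1[2], p2[2];             /* p1: to daemon, p2: from daemon */
Q> int knldmn;                   /* 1 if the daemon is usable */
Q> ssize_t n;
Q>
Q> pipe( p1 );
Q> pipe( p2 );
Q> if ( fork() == 0 ) {
Q>     dup2( p1[0], 0 );         /* daemon stdin  <- p1 */
Q>     close( p1[1] );
Q>     close( p2[0] );
Q>     dup2( p2[1], 1 );         /* daemon stdout -> p2 */
Q>     /* NOTE 1 */
Q>     execlp( "knlmsgd", "knlmsgd", (char *) NULL );
Q>     perror( "ERROR: Failed to start kernel message daemon" );
Q>     puts( "NOGO" );
Q>     exit( 1 );
Q> }
Q> close( p1[0] );
Q> close( p2[1] );
Q> /* read() does not NUL-terminate, so do it before comparing. */
Q> n = read( p2[0], Buffer, sizeof Buffer - 1 );
Q> Buffer[n > 0 ? n : 0] = '\0';
Q> if ( !strcmp( "OK\n", Buffer ) ) {
Q>     dup2( p1[1], 4 );
Q>     dup2( p2[0], 5 );
Q>     knldmn = 1;
Q> } else {
Q>     knldmn = 0;
Q>     close( p1[1] );
Q>     close( p2[0] );
Q> }
Q> /* NOTE 2 */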

NOTE 1: At this point, the daemon sits at the far end of a pair of
pipes, with messages from the kernel appearing on stdin
and its responses to the kernel being sent to stdout. The
daemon should internally send any error messages direct to
syslogd for recording via the standard kernel mechanisms.

NOTE 2: At this point, the kernel (or controlling program) would
be able to look at the flag 'knldmn' to see whether it had
a valid kernel message daemon available, and if so, knows
that it can refer to it via standard file numbers 4 and 5
for the daemon's stdin and stdout respectively.
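
As a rough sketch of what NOTE 2 implies, the controlling program could
wrap each request in a small helper that writes one command line to the
daemon and reads one reply line back. The name ask_daemon and the
assumption that every command produces exactly one newline-terminated
reply are illustrative only; the scheme above does not specify either.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send one command line on to_fd, then read one
 * reply line from from_fd into reply (newline stripped, NUL-terminated).
 * ASSUMPTION: the daemon answers every command with a single line.
 * Over-long replies are truncated to fit.  Returns 0 on success, -1 on
 * failure. */
int ask_daemon( int to_fd, int from_fd, const char *command,
                char *reply, size_t size )
{
    char line[512];
    size_t used = 0;
    int len;

    if ( size == 0 )
        return -1;
    len = snprintf( line, sizeof line, "%s\n", command );
    if ( len < 0 || (size_t) len >= sizeof line )
        return -1;                      /* command too long */
    if ( write( to_fd, line, (size_t) len ) != (ssize_t) len )
        return -1;

    /* Read byte by byte so nothing past the newline is consumed. */
    while ( used + 1 < size ) {
        char c;
        ssize_t n = read( from_fd, &c, 1 );

        if ( n <= 0 )
            return -1;                  /* EOF or error before newline */
        if ( c == '\n' )
            break;
        reply[used++] = c;
    }
    reply[used] = '\0';
    return 0;
}
```

With the descriptors set up as described, a request would look like
ask_daemon( 4, 5, "MSG LANG0001", buf, sizeof buf ), made only when
'knldmn' is non-zero.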

The daemon would then listen for commands on its stdin and respond
accordingly. The following commands would be available as a basic
minimum, with others as required:

Tells the daemon to shut down gracefully. SIGHUP should
achieve the same result.

LANG filespec
Specifies the language definition file to use.

MSG code parameters
Requests the message identified by 'code' with positional
parameters replaced by the supplied strings. Each code
would define a standard meaning for each positional
parameter, enabling the language definition files to be
adjusted accordingly.
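
To illustrate the MSG command, here is a minimal sketch of how the
daemon might substitute positional parameters into a message template.
The %1 .. %9 placeholder syntax and the name expand_message are
assumptions made for this example; the proposal does not fix a syntax.

```c
#include <stddef.h>
#include <string.h>

/* Expand a message template into out.
 * ASSUMPTION: parameters are marked as %1 .. %9 in the template.
 * A placeholder with no matching parameter expands to nothing.
 * Returns 0 on success, -1 if out is too small. */
int expand_message( const char *tmpl, int nparams,
                    const char *const params[], char *out, size_t size )
{
    size_t used = 0;

    if ( size == 0 )
        return -1;
    while ( *tmpl != '\0' ) {
        const char *piece = tmpl;       /* default: copy one literal char */
        size_t len = 1;

        if ( tmpl[0] == '%' && tmpl[1] >= '1' && tmpl[1] <= '9' ) {
            int idx = tmpl[1] - '1';

            piece = ( idx < nparams ) ? params[idx] : "";
            len = strlen( piece );
            tmpl += 2;
        } else {
            tmpl += 1;
        }
        if ( used + len >= size )
            return -1;                  /* no room for text plus NUL */
        memcpy( out + used, piece, len );
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

Because each placeholder carries its position, a translation is free to
use the parameters in a different order from the English text, which is
the point of giving each positional parameter a standard meaning.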

As regards the language definition files, I see those as standard text
files, with the relevant code as the first 'word' on each line, and
the associated text following it, separated by whitespace.
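
A minimal sketch of reading one line of such a file, splitting the code
from the text at the first run of whitespace. The function name
parse_lang_line and the buffer-based interface are assumptions for
illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Split one language-file line into its code (first word) and text
 * (the rest, with surrounding whitespace and the newline removed).
 * Returns 0 on success, -1 for a blank line or if a buffer is too
 * small. */
int parse_lang_line( const char *line, char *code, size_t code_size,
                     char *text, size_t text_size )
{
    size_t n = 0, len;

    while ( isspace( (unsigned char) *line ) )
        line++;
    while ( line[n] != '\0' && !isspace( (unsigned char) line[n] ) )
        n++;
    if ( n == 0 || n >= code_size )
        return -1;
    memcpy( code, line, n );
    code[n] = '\0';

    line += n;
    while ( isspace( (unsigned char) *line ) )
        line++;
    len = strlen( line );
    while ( len > 0 && isspace( (unsigned char) line[len - 1] ) )
        len--;                          /* drop trailing newline/blanks */
    if ( len >= text_size )
        return -1;
    memcpy( text, line, len );
    text[len] = '\0';
    return 0;
}
```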

The codes could be just about anything, providing they were retained
and used in every definition file. Probably a useful format would be
to have them begin with a mnemonic specifying the kernel subsystem
they relate to, and follow this by an error number within that
subsystem.
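
One detail worth pinning down is where the mnemonic ends and the number
begins, since a mnemonic may itself end in a digit. The sketch below
assumes the number is always exactly four digits; that width and the
name split_code are assumptions for illustration, not part of the
proposal.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Split a code into its subsystem mnemonic and its number.
 * ASSUMPTION: the number is always the last four characters, all
 * digits, so a mnemonic ending in a digit stays unambiguous.
 * Returns 0 on success, -1 if the code is malformed or the mnemonic
 * does not fit. */
int split_code( const char *code, char *mnemonic, size_t size, int *number )
{
    size_t len = strlen( code );
    size_t i;

    if ( len <= 4 || len - 4 >= size )
        return -1;
    for ( i = len - 4; i < len; i++ )
        if ( !isdigit( (unsigned char) code[i] ) )
            return -1;
    memcpy( mnemonic, code, len - 4 );
    mnemonic[len - 4] = '\0';
    *number = atoi( code + len - 4 );
    return 0;
}
```

Under that assumption a hypothetical code such as EXT20012 would split
into the mnemonic EXT2 and the number 12.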

Given the above specification, it would be possible to write a tool
that took as parameters two language specifications, and as input on
stdin one or more messages in the first language, and produce on
stdout a translation of those messages in the second language, and it
shouldnae care what the languages in question are...
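
A minimal sketch of the core of such a tool, assuming both language
files have already been loaded into tables and that the message has no
positional parameters, so an exact text match is enough to recover its
code. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One line of a language file: a code and its text. */
struct lang_entry {
    const char *code;
    const char *text;
};

/* Find the code whose text matches message exactly, or NULL. */
static const char *find_code( const struct lang_entry *t, size_t n,
                              const char *message )
{
    size_t i;

    for ( i = 0; i < n; i++ )
        if ( strcmp( t[i].text, message ) == 0 )
            return t[i].code;
    return NULL;
}

/* Find the text stored for code, or NULL. */
static const char *find_text( const struct lang_entry *t, size_t n,
                              const char *code )
{
    size_t i;

    for ( i = 0; i < n; i++ )
        if ( strcmp( t[i].code, code ) == 0 )
            return t[i].text;
    return NULL;
}

/* Translate message from one language table to another by way of its
 * code.  ASSUMPTION: no positional parameters, exact match only.
 * Returns NULL if the message or its code is unknown. */
const char *translate( const struct lang_entry *from, size_t nfrom,
                       const struct lang_entry *to, size_t nto,
                       const char *message )
{
    const char *code = find_code( from, nfrom, message );

    return code ? find_text( to, nto, code ) : NULL;
}
```

Nothing in the lookup depends on which languages the two tables hold.
Messages with parameters would need pattern matching against the
template rather than a plain string comparison.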


In fact, I've basically written a bash shell script that implements
the above specification, together with initial language files for
English, French and Spanish, and reserving the mnemonic LANG for items
related directly to this subsystem. The following is the English
language file thereof:

Q> LANG0000 Kernel Language File
Q> LANG0001 English
Q> LANG0002 No
Q> LANG0003 Yes
Q> LANG0004 Error
Q> LANG0005 Warning

These lines are used as follows in this script:

LANG0000 This line MUST be the first line of the file, and
is used to identify this file as a kernel language
file. It should be copied verbatim, NOT translated.

LANG0001 The name of the language as written in that language.

LANG0002 The word used for "No" in the language.

LANG0003 The word used for "Yes" in the language.

LANG0004 The word used for "Error" in the language.

LANG0005 The word used for "Warning" in the language.

Some of these may be unnecessary, but they are likely to occur
regularly. However, they're mainly included as an example.
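
Since LANG0000 identifies the file type, a loader could check it before
reading anything else. The sketch below assumes the header is written
with a single space exactly as in the English file above; the function
name is hypothetical.

```c
#include <string.h>

/* Return 1 if first_line is the untranslated LANG0000 header line,
 * 0 otherwise.  A trailing newline or carriage return is ignored.
 * ASSUMPTION: code and text are separated by exactly one space. */
int is_kernel_lang_file( const char *first_line )
{
    static const char header[] = "LANG0000 Kernel Language File";
    size_t len = strcspn( first_line, "\r\n" );

    return len == sizeof header - 1
        && memcmp( first_line, header, len ) == 0;
}
```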

Other mnemonics I could see being used:

ATALK Appletalk subsystem.
AX25 AX.25 subsystem.
EXT2 ext2 file system.
INIT INITialisation code.
IPC Inter Process Communication subsystem.
IPV4 IP v4 subsystem.
IPV6 IP v6 subsystem.
IPX IPX subsystem.
MM Memory Management subsystem.
NETROM AX.25 NETROM subsystem.
NFS Network File System.
PROC /proc file system.
ROSE AX.25 Rose subsystem.
VFS Virtual File System.

Then there are the architecture-specific subsystems:

ALPHA Alpha specific.
I386 i386 specific.
M68K m68k specific.
MIPS MIPS specific.
PPC Power-PC specific.
SPARC SPARC specific.

Comments, anybody?

Best wishes from Riley.
