Re: Multibyte character support

Martin von Loewis (martin@mira.isdn.cs.tu-berlin.de)
Sat, 7 Feb 1998 01:34:32 +0100


> I seem to remember that the linux kernel point of view
> is that the kernel doesn't need to know about multibyte
> character sets (just staying out of the way and not disturbing
> the data stream is sufficient). The only exception seems to be
> eventual support for multibyte characters in file names, but
> at present it looks like this would break the posixness of ext2.
>
> Is this summary basically correct? Could a kind soul perhaps
> point me in the direction of useful information sources?

Basically, with slight modifications:
- Not only file names are affected; it is also important for the
system console (which character sequences are generated when you press
a key, and how the screen contents are modified when bytes are sent to
the terminal).
- Having multibyte file names does not break posixness, and is indeed
supported by ext2. Null bytes inside file names would break posixness,
but this is not really a problem as long as you use a null-byte-free
encoding. For ext2, the rule is 'use whatever byte sequence you had
at creat(2)'.
- Other file systems do know very well which character set the FS uses,
and they need to convert if necessary. Linux 2.1 has a conversion
framework, which is currently used by vfat/msdos, ntfs, and joliet.
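
To make the null-byte point concrete, here is a minimal sketch in C. It is not kernel code: the helper name_bytes_ok() and the sample arrays are hypothetical names made up for illustration. The check for '/' is an addition not mentioned above; it is included because '/' is the path separator and so cannot appear inside a single file name component either.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only (not a kernel function).
 * Returns 1 if the byte sequence could serve as one file name component:
 * it must be non-empty and contain neither a null byte (which would end
 * the C string passed to creat(2)) nor '/' (the path separator).
 */
static int name_bytes_ok(const unsigned char *name, size_t len)
{
    size_t i;

    if (len == 0)
        return 0;
    for (i = 0; i < len; i++) {
        if (name[i] == 0x00 || name[i] == '/')
            return 0;
    }
    return 1;
}

/* The word K, a-umlaut, s, e in three encodings. */
static const unsigned char kaese_latin1[] = { 'K', 0xE4, 's', 'e' };
static const unsigned char kaese_utf8[]   = { 'K', 0xC3, 0xA4, 's', 'e' };
static const unsigned char kaese_ucs2le[] = { 'K', 0x00, 0xE4, 0x00,
                                              's', 0x00, 'e', 0x00 };
```

With these definitions, name_bytes_ok() accepts the Latin-1 and UTF-8 byte sequences and rejects the little-endian UCS-2 one, because every ASCII character in UCS-2 carries a null byte. That is the sense in which a null-byte-free encoding can be stored as-is, while a file system that stores names in a 16-bit encoding needs a conversion step.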

Hope this helps,
Martin