Re: unicode

Jan Vroonhof (vroonhof@math.ethz.ch)
14 May 1998 14:59:48 +0200


Alex Belits <abelits@phobos.illtel.denver.co.us> writes:

> > Isn't the biggest problem about charset labeling the fact that there
> > exist multiple encodings for the same character? As the kernel needs
> > to compare filenames for equality this then requires knowledge about
> > the character sets used.
>
> Different byte sequences are different filenames, no matter that
> they can mean the same glyphs.

This is really, really ugly. Suppose the shell is using encoding A and
the file system is using encoding B:

> ls | On disk
a | ?b?a
b | ?b?b
c | ?b?c
> cat a | i.e. this tries to open '?a?a'
a: No such file.
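
The same failure can be reproduced in a few lines. This is only a rough
sketch: Latin-1 and CP437 are my stand-ins for "encoding A" and
"encoding B", and bytewise_exists() is a made-up helper that mimics a
lookup comparing raw bytes, not anything the kernel actually exposes.

```python
# Hypothetical stand-ins: encoding A (shell) = Latin-1, encoding B (disk) = CP437.
SHELL_ENCODING = "latin-1"
DISK_ENCODING = "cp437"

# File names as stored on disk, i.e. raw bytes in encoding B.
on_disk = {name.encode(DISK_ENCODING) for name in ("é", "ü", "ñ")}


def bytewise_exists(typed_name: str) -> bool:
    """Look a name up by comparing raw bytes, with no charset knowledge."""
    return typed_name.encode(SHELL_ENCODING) in on_disk


# The user sees "é" in the listing, types "é", and the lookup still fails
# because the shell sends the Latin-1 byte while the disk holds the CP437 byte.
print(bytewise_exists("é"))
```

The name the user typed and the name on disk are the same character, but
the byte sequences differ, so the lookup misses.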

To get around such problems you need either a single representation of
each glyph, which is basically what Unicode is, or a mix of encoding
labels with restrictions on which encodings can be used. The latter is
worse.
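
To illustrate the single-representation idea, here is a minimal sketch
that decodes both names to Unicode before comparing them. The helper
same_name() and the choice of Latin-1 and CP437 as the two encodings are
assumptions of mine for the example, not an existing interface.

```python
def same_name(raw_a: bytes, enc_a: str, raw_b: bytes, enc_b: str) -> bool:
    """Compare two labeled byte strings by decoding both to Unicode first."""
    return raw_a.decode(enc_a) == raw_b.decode(enc_b)


# The same character, encoded two different ways (hypothetical encodings).
typed = "é".encode("latin-1")
stored = "é".encode("cp437")

# Raw bytes differ, so a byte comparison treats these as different names.
print(typed == stored)

# Once both are mapped to one representation, they compare equal.
print(same_name(typed, "latin-1", stored, "cp437"))
```

Note that the comparison only works because each byte string carries its
encoding label; with one representation on both sides, the labels and
the decoding step disappear entirely.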

Jan

P.S. Your time/date is off.
