Re: unicode

Alex Belits (abelits@phobos.illtel.denver.co.us)
Thu, 14 May 1998 21:38:25 -0700 (PDT)


On 14 May 1998, Jan Vroonhof wrote:

> >
> > Different byte sequences are different filenames, no matter that
> > they can mean the same glyphs.
>
> This is really really ugly. Suppose the shell is using encoding A and
> the file system is using B

I am talking about which filenames are equal and which are not. The
assumption is that files and filenames shouldn't be re-encoded automatically.

>
>
> > ls | On disk
> a | ?b?a
> b | ?b?b
> c | ?b?a
> > cat a | i.e. this tries to open '?a?a'
> a: No such file.
>
> To get around such problems you need to have a single
> representation of each glyph, which is basically what Unicode is. (One
> could of course use a mix of encoding labels with restrictions on what
> encodings can be used, but that is worse.)

Glyphs mean nothing when filenames are compared, and if no one
re-encodes filenames inconsistently, such problems don't exist. Things
of this kind happen when software tries to be smarter than its user
and re-encodes filesystems. If a filesystem can be accessed through some
translation of filenames, there shouldn't be any means to access the same
files, with the same content, through a different translation of filenames.
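
To make the comparison rule concrete, here is a minimal sketch in Python.
It is only an illustration, not how any kernel or filesystem implements
lookup. The helper names (same_filename, lookup), the sample name, and the
choice of UTF-8 and Latin-1 as the two encodings are all assumptions made
for the example.

```python
import unicodedata

def same_filename(a: bytes, b: bytes) -> bool:
    # Hypothetical helper: names are equal only if every byte matches.
    return a == b

name = "caf\u00e9"  # illustrative name, with the accented letter as one code point

# Three byte sequences that would all render as the same glyphs.
utf8_composed = unicodedata.normalize("NFC", name).encode("utf-8")
utf8_decomposed = unicodedata.normalize("NFD", name).encode("utf-8")
latin1 = name.encode("latin-1")

# Toy stand-in for a directory: maps stored name bytes to file contents.
directory = {latin1: b"file contents"}

def lookup(raw_name: bytes):
    # No re-encoding or normalization: an exact byte match or nothing.
    return directory.get(raw_name)
```

With these definitions, lookup(latin1) finds the file, while
lookup(utf8_composed) and lookup(utf8_decomposed) return None. That is the
"No such file" case from the quoted example: it appears only when one side
re-encodes the name and the other does not.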

> P.S. Your time/date is off.

I know, I've just installed this box.

--
Alex

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu