> > From: erik@arbat.com (Erik Corry)
>
> > Most people have objections to decisions made in Unicode. This
> > is inevitable in a standard of this size, on a subject that
> > raises such emotions.
>
> Yes, this is true. But then, why was such a wide ranging standard
> imposed in the first place? What's the history here anyway? I get
That should be bloody obvious. People - _all_ people - hate dealing with
multiple character sets.
> the impression that the standards body didn't have a broad enough
> membership base. Thus, you have a few people trying to solve problems
All the national standards organizations in the world, not broad enough?
Ha.
> for themselves, and imposing their solution on everyone. A standard
> this size should take maybe 10-20 years to do right.
I'm glad you aren't on that committee, then.
> > For that matter, ASCII is not in alphabetical order. For that
> > the order would have to be AaBbCcDdEeFfGgHhIi etc.
>
> Well then, I guess you don't mind doing text processing in EBCDIC!
> Sort order is very important.
Sort order is important. But cultural sort order (as opposed to any odd
sort order) _cannot_ be done via naked byte order and picking the right
character set. It's not even possible for English - you want to sort
    Andy
    boring
    John
and no naked byte order will ever give you this.
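
To make that concrete, here is a minimal Python sketch. The three names are
the ones above; the two sort keys are my own illustration, and case folding
is only a rough stand-in for real cultural collation, which needs
locale-aware rules (something like locale.strxfrm or the Unicode Collation
Algorithm).

```python
names = ["boring", "John", "Andy"]

# Naked byte order under ASCII: every uppercase letter (65-90)
# sorts before every lowercase letter (97-122).
byte_order = sorted(names, key=lambda s: s.encode("ascii"))

# Rough approximation of dictionary order (an assumption, not real
# collation): compare case-folded text, break ties on the original.
dictionary_order = sorted(names, key=lambda s: (s.casefold(), s))

print(byte_order)        # ['Andy', 'John', 'boring']
print(dictionary_order)  # ['Andy', 'boring', 'John']
```

The second ordering comes from a comparison rule applied on top of the
encoding, not from the encoding itself, and even that rule says nothing
about accents or language-specific orderings.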
> It is somewhat moot though. A standard that no one uses isn't a standard.
Well, as Unicode is definitely used in Windows (95, NT - and, thus, by
everyone using those systems), that doesn't seem to be a problem here.
It's used all right. (No, it's not used _only_ by Windows, but that alone
counts for a pretty large market segment.)
> Another ugly part is, you don't know what encoding most FS's actually
> use. That is, if you've got a file name on ext2fs, how do you know
> how to convert it to UTF-8? Or an imported ufs disk? What if ext2fs
> has some files in one encoding, and others in a different one?
That's why you want to standardize those on UTF-8. You _don't_ want to
have the FS have different names in different character sets.
Oh, btw, HFS+ (Apple's new FS to replace HFS) does use Unicode filenames
for exactly this reason ...
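
A small Python sketch of why mixed encodings on one FS hurt. The file name
and the Latin-1 fallback are assumptions picked for illustration; the helper
decode_filename is hypothetical and is not how any particular file system or
kernel handles names.

```python
# Hypothetical file name containing one non-ASCII letter (u-umlaut).
name = "M\u00fcller.txt"

latin1_bytes = name.encode("latin-1")  # one byte for the umlaut
utf8_bytes = name.encode("utf-8")      # two bytes for the umlaut


def decode_filename(raw: bytes) -> str:
    """Try UTF-8 first, then fall back to a Latin-1 guess (assumption)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 accepts any byte sequence, so this never fails,
        # but it is only a guess about what the writer intended.
        return raw.decode("latin-1")


print(latin1_bytes == utf8_bytes)      # same name, different bytes on disk
print(decode_filename(utf8_bytes))
print(decode_filename(latin1_bytes))
print(utf8_bytes.decode("latin-1"))    # wrong guess gives garbage
```

The fallback only works here because the Latin-1 bytes happen to be invalid
UTF-8. In general the bytes alone do not tell you the encoding, which is the
argument for storing every name in UTF-8 to begin with.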
Regards, Kai