Re: unicode (char as abstract data type)

Kai Henningsen (kaih@khms.westfalen.de)
18 Apr 1998 13:21:00 +0200


abelits@phobos.illtel.denver.co.us (Alex Belits) wrote on 17.04.98 in <Pine.BSI.3.95.980417160122.7142I-100000@es1840.genesyslab.com>:

> On 17 Apr 1998, H. Peter Anvin wrote:

> > UTF-8 is actually very well done given the constraints imposed on it.
> > Yes, it's a compromise, but it had to be.
>
> Charset labeling is a compromise. Unicode is a decision of

Charset labelling is a non-solution.

> non-representative committee, imposed on everyone else by lazy software
> vendors who don't want to do language specific processing, but want to
> label their products as "internationalized".

Complete and utter bullshit.

First, Unicode doesn't claim to solve the language-dependent processing
problem. Read the fucking standard. It explicitly says you still have to
do it. What it does is enable you to do all the language-independent
processing of text in any language. And it does that very well, far better
than any other solution.
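
To illustrate what "language-independent processing" means here, a minimal
sketch in Python (the sample strings and the helper name roundtrip_utf8 are
made up for this example, they are not from the standard): the same code
encodes, decodes and counts characters without knowing which language it is
handling.

```python
# Sample strings in several scripts; the samples are illustrative only.
samples = {
    "German": "Gr\u00fc\u00dfe",   # "Grüße", written with escapes to pin the code points
    "Russian": "Привет",
    "Japanese": "こんにちは",
}

def roundtrip_utf8(text):
    """Encode to UTF-8 and decode again; needs no knowledge of the language."""
    encoded = text.encode("utf-8")
    return encoded, encoded.decode("utf-8")

results = {}
for language, text in samples.items():
    encoded, decoded = roundtrip_utf8(text)
    # Store (code points, UTF-8 bytes, round trip intact?) per sample.
    results[language] = (len(text), len(encoded), decoded == text)
    print(language, len(text), "code points,", len(encoded), "bytes")
```

Anything like collation, hyphenation or case rules for a particular
language would still need language-specific code on top of this.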

Second, Unicode is the exact same character set as ISO 10646, which has
more than enough representation from all over the world.

Don't lie.

> > Now, with 8-bit charsets being common, people living in
> > countries where 8 bits are enough (especially ISO 8859-1 countries)
> > are whining about the complexity of supporting more than 8 bits.
>
> AFAIK, people who always used more than 8 bits are not the biggest
> proponents of Unicode either -- europeans (iso8859-1) and americans
> (us-ascii) are.

Not really. And "8 was always enough" is only part of it; the real
attitude is "I have something that works for my language, why the fuck
should I care that it doesn't work for most other languages". The problem
is, that attitude is not working very well any more.

> > I really would hate to see Linux falling behind in this area.
>
> I will rather prefer handling of national alphabets to be done by people
> who use them in their everyday life. Otherwise there will be a lot of
> pissed off people and unusable software.

Guess what? Unicode/ISO 10646 was designed by people using national
alphabets in their everyday life.

And pre-Unicode software is generally pretty much unusable with respect to
lots of national alphabets.

Regards, Kai
