RE: GGI, EGCS/PGCC, Kernel source

Jordan Mendelson (jordy@wserv.com)
Fri, 27 Feb 1998 00:48:23 -0500


> > Oops.. The rest of the kernel uses UTF8, which covers more symbols than
> > Unicode16 and is generally more compact.
>
> I don't think this is technically correct, is it? UTF8 is an *encoding*
> for unicode; it covers the same symbols while retaining compatibility with
> standard ASCII (special prefixes flag long characters). Or am I wrong, is

UTF-8 is an encoding: a set of rules for turning character codes into
bytes. If a character doesn't need more than one byte, it gets exactly one
(basically ASCII); everything else takes two or more bytes. Unicode, on the
other hand, is just a really big map between character codes and the
actual characters themselves:

U+0041 == Capital A

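That same U+0041 can then be encoded in UTF-8 (as the single byte 0x41) or
in a 16-bit encoding (as 0x0041). It is the same character either way.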
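
As a rough sketch of the distinction (Python used purely for illustration;
the helper name show() is made up for this example and is not from any
kernel code), here is one character code serialized two different ways:

```python
def show(code_point):
    """Return (utf8_bytes, utf16_bytes) for a single Unicode code point."""
    ch = chr(code_point)  # the character that Unicode assigns to this code
    # Same character, two different byte serializations.
    return ch.encode("utf-8"), ch.encode("utf-16-be")

# U+0041 (capital A), U+00E9 (e with acute), U+4E2D (a CJK ideograph)
for cp in (0x0041, 0x00E9, 0x4E2D):
    utf8, utf16 = show(cp)
    print(f"U+{cp:04X}  UTF-8: {utf8.hex()}  16-bit: {utf16.hex()}")
```

The code point is the same in both columns; only the bytes differ. UTF-8
needs one byte for U+0041 but three for U+4E2D, so whether it is "more
compact" depends on the text being encoded.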

Please read:

http://www.hut.fi/u/jkorpela/chars.html

It should clarify the difference between a character code, character
repertoire, and character encoding.

You really can't say UTF-8 is better than Unicode. One is an encoding and
the other is the character set it encodes, so the comparison doesn't make
any sense.

Jordan
