Re: unicode (char as abstract data type)

Raul Miller (rdm@test.legislate.com)
Fri, 17 Apr 1998 21:20:59 -0400

Next message: David S. Miller: "Re: Counting System Calls"
Previous message: Guan Yang: "Re: Euro symbol"
In reply to: Raul Miller: "Re: unicode (char as abstract data type)"
Reply: Raul Miller: "Re: unicode (char as abstract data type)"

Albert D. Cahalan <acahalan@cs.uml.edu> wrote:
> Think about the consequences of UTF-8 at the system call level:
> Every system call that uses text must be first converted to UTF-8.
> This burden is with us forever. Meanwhile, Windows and MacOS can
> avoid conversion costs after the world converts to UCS2.

If there were any reason to do it, you could create a personality which
used (for example) UTF-16 instead. With a bit of work, you could write
a libc which works properly with either the UTF-8 or the UTF-16
personality (a crude version might use a boot mechanism which points
at a different ld.so, depending on the personality).
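
To make the idea concrete, here is a rough sketch of the kind of shim
such a libc would need: text gets encoded on the way into the kernel
according to whichever personality is active. Everything named here
(PERSONALITY_ENCODINGS, to_kernel_bytes, from_kernel_bytes) is made up
for illustration; no such kernel or libc interface exists, and Python
just stands in for the C you would really write.

```python
# Hypothetical sketch: these names are illustrative assumptions,
# not real kernel or libc interfaces.
PERSONALITY_ENCODINGS = {
    "utf8": "utf-8",
    "utf16": "utf-16-le",  # assumes a little-endian UTF-16 personality
}


def to_kernel_bytes(text, personality):
    """Encode application text the way the active personality expects."""
    return text.encode(PERSONALITY_ENCODINGS[personality])


def from_kernel_bytes(raw, personality):
    """Decode bytes handed back by the kernel into application text."""
    return raw.decode(PERSONALITY_ENCODINGS[personality])


# The application only ever sees text; the byte layout differs underneath.
name = "caf\u00e9.txt"
as_utf8 = to_kernel_bytes(name, "utf8")
as_utf16 = to_kernel_bytes(name, "utf16")
```

The application code is identical in both cases; only the table entry
chosen at startup changes, which is roughly what pointing at a
different ld.so would accomplish.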

At this point, your applications wouldn't know what internal
representation the kernel used. You might even set it up so that
some devices were UTF-8 and others were UTF-16 (so, yes, yet
another configuration parameter to get wrong).
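
As an illustration of how that parameter could go wrong, here is a
small sketch with a per-device encoding table. The table, the device
names, and read_text are all hypothetical; nothing like this exists in
the kernel today. It shows that declaring the wrong encoding for a
device either fails outright or, for plain ASCII text, quietly hands
back garbage.

```python
# Hypothetical per-device table; the device names and the table itself
# are illustrative assumptions, not an existing kernel configuration.
DEVICE_ENCODINGS = {"tty0": "utf-8", "lp0": "utf-16-le"}


def read_text(device, raw, table=DEVICE_ENCODINGS):
    """Decode bytes read from a device using its configured encoding."""
    return raw.decode(table[device])


# Correctly configured: lp0 really does speak UTF-16.
payload = "n\u00e9e".encode("utf-16-le")
correct = read_text("lp0", payload)

# Misconfigured: lp0 is wrongly declared as UTF-8.
wrong_table = {**DEVICE_ENCODINGS, "lp0": "utf-8"}
try:
    read_text("lp0", payload, wrong_table)
    misread_failed = False
except UnicodeDecodeError:
    # Non-ASCII text is at least rejected loudly.
    misread_failed = True

# Pure ASCII text decodes without error but picks up stray NUL characters.
silent = read_text("lp0", "ok".encode("utf-16-le"), wrong_table)
```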

Anyway, the point is: you can change the interfaces, if you decide
it's worth the work. Right now, though, the only applications I've
used under Linux which are Unicode-aware are things derived from
Plan 9, and these are perfectly happy with UTF-8. (9fonts is also
the only Unicode font (utf-2) I've got).

However, the last thing we want to do is implement UTF-16 in the kernel
because Microsoft/Apple say that they're going to someday.

-- Raul

