Re: unicode (char as abstract data type)

Alex Belits (abelits@phobos.illtel.denver.co.us)
Fri, 17 Apr 1998 19:13:11 -0700 (PDT)


On Fri, 17 Apr 1998, Albert D. Cahalan wrote:

> >
> > Neither Sun, Apple, nor Microsoft has really converted anything
> > to Unicode.
>
> Microsoft uses Unicode in the kernel calls. Note that the C library
> can still support dumb 8-bit apps as well as any other C library.

See below for what it does with those calls.

>
> > Having an NTFS filesystem where filenames have for many years
> > been supposed to be in Unicode, but which all software uses on
> > the assumption that only 8 bits of every character matter, doesn't
> > mean much, so this direction is dead, too.
>
> It is not dead. The Unicode support in the system allows for a
> future world without 8-bit apps. The transition may take a decade.
> When the transition is done, there won't be so much reencoding
> between apps and the kernel.

In a decade Unicode will most likely be in the same place where EBCDIC
is now.

> >> I certainly don't want to see 8-bit kernel calls on Merced.
> >
> > Then you won't see vi there either.
>
> Oh? The last time I heard, vi accessed system calls via libc.

But in what encoding will it represent that to the terminal?

> Very few apps care about the kernel interface. I can think of
> strace and maybe gdb.
>
> With the right libc, you could even pretend the kernel used
> UTF-8 for the system calls.
>

[skipped]

> That is the applications. This is the kernel mailing list.
> We have a library called "libc" that provides an interface
> between the applications and the kernel. Applications can
> still see filenames in KOI-8 if you so desire. (you won't
> care what libc does to non-KOI-8 filenames because you won't
> have any such names on your disk)

How will it know that a filename is KOI-8 if charset labeling is
eliminated? That is the whole point of Unicode -- to avoid the need for
charset labeling by providing one flat code space.
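
To make the labeling problem concrete, here is a minimal sketch in Python.
The sample filename, the helper name decode_as, and the list of candidate
charsets are assumptions chosen for illustration; none of them come from
any real libc. It only shows that the same 8-bit bytes decode without
error under several charsets, so the conversion to a Unicode form has to
be told which one was meant.

```python
# Hypothetical example data: a Russian filename as a KOI8-R application
# would hand it to the C library, i.e. plain 8-bit bytes with no label.
raw_name = "файл".encode("koi8_r")


def decode_as(data: bytes, charset: str) -> str:
    """Decode unlabeled filename bytes under an assumed charset."""
    return data.decode(charset)


# The same bytes are valid in all three charsets, so nothing in the
# data itself says which interpretation is the right one.
for charset in ("koi8_r", "cp1251", "latin_1"):
    print(charset, "->", decode_as(raw_name, charset))

# Converting to a 16-bit Unicode form is only correct once the source
# charset is known; here that knowledge is supplied by hand.
as_utf16 = decode_as(raw_name, "koi8_r").encode("utf_16_le")
print(len(raw_name), "bytes in KOI8-R,", len(as_utf16), "bytes in UTF-16")
```

Only the first interpretation gives back the original name; the other two
produce different, equally "valid" strings.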

> Think about the consequences of UTF-8 at the system call level:
> Every system call that uses text must be first converted to UTF-8.
> This burden is with us forever. Meanwhile, Windows and MacOS can
> avoid conversion costs after the world converts to UCS2.
>
> The world _will_ convert too. As much as you may hate it, you
> must realize that when Sun, Microsoft, and Apple agree...
> It is only a matter of time -- perhaps a decade.

They say it, but they don't _do_ it -- and they can't do that anyway.

--
Alex
