Re: [OFFTOPIC] Re: unicode (char as abstract data type)

Dmitry Yaitskov (dima@interlog.com)
Wed, 22 Apr 1998 10:37:24 -0400


On Wednesday, Apr 22, Alex Belits (abelits@phobos.illtel.denver.co.us) spake thusly:
> On Tue, 21 Apr 1998, Pavel Machek wrote:
>
> > Really, you can not determine charset from language - because language
> > of *my* emails is sometimes something between czech and english. And
> > now imagine, me wanting to write russian word sabaka (or how is dog
> > written). I of course want to write it in azbuka. And I do not want to
> > tell my text editor origin of each word I use.
>
> If it will attach labels to character sequences, it will know them.
> English, being a language, supported as ASCII subset in non-ASCII charset
> can be used without separate labeling, but again, if necessary, switch
> between languages can be reflected in labeling.

If I understand you correctly, what you're saying is if I write my
mostly Russian mail in a text editor, and want to insert a Hebrew
word, my text editor should insert a label before it, and a
switch-back-to-russian label after? Keeping those labels invisible of
course? Or some such thing? In other words, make my perfectly normal
emacs *text* editor into another ms-word-like monstrosity? And BTW, who
will decide how to display those chars and which font to use, and how?

Mixing 2 non-ASCII charsets in a single simple text document is not
possible using 8-bit charsets, and I don't think it is practical to
do this using labels - too much PITA. As far as I can see, this
can be done using Unicode, and although probably not perfectly, in a
much more painless way... just my 2 kopecks... :)
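
To make the point concrete, here is a rough sketch in Python (purely
illustrative; the sample words and the can_encode helper are my own
assumptions, not anything from this thread). It tries to store one string
holding a Russian and a Hebrew word in two 8-bit charsets and in a
Unicode encoding:

```python
# Hypothetical sample text: a Russian word and a Hebrew word in one string.
russian = "собака"
hebrew = "כלב"
text = russian + " " + hebrew


def can_encode(s, charset):
    """Return True if every character of s is representable in charset."""
    try:
        s.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# KOI8-R covers Cyrillic, ISO-8859-8 covers Hebrew, UTF-8 covers both.
for charset in ("koi8_r", "iso8859_8", "utf_8"):
    print(charset, can_encode(text, charset))
```

Each 8-bit charset should handle its own script but fail on the mixed
string, while the Unicode encoding takes the whole thing with no labels.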

-- 
Cheers,
 -Dima.

"He could be a poster child for retroactive birth control."
