Re: Comments on Microsoft Open Source document

Alex Belits (abelits@phobos.illtel.denver.co.us)
Sat, 7 Nov 1998 22:22:36 -0800 (PST)


On Sun, 8 Nov 1998, Michael Talbot-Wilson wrote:

> I hope that IETF protocols are not changed just because it appears
> that side issues may potentially be monstrously expanded to win
> more sales. The most unhealthy changes can be represented as
> essential to "meet customer demand".

The problem is, it's not always necessary to include explicit
proprietary stuff in the protocol without disclosing it. Sometimes it's
enough to define things in an incomplete or ambiguous manner, thus forcing
all implementations to include nonstandard and proprietary parts. Or to
base the protocol design on assumptions that are true only if it's used
with some existing software/platform. Or to include, as a part of the
protocol, some language with a closed or semi-closed specification. Or to
patent a part of the protocol, or an algorithm used to generate the only
format acceptable within it (audio/video compression codecs are a nice
example of this behavior).

In some cases it's explicit, in some cases it may not be. My example
with Unicode/UTF-8 applies here for the following reasons:

1. It's known that most Microsoft software is designed with a
nice-looking interface in mind, and not much else to speak of.

2. Handling multiple languages is a complex problem, and in general
includes issues like allowed text formatting, hyphenation rules, input
methods, alphabet ordering, phonetic matching, etc. The relatively easy
part of all that is displaying characters.

3. Different countries standardized their own charsets that are suitable
for handling their languages, and developed various tools/libraries that
do something reasonable for formatting/hyphenation and other
language-specific issues. All charsets and languages now have their names
registered; however, there is no complete language-support toolkit with
"pluggable" modules for languages, even though the general infrastructure
for such things (locale support) has existed for a long time. MIME
provides the basics for labeling both languages and charsets, and although
support for displaying/handling them is not universal, most software
honors MIME labeling and encoding.

4. Unicode makes displaying a non-issue (all characters are in one huge
font) at the price of modifying all string-handling routines. That price,
however, includes complete incompatibility with existing charsets and a
lack of language labeling.

5. Microsoft, having no chance to provide high-quality language handling
without creating monstrosities over monstrosities (localized versions of
Windows), solved the problem of displaying symbols from different
languages by adopting Unicode (in the form of UCS-2) internally. The weak
naming convention for fonts that existed before it (as opposed to
charset-labeled fonts in X11) and the hopefully obsolete FAT filesystem,
which had to be replaced anyway, helped in this decision.

6. Internet standards, mostly derived from MIME, accepted the idea of
charset/language labeling until the point when suddenly the decision was
made to support Unicode. It was not UCS-2, because if such a thing had
been adopted, everything would have broken and would not have recovered
soon, so this kind of change was impossible to force on anyone on whom
anything could depend. UTF-8, however, had the advantage that it did not
break existing server software, even though at the moment it was almost
impossible to use it on clients other than those made for Microsoft
systems (UCS-2 <-> UTF-8 conversion is trivial) for languages other than
English (however, conversion of European iso8859-1 was trivial, too).
Opposition that came from users of other languages was ignored, in part
because none of them had heard about the first meetings where this policy
was established.

7. This change made it impossible to continue the development of tools
that supported locales for "obsolete" languages/charsets. Where
development continues, people are either talking about strengthening
Unicode/UTF-8 support in X (crippling its existing locale / charsets /
input methods support), or about tools -- Perl, for example, suddenly
converted to Unicode/UTF-8, even though the data representation inside
Perl would have allowed language/charset attributes to be attached to
strings without breaking any compatibility with existing software --
attributes could just "travel" with strings, ignored by existing routines,
but providing important information to new language-support and display
routines. To add insult to injury, UTF-8 can't be handled well with
regexps, which are used widely in Perl and in all kinds of Open Source and
commercial Unix software, but are unknown in Redmond. One of the possible
future ways to produce Open Source software superior to Microsoft's by
design -- with internationalization support based on extensible standards,
with distinction between languages when they share a charset, with
possible automated language-dependent processing of "obscure" typesetting,
phonetic matching and other issues -- is now closed by nothing less than
IETF standards. Open ones, BTW -- they just require the use of a
technologically inferior internationalization standard that consumes
significantly more resources in all implementations. Considering that
Microsoft's position is weakest in non-English-speaking and especially
non-European countries, one can consider this to be their major win.

8 (consequences example). Netscape Navigator/Communicator implemented
UTF-8 support in the X11-based frontend in its 4.x versions. Not only does
the text look ugly (the size of Roman and Cyrillic characters is increased
to allow mixing them with large Asian characters even if there are none in
the text, but without language labeling no one can tell), but the input
procedures don't even handle anything but iso8859-1. AFAIK Mozilla doesn't
look or work any better on X11.
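
To illustrate the "trivial" UCS-2 <-> UTF-8 and iso8859-1 conversions
mentioned in point 6, here is a minimal Python sketch. It is not taken
from any particular implementation: the function names ucs2_to_utf8 and
latin1_to_utf8 are made up for this example, and rejecting surrogate
values is an assumption of the sketch rather than something stated above.

```python
def ucs2_to_utf8(code_units):
    """Encode an iterable of 16-bit UCS-2 code units as UTF-8 bytes.

    Hypothetical helper for illustration only.
    """
    out = bytearray()
    for cu in code_units:
        if not 0 <= cu <= 0xFFFF:
            raise ValueError("not a 16-bit UCS-2 code unit: %r" % (cu,))
        if 0xD800 <= cu <= 0xDFFF:
            # Assumption of this sketch: surrogates are not characters
            # in plain UCS-2, so they are rejected.
            raise ValueError("surrogate value: %#06x" % cu)
        if cu < 0x80:
            # 7-bit ASCII is passed through as a single, unchanged byte.
            out.append(cu)
        elif cu < 0x800:
            # 11 bits -> two bytes: 110xxxxx 10xxxxxx
            out.append(0xC0 | (cu >> 6))
            out.append(0x80 | (cu & 0x3F))
        else:
            # 16 bits -> three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.append(0xE0 | (cu >> 12))
            out.append(0x80 | ((cu >> 6) & 0x3F))
            out.append(0x80 | (cu & 0x3F))
    return bytes(out)


def latin1_to_utf8(data):
    """Convert iso8859-1 bytes to UTF-8.

    Each iso8859-1 byte value equals its Unicode code point, so the
    bytes can be fed to the UCS-2 encoder directly.
    """
    return ucs2_to_utf8(data)
```

In this sketch ASCII passes through byte for byte, which is why existing
server software is not disturbed, while a Cyrillic letter takes two bytes
and most Asian characters take three.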

No patents, no lawsuits, not even formally closed standards -- just a
simple manipulation of standards committees to favor an inferior solution,
but what an outcome!

--
Alex
