Re: [2.6 patch] UTF-8 fixes in comments

From: Willy Tarreau
Date: Tue Apr 29 2008 - 01:06:28 EST


On Mon, Apr 28, 2008 at 06:29:43PM -0700, H. Peter Anvin wrote:
> Willy Tarreau wrote:
> >Is this really needed, Adrian? I mean, everyone reads iso-8859-1, not
> >everyone reads UTF-8.
>
> "Everyone" who speaks a Western European language, perhaps; and even
> then, mostly because a lot of tools still have an "oh, it's not valid
> UTF-8, guess iso-8859-1" mode.

Or simply because people have not migrated all their installs, or have
explicitly disabled UTF-8 a few hours after starting to use it once
they discovered the mess it caused and the poor support from the
tools :-/
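
For illustration only, the "not valid UTF-8, guess iso-8859-1" fallback
mentioned in the quoted text can be sketched roughly as below. The
function name decode_guess is hypothetical and not taken from any
particular tool; real tools may use more elaborate heuristics.

```python
def decode_guess(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to iso-8859-1 if that fails.

    Hypothetical helper: iso-8859-1 maps every byte to a character, so
    the fallback never raises, but it may silently produce the wrong text.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1")
```

Because the fallback always succeeds, a file in some other 8-bit
encoding is shown as iso-8859-1 without any warning.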

> The most common instance of non-ASCII
> characters in Linux kernel code are people's names, and there are plenty
> of names which aren't representable in either ASCII or iso-8859-1.
>
> The debate on this was years ago, and the consensus was to migrate to
> UTF-8; however, the salient information should be expressed in the ASCII
> character set unless impossible.

And do we really consider that people's names in *comments* cannot
be converted to pure ASCII? I'm Western European and have always
been against accents in comments (another reason to write comments
in English, BTW). Unix and the Internet have lived without accents for
almost 30 years without anyone really bothering. And now we try to
put them everywhere (even in domain names, implying big security
issues) and it causes real annoyances. People's names have not
changed in 30 years, so I guess that the rules used during this
time to ASCII-fy the names are still usable.
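
As a rough sketch of what such ASCII-fication could look like in
practice: the helper asciify and the FALLBACKS table below are
hypothetical and only cover a handful of Latin letters. They are an
assumption made for illustration, not a rule set anyone in this thread
has agreed on.

```python
import unicodedata

# Hypothetical hand-picked replacements for letters that have no
# Unicode decomposition into a base letter plus an accent.
FALLBACKS = {"ø": "o", "Ø": "O", "ß": "ss", "æ": "ae", "Æ": "AE",
             "ł": "l", "Ł": "L"}

def asciify(name: str) -> str:
    """Return a pure-ASCII approximation of a name.

    Accented letters are decomposed (NFKD) and the combining marks are
    dropped; anything still outside ASCII afterwards is discarded.
    """
    replaced = "".join(FALLBACKS.get(ch, ch) for ch in name)
    decomposed = unicodedata.normalize("NFKD", replaced)
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

This only works for names written in a Latin-based script; a name in,
say, Cyrillic or CJK characters comes out empty, which is the case the
quoted text raises.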

> -hpa

Willy
