Re: [Patch] Support UTF-8 scripts

From: H. Peter Anvin
Date: Fri Sep 16 2005 - 17:09:25 EST


Martin v. Löwis wrote:
In programming languages that support the notion of source encodings,
you do have markers for 8-bit encodings. For example, in Python, you
can specify

# -*- coding: iso-8859-1 -*-

to denote the source encoding. In Perl, you write

use encoding "latin-1";

(with 'use utf8;' being a special-case shortcut).

In Java, you can specify the encoding through the -encoding argument
to javac. In gcc, you use -finput-charset (with the special case of
-fexec-charset and -fwide-exec-charset potentially being different).

So you *must* use encoding declarations in some languages; the UTF-8
signature is a particularly convenient way of doing so, since it allows
for uniformity across languages, with no need for the text editors to
parse all the different programming languages.
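
As a rough illustration of the point above (not code from this thread), here is a minimal Python sketch of how a text editor might pick an encoding. The function name guess_encoding and the regular expression are hypothetical; the sketch assumes the tool checks for the UTF-8 signature first and only then falls back to a language-specific declaration, here the Python one quoted above.

```python
import codecs
import re

# Hypothetical pattern modeled on the Python declaration quoted above,
# e.g. "# -*- coding: iso-8859-1 -*-". Other languages would need their own.
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def guess_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a best-guess encoding name for the raw bytes of a script."""
    # Language-independent check: the UTF-8 signature is the bytes EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Language-specific fallback: scan the first two lines for a declaration.
    for line in data.splitlines()[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    # Assumption: with no marker at all, fall back to the caller's default.
    return default
```

The first branch needs no knowledge of the language, while the second has to be repeated for every declaration syntax the tool wants to understand.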

Did you miss the point? There has been a standard for marking encodings for *30 years*, and virtually NO ONE (outside Japan) uses it.

-hpa