I started a thread a while ago (2.6.3/2.6.4) where I submitted some
patches for UTF-8ifying the kernel sources. Basically, most of the
kernel is ASCII (98.4% of the files). The rest are mostly ISO-Latin-1,
with the rare bit of Japanese (in a couple of charsets) and some just
random bytes in some of the Documentation/...
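For illustration, here is a minimal sketch of how files could be sorted into those buckets. It is not the script that produced the 98.4% figure. The function name `classify` is made up for this example, and "other" lumps together ISO-Latin-1, the Japanese charsets and plain random bytes.

```python
def classify(data: bytes) -> str:
    """Return 'ascii', 'utf-8' or 'other' for the raw bytes of a file."""
    try:
        data.decode("ascii")
        return "ascii"            # pure 7-bit content
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"            # has 8-bit bytes, but they form valid UTF-8
    except UnicodeDecodeError:
        return "other"            # Latin-1, a Japanese charset, or junk


if __name__ == "__main__":
    import sys

    # Usage: python classify.py file1 file2 ...
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(classify(f.read()), path)
```

Telling ISO-Latin-1 apart from the other 8-bit charsets can't be done reliably this way, since nearly any byte sequence decodes as Latin-1; that part still needs a human to look at the file.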
http://www.yak.net/random/linux-2.6.4-utf8-cleanup-auto.diff
A lot of names, and some art that is supposed to be ASCII.
http://www.yak.net/random/linux-2.6.4-utf8-cleanup-cstrings2utf8.diff
Some degree symbols and microseconds... and names.
http://www.yak.net/random/linux-2.6.4-utf8-cleanup-jp.diff
OK, this Japanese is only in the comments.
http://www.yak.net/random/linux-2.6.4-utf8-cleanup-wrong.diff
There are a few microseconds written properly, but they could just as well be typed as "us", or the abbreviation could be avoided altogether.
It's sorta difficult to do non-ASCII patches over email because
the kernel developers like reading their mail in mutt, and don't like attachments (the only sane ways to send non-7-bit-clean data:
8-bit MIME, tagged and bagged, or uuencoded).
Totally agree, although I use Mozilla Mail (and sometimes mutt).
Further, you confuse the hell out of vi if you have any trash (8-bit data
in another charset) in a file that's supposed to be UTF-8. I.e., don't
think you're going to be able to look at a charset-changing patch in
anything.
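One way to at least find that trash before opening the file in an editor is to check it line by line. This is only a sketch, assuming the file is meant to be UTF-8; `non_utf8_lines` is a name invented for this example.

```python
def non_utf8_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines that are not valid UTF-8."""
    bad = []
    # Splitting on b"\n" is safe: byte 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(lineno)
    return bad


if __name__ == "__main__":
    import sys

    # Usage: python check_utf8.py some-file
    with open(sys.argv[1], "rb") as f:
        for n in non_utf8_lines(f.read()):
            print(f"{sys.argv[1]}:{n}: not valid UTF-8")
```

It only says which lines are broken, not which charset they were in.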