Re: [PATCH v2] checkpatch: Only encode UTF-8 quoted printable mail headers

From: Joe Perches
Date: Thu Jul 19 2018 - 13:38:52 EST


On Thu, 2018-07-19 at 17:03 +0200, Geert Uytterhoeven wrote:
> Hi Arnd,
>
> On Thu, Jul 19, 2018 at 4:50 PM Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > On a related note, I've looked through all files in the kernel, and found
> > that very few files in there are something other than 7-bit ASCII, UTF-8
> > or non-text files (according to /usr/bin/file). These are the only ones I found:
> >
> > Documentation/devicetree/bindings/net/nfc/pn544.txt: ISO-8859 text
> > arch/arm/boot/dts/sun4i-a10-inet97fv2.dts: C source, ISO-8859 text
> > arch/arm/crypto/sha256_glue.c: C source, ISO-8859 text
> > arch/arm/crypto/sha256_neon_glue.c: C source, ISO-8859 text
> > arch/m68k/hp300/hp300map.map: ISO-8859 text
> > arch/s390/kernel/ebcdic.c: C source, Non-ISO extended-ASCII text
> > drivers/crypto/vmx/ghashp8-ppc.pl: a /usr/bin/env perl script, ISO-8859 text executable
> > drivers/iio/dac/ltc2632.c: C source, ISO-8859 text
> > drivers/power/reset/ltc2952-poweroff.c: C source, ISO-8859 text
> > drivers/staging/rtl8188eu/include/odm.h: C source, ISO-8859 text
> > drivers/tty/vt/defkeymap.map: ISO-8859 text
> > kernel/events/callchain.c: C source, ISO-8859 text
> > lib/fonts/font_7x14.c: data
> > lib/fonts/font_8x16.c: data
> > lib/fonts/font_8x8.c: data
> > lib/fonts/font_pearl_8x8.c: data
> > net/netfilter/ipvs/Kconfig: ISO-8859 text
> > net/netfilter/ipvs/ip_vs_mh.c: C source, ISO-8859 text
> > tools/power/cpupower/po/de.po: GNU gettext message catalogue, ISO-8859 text
> > tools/power/cpupower/po/fr.po: GNU gettext message catalogue, ISO-8859 text
> >
> > Almost all of those can be trivially converted using 'recode ISO-8859-1..UTF-8',
> > which we should probably do. The four font files contain comments for each
> > of the 256 characters, so that recode turns e.g. the <FF> character
> > into <U+00FF>, which is probably still what we want here.
> >
> > The one exception seems to be arch/s390/kernel/ebcdic.c, which apparently
> > uses 0x81 bytes as an escape before ISO-8859-1 characters with
> > the high bit set. I don't know what that encoding is called, but I managed
> > to manually convert it into something useful.
>
> Yes, we should convert everything to UTF-8.

Thanks.

Can you send a patch or a script for Linus to apply?
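
For illustration only, here is a minimal sketch of what such a script
might look like. It is not what was posted in this thread: the function
names are hypothetical, and it assumes every file passed on the command
line is either already valid UTF-8 or plain ISO-8859-1, which is roughly
what 'recode ISO-8859-1..UTF-8' assumes too. arch/s390/kernel/ebcdic.c
does not fit that assumption and would have to be left out and converted
by hand.

```python
#!/usr/bin/env python3
"""Sketch: re-encode ISO-8859-1 text files as UTF-8, in place."""
import sys
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (7-bit ASCII included)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed to be ISO-8859-1 as UTF-8.

    Input that is already valid UTF-8 is returned unchanged, so running
    the script twice does not double-encode anything.
    """
    if is_valid_utf8(data):
        return data
    # Every byte value is a valid ISO-8859-1 code point, so this decode
    # cannot fail; it is only *correct* if the assumption above holds.
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(path: Path) -> bool:
    """Convert one file in place; return True if its contents changed."""
    original = path.read_bytes()
    converted = latin1_to_utf8(original)
    if converted == original:
        return False
    path.write_bytes(converted)
    return True


if __name__ == "__main__":
    # Hypothetical usage: pass the affected paths as arguments.
    for name in sys.argv[1:]:
        if convert_file(Path(name)):
            print(f"converted {name}")
```

Run from the top of the tree with the affected paths as arguments, then
review the result with 'git diff' before committing. As noted for the
font files, a lone 0xFF byte comes out as the UTF-8 encoding of U+00FF.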