Eliminating UDF iocharset!=utf8 code (Re: [PATCH 6/8] Support non-BMP characters in UDF)

From: Vladimir 'φ-coder/phcoder' Serbinenko
Date: Wed May 16 2012 - 20:49:06 EST



> I've noticed another duplication in the UDF code: there
> is NLS support and separate UTF-8 support. UTF-8 is actually supported in
> two ways: with -o utf8 and with -o iocharset=utf8, which take different
> codepaths. Specific UTF-8 support is probably slightly faster by
> avoiding calls and basically doing everything with shifts (or can be
> made so with a small patch). Should I perhaps kill one of them? Is
> iocharset!=utf8 still of any importance? I haven't seen it in ages.
> Perhaps we could keep just the performant UTF-8 support and map
> iocharset=utf8 to it and drop iocharset!=utf8? iocharset!=utf8 probably
> has no users anyway, so by keeping it we're likely to keep bugs and code
> duplication with no benefit.
>

Linux seems to support UTF-8-only pretty strongly: http://yarchive.net/comp/linux/utf8.html
(message from Sun, 15 Feb 2004 02:42:45 GMT).
And I completely agree.
If it's OK to kill iocharset!=utf8, I'll propose a series of 3 patches: killing iocharset!=utf8,
extending utf16toutf8/utf8toutf16 to handle unaligned input, and changing the UDF code to use the common functions.
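
To make the "unaligned input" and "everything with shifts" points concrete, here is a minimal userspace sketch of the kind of conversion involved. It is not the kernel's code and not the proposed patch: the names load_be16 and be16_to_utf8 are hypothetical, and big-endian 16-bit input is an assumption made for the example. The source is read one byte at a time, so the pointer needs no particular alignment, and surrogate pairs are combined so that non-BMP characters come out as 4-byte UTF-8 sequences.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): assemble one big-endian 16-bit
 * unit from two single-byte loads, so src may sit at any address. */
static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical sketch: convert nbytes of big-endian UTF-16 to UTF-8.
 * Returns the number of bytes written to dst, or -1 on malformed input
 * (odd length, unpaired surrogate) or when dst is too small. */
static long be16_to_utf8(const uint8_t *src, size_t nbytes,
                         uint8_t *dst, size_t cap)
{
    size_t i = 0, o = 0;

    if (nbytes & 1)
        return -1;

    while (i < nbytes) {
        uint32_t cp = load_be16(src + i);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t lo;

            if (nbytes - i < 2)
                return -1;
            lo = load_be16(src + i);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1; /* low surrogate without a preceding high one */
        }

        if (cp < 0x80) {
            if (cap - o < 1)
                return -1;
            dst[o++] = (uint8_t)cp;
        } else if (cp < 0x800) {
            if (cap - o < 2)
                return -1;
            dst[o++] = (uint8_t)(0xC0 | (cp >> 6));
            dst[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            if (cap - o < 3)
                return -1;
            dst[o++] = (uint8_t)(0xE0 | (cp >> 12));
            dst[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            if (cap - o < 4)
                return -1;
            dst[o++] = (uint8_t)(0xF0 | (cp >> 18));
            dst[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            dst[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return (long)o;
}
```

Passing a pointer at an odd offset into a buffer works the same as an aligned one, since no 16-bit load is ever issued on src.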


--
Regards
Vladimir 'φ-coder/phcoder' Serbinenko
