Re: [PATCH v2] Convert properly UTF-8 to UTF-16

From: Steve French
Date: Mon Oct 08 2012 - 10:04:13 EST


On Mon, Oct 8, 2012 at 3:18 AM, Frediano Ziglio
<frediano.ziglio@xxxxxxxxxx> wrote:
> On Wed, 2012-10-03 at 14:49 -0500, Steve French wrote:
>> Merged - but doesn't the reverse also have to be added in cifs_from_utf16? i.e.
>>
>> utf16s_to_utf8s(uni, ... );
>>
>
> Not strictly necessary, at least to be able to mount shares.
>
>> I am glad that someone added these multiword handling routines into
>> the kernel for FAT - this has been something we have wanted for a long
>> time in cifs (and smb2/smb3). Note the comment in
>> fs/cifs/cifs_unicode.c
>>
>> /* Note that some windows versions actually send multiword UTF-16 characters
>> * instead of straight UTF16-2. The linux nls routines however aren't able to
>> * deal with those characters properly. In the event that we get some of
>> * those characters, they won't be translated properly.
>> */
>> int
>> cifs_from_utf16(char *to, const __le16 *from, int tolen, int fromlen,
>> const struct nls_table *codepage, bool mapchar)
>>
>
> Shouldn't that be UCS-2 instead of UTF16-2?

Yes, the comment should say UCS-2. UTF-16 should be used to indicate the
extension of UCS-2 that allows 4 byte encoding of some characters.
Currently, with your patch, we have partial support for UTF-16 in cifs.ko,
but most cifs (and smb2/smb3) servers presumably support UTF-16 on the
wire now.
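
To make the UCS-2 vs UTF-16 distinction concrete, here is a minimal
userspace sketch of the standard surrogate-pair arithmetic. It is not the
code in fs/cifs/cifs_unicode.c or the nls routines; utf16_encode() and
utf16_decode() are hypothetical names chosen for illustration, and the
units are in host byte order (the __le16 conversion needed on the wire is
left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only: utf16_encode() and utf16_decode() are hypothetical
 * helpers, not functions from fs/cifs or the kernel nls code.
 */

/* Encode one code point; returns 16-bit units written (1 or 2), 0 if invalid. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
	assert(out != NULL);

	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;		/* surrogate values are not characters */
	if (cp < 0x10000) {
		out[0] = (uint16_t)cp;	/* BMP: same single unit as UCS-2 */
		return 1;
	}
	if (cp > 0x10FFFF)
		return 0;		/* beyond the Unicode range */

	cp -= 0x10000;			/* 20 bits left to split */
	out[0] = (uint16_t)(0xD800 | (cp >> 10));	/* high surrogate */
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));	/* low surrogate */
	return 2;
}

/* Decode one code point from in[0..len); returns units consumed, 0 on error. */
static size_t utf16_decode(const uint16_t *in, size_t len, uint32_t *cp)
{
	assert(in != NULL && cp != NULL);

	if (len == 0)
		return 0;
	if (in[0] < 0xD800 || in[0] > 0xDFFF) {
		*cp = in[0];		/* plain UCS-2 unit */
		return 1;
	}
	if (in[0] >= 0xDC00)
		return 0;		/* low surrogate with no high one */
	if (len < 2 || in[1] < 0xDC00 || in[1] > 0xDFFF)
		return 0;		/* high surrogate not followed by low */

	*cp = 0x10000 + (((uint32_t)(in[0] - 0xD800) << 10) |
			 (uint32_t)(in[1] - 0xDC00));
	return 2;
}
```

For example, U+1F600 encodes to the two units 0xD83D 0xDE00, while U+00E9
stays a single unit. A UCS-2-only routine would treat each half of that
pair as a separate (unmappable) character, which is the problem the
comment quoted above describes.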

>> We could really use some nls test cases for cifs/smb2/smb3/nfs4 which
>> basically did various file, directory, symlink create/rename/delete
>> operations with various hard to map characters so we can test copying
>> to and from the server and ensure that we get the name mappings right
>> for these (and don't ever regress). Fortunately smb2/smb3 is only
>> unicode so we don't have to deal with mappings to other codepages from
>> utf8
>>
>
> Do you have some framework/hook to put these tests in?
>
> Where did you merge it? I cannot find anything at
> http://gitweb.samba.org/?p=sfrench/cifs-2.6.git;a=summary

It is in the for-next (and for-linus) branches.

http://gitweb.samba.org/?p=sfrench/cifs-2.6.git;a=shortlog;h=refs/heads/for-next




--
Thanks,

Steve