I've been trying to use the utf8/unicode functions to do some charset
conversions. (And since your name was conveniently located at the top of
the source file ... :)
utf8_mbstowcs fails on the following byte sequence:
c3 a5 c3 a4 c3 b6
(utf8 form of "åäö", or 0x00e5 0x00e4 0x00f6)
op is a __u16 *, whereas ip is a __u8 *, yet both are advanced by the
same value (size). I believe the following is correct, since utf8_mbtowc
writes only one 2-byte character per call.
--- linux-2.2.7/fs/nls/nls_base.c Fri Dec 18 17:09:51 1998
+++ linux/fs/nls/nls_base.c Thu May 6 22:28:22 1999
@@ -89,7 +89,7 @@
ip++;
n--;
} else {
- op += size;
+ op++;
ip += size;
n -= size;
}
Do you agree?
I suppose something like (size >> 1) is more in line with the rest of the
else ...
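
To show the pointer arithmetic outside the kernel, here is a minimal
user-space sketch. It is not the code in nls_base.c: decode_one and convert
are hypothetical stand-ins for utf8_mbtowc and utf8_mbstowcs, the decoder
only handles sequences of up to three bytes, and it does no overlong-form
checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for utf8_mbtowc: decode one UTF-8 sequence of up to
 * three bytes into *wc. Returns the number of input bytes consumed, or -1 if
 * the sequence is malformed or truncated. No overlong-form checks. */
int decode_one(uint16_t *wc, const uint8_t *s, size_t n)
{
    if (n == 0)
        return -1;
    if (s[0] < 0x80) {
        *wc = s[0];
        return 1;
    }
    if ((s[0] & 0xe0) == 0xc0 && n >= 2 && (s[1] & 0xc0) == 0x80) {
        *wc = (uint16_t)(((s[0] & 0x1f) << 6) | (s[1] & 0x3f));
        return 2;
    }
    if ((s[0] & 0xf0) == 0xe0 && n >= 3 &&
        (s[1] & 0xc0) == 0x80 && (s[2] & 0xc0) == 0x80) {
        *wc = (uint16_t)(((s[0] & 0x0f) << 12) |
                         ((s[1] & 0x3f) << 6) | (s[2] & 0x3f));
        return 3;
    }
    return -1;
}

/* Hypothetical stand-in for utf8_mbstowcs: convert n input bytes into at
 * most out_len 16-bit characters. Returns the number of characters written. */
size_t convert(uint16_t *out, size_t out_len, const uint8_t *in, size_t n)
{
    uint16_t *op = out;
    const uint8_t *ip = in;

    assert(out != NULL && in != NULL);

    while (n > 0 && (size_t)(op - out) < out_len) {
        int size = decode_one(op, ip, n);
        if (size < 0) {
            /* Skip one undecodable byte without emitting a character. */
            ip++;
            n--;
        } else {
            op++;               /* one output character, whatever size was */
            ip += size;         /* size input bytes consumed */
            n -= (size_t)size;
        }
    }
    return (size_t)(op - out);
}
```

Calling convert on the six bytes above writes 0x00e5 0x00e4 0x00f6 to the
first three output slots and returns 3. If op++ in this sketch is replaced
by op += size, the same input leaves the characters in slots 0, 2 and 4 and
the return value becomes 6.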
/Urban
--- Urban Widmark urban@svenskatest.se Svenska Test AB +46 90 71 71 23