Re: [RFC] fhandle implementation.

From: Kai Henningsen (kaih@khms.westfalen.de)
Date: Sun Jun 25 2000 - 11:17:00 EST


neilb@cse.unsw.edu.au (Neil Brown) wrote on 21.06.00 in <14672.2317.470754.944440@notabene.cse.unsw.edu.au>:

> Just reporting facts, not necessary defending them, the very latest
> draft, draft 7, adds a new paragraph:
>
> 5.7. Character Case Attributes
>
> With respect to the case_insensitive and case_preserving attributes,
> each UCS-4 character (which UTF-8 encodes) has a "long descriptive
> name" [RFC1345] which may or may not included the word "CAPITAL" or
> "SMALL". The presence of SMALL or CAPITAL allows an NFS server to
> implement unambiguous and efficient table driven mappings for case
> insensitive comparisons, and non-case-preserving storage. For
> general character handling and internationalization issues, see the
> section "Internationalization".

Whoever wrote this has never read the Unicode standard. (Which is the
right one to consult for case-independence - AFAIK 10646 doesn't talk
about details like that.)

There *are* default case-translation tables, but

1. they do not rely on playing games with character names, and
2. they are not adjusted to locales, whereas many applications (no doubt
   including some filesystems) will want to adjust to a locale.

Furthermore,

3. when doing UCS-4, ignoring case is really not enough. People with Mac
   programming experience might recall the various comparison routines
   used there; ignoring accents is one common ingredient, and there are no
   doubt more.
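
To make the three points above concrete, here is a small illustrative
sketch. It is not taken from the draft or from the Unicode reports; it
assumes Python's standard unicodedata module and str.lower() as stand-ins
for the Unicode character names and the default (locale-independent) case
tables. The variable names are made up for the example.

```python
import unicodedata

# Point 1: character names are a poor basis for case tables.
# U+01C5 is a titlecase digraph whose name contains BOTH words.
titlecase_dz = "\u01c5"
name = unicodedata.name(titlecase_dz)
name_is_ambiguous = "CAPITAL" in name and "SMALL" in name

# The default tables still give it a well-defined lower-case form (U+01C6).
lowered_dz = titlecase_dz.lower()

# Point 2: the default mapping is not locale-aware. It always maps
# "I" to "i", whereas a Turkish locale would want dotless U+0131.
default_lower_i = "I".lower()

# Point 3: lowering case does nothing about accents.
accent_survives = "\u00c9".lower() != "e"  # U+00C9 lowers to U+00E9, not "e"
```

So a name-based table has to special-case characters like U+01C5, a
locale-aware one has to override the default for "I", and neither helps
with accents.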

If I had to write locale-independent "case ignoring" routines for UCS-4,
I'd probably start by converting both strings to normalization form KD,
then ignore all combining marks, and somewhere in there also ignore case,
possibly by converting to lower case. Yes, it's a lot of work. Details are
at <URL:http://www.unicode.org/unicode/reports/tr15/> and
<URL:http://www.unicode.org/unicode/reports/tr21/>.
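
The steps just described can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not a vetted
implementation: it uses Python's unicodedata module for normalization
form KD, treats every character in a general category starting with "M"
as a combining mark, and uses plain str.lower() for the case step. The
function names loose_key and loose_equal are hypothetical.

```python
import unicodedata


def loose_key(s: str) -> str:
    """Build a locale-independent comparison key for s (a sketch)."""
    # Step 1: convert to normalization form KD (compatibility decomposition).
    decomposed = unicodedata.normalize("NFKD", s)
    # Step 2: ignore all combining marks (categories Mn, Mc, Me).
    stripped = "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )
    # Step 3: ignore case by converting to lower case.
    return stripped.lower()


def loose_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring case and combining marks."""
    return loose_key(a) == loose_key(b)
```

For example, loose_equal("R\u00e9sum\u00e9", "RESUME") holds, and so does
loose_equal("\ufb01le", "FILE") because form KD expands the "fi" ligature.
Plain lower-casing still leaves gaps: "Stra\u00dfe" and "STRASSE" do not
compare equal here, which is the kind of detail full case folding is meant
to cover.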

If I had to write the same for locale dependence, I'd slit my wrists.

MfG Kai
