Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

From: Linus Torvalds
Date: Thu May 14 2015 - 20:25:53 EST

Next message: Stephen Boyd: "Re: [PATCH v4 3/5] clk: hi6220: Document devicetree bindings for hi6220 clock"
Previous message: Stephen Boyd: "Re: [PATCH v4 4/5] clk: hi6220: Clock driver support for Hisilicon hi6220 SoC"
In reply to: Al Viro: "Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks"
Next in thread: Al Viro: "Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks"

On Thu, May 14, 2015 at 4:36 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, May 14, 2015 at 04:24:13PM -0700, Linus Torvalds wrote:
>
>> So ASCII-only case-insensitivity is sufficient for you guys?
>>
>> Doing case-insensitive lookups at a vfs layer level wouldn't be
>> impossible (add some new lookup flag, so it would *not* be
>> per-filesystem, it would be per-operation!),
>
> ENOPARSE. Either two names are equivalent or they are not; it's not a
> per-operation thing. What do you mean?

We can easily make things per-operation, by adding another flag. We
already have per-operation flags like LOOKUP_FOLLOW, which decides if
we follow the last symlink or not. We could add a LOOKUP_ICASE, which
decides whether we compare case or not. Obviously, we'd have to add the
proper O_ICASE for open (and AT_ICASE for fstatat() and friends).
Exactly like we do for LOOKUP_FOLLOW.
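
A minimal sketch of what "per-operation" could look like, assuming
made-up flag values. None of the SKETCH_* names or numbers below exist
in the kernel; they only mirror the way an AT_* flag from userspace is
translated into a LOOKUP_* flag for one path walk:

```c
/* All names and values here are hypothetical, for illustration only. */
#define SKETCH_AT_SYMLINK_NOFOLLOW 0x100u   /* stands in for the AT_ flag */
#define SKETCH_AT_ICASE            0x10000u /* proposed, does not exist */
#define SKETCH_LOOKUP_FOLLOW       0x1u     /* follow the last symlink */
#define SKETCH_LOOKUP_ICASE        0x2u     /* proposed, does not exist */

/* Translate per-call AT_* style flags into per-walk LOOKUP_* style flags. */
static unsigned int sketch_lookup_flags(unsigned int at_flags)
{
	unsigned int lookup = 0;

	/* Following the final symlink is the default unless the caller opts out. */
	if (!(at_flags & SKETCH_AT_SYMLINK_NOFOLLOW))
		lookup |= SKETCH_LOOKUP_FOLLOW;

	/* Case-insensitive matching is opt-in, and only for this one call. */
	if (at_flags & SKETCH_AT_ICASE)
		lookup |= SKETCH_LOOKUP_ICASE;

	return lookup;
}
```

The point is only that the choice travels with the call, not with the
filesystem or the dentry.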

HOWEVER.

The reason ASCII-only matters is two-fold:

(a) hashing needs to work, and hash all equivalent names to the same
bucket. And we need to hash the same *regardless* of whether the
operation was done with ICASE or not.

With ASCII, this is fairly easy: we could easily make the hashing
just mask bit 5 in each byte, and that wouldn't slow us down at all,
and it would hardly change the hash effectiveness either.

In particular, with ASCII, we can trivially still do the
word-at-a-time hashing. So there's fairly little downside.
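
A rough sketch of the bit-5 masking idea, eight bytes at a time. The
mixing step is a toy, not the kernel's dentry hash, and fold_hash() is
a name invented for this example. Clearing bit 5 also folds some
non-letters together ('[' and '{', for instance), which only costs a
few extra hash collisions:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 0xdf in every byte lane: clears bit 5 (0x20) of each byte at once. */
#define FOLD_WORD 0xdfdfdfdfdfdfdfdfULL

/* Toy multiplicative mix; an assumption, not what the kernel uses. */
#define MIX_CONST 0x9e3779b97f4a7c15ULL

static uint64_t fold_hash(const char *name, size_t len)
{
	uint64_t hash = 0;
	uint64_t w;

	/* Full words: load 8 bytes, mask bit 5 in all of them, mix. */
	while (len >= sizeof(w)) {
		memcpy(&w, name, sizeof(w));
		hash = (hash + (w & FOLD_WORD)) * MIX_CONST;
		name += sizeof(w);
		len -= sizeof(w);
	}

	/* Tail: zero-pad the last partial word and treat it the same way. */
	if (len) {
		w = 0;
		memcpy(&w, name, len);
		hash = (hash + (w & FOLD_WORD)) * MIX_CONST;
	}
	return hash;
}
```

With this, fold_hash("Makefile", 8) and fold_hash("MAKEFILE", 8) land
on the same value whether or not the lookup asked for ICASE.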

(b) The *compare* needs to work too. In particular, right now we very
much try to avoid comparing the names by checking both the full hash
and the name length. Again, that's fine with ASCII - two names that
differ in case are the same length.

And again, we can still use the word-at-a-time compare, just have
a mask (and at compare time, we can make the mask depend on ICASE).
Sure, you'll still have to do a more careful compare (because
case-insensitivity is not *just* "same except for bit 5" even in
ASCII), but we can trivially have an ICASE test up front, and keep the
fast case exactly the same as before.
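
A sketch of that compare, under the assumption of a hypothetical
SKETCH_LOOKUP_ICASE flag. It is written byte-at-a-time for clarity; a
masked word-at-a-time pass could sit in front of it as a cheap filter.
The case-sensitive path stays a plain memcmp():

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_LOOKUP_ICASE 0x2u /* hypothetical flag, not in the kernel */

/* Lower-case ASCII letters only; every other byte is left alone. */
static unsigned char ascii_lower(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c | 0x20) : c;
}

static bool sketch_name_equal(const char *a, size_t alen,
			      const char *b, size_t blen,
			      unsigned int flags)
{
	/* ASCII case variants always have the same length. */
	if (alen != blen)
		return false;

	/* ICASE test up front: the fast case is exactly as before. */
	if (!(flags & SKETCH_LOOKUP_ICASE))
		return memcmp(a, b, alen) == 0;

	for (size_t i = 0; i < alen; i++) {
		unsigned char x = (unsigned char)a[i];
		unsigned char y = (unsigned char)b[i];

		if (x == y)
			continue;
		/* Anything other than a pure bit-5 difference is a mismatch. */
		if ((x ^ y) != 0x20)
			return false;
		/* Bit 5 alone is not enough: '@' vs '`' must not match. */
		if (ascii_lower(x) != ascii_lower(y))
			return false;
	}
	return true;
}
```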

Now, doing full UTF-8 is *much* harder. Part of it is that outside of
ASCII, you literally have cases that are ambiguous. Part of it is that
outside of ASCII, now the lengths aren't even guaranteed to match. And
part of it is that now you have to do things that are much more
complex than just masking bits in parallel for multiple bytes at the
same time (although you can still have a fast-path that depends on
just masking the high bit, to at least say "this is just the ASCII
subcase").
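
The "is this just the ASCII subcase" test can be sketched like this:
check the high bit of eight bytes at once and only fall back to the
slow path when one is set. sketch_name_is_ascii() is a made-up name,
and this is an illustration rather than kernel code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 0x80 in every byte lane: any set bit means a non-ASCII byte. */
#define HIGH_BITS 0x8080808080808080ULL

static bool sketch_name_is_ascii(const char *name, size_t len)
{
	uint64_t w;

	/* Word-at-a-time: one AND covers eight bytes. */
	while (len >= sizeof(w)) {
		memcpy(&w, name, sizeof(w));
		if (w & HIGH_BITS)
			return false;
		name += sizeof(w);
		len -= sizeof(w);
	}

	/* Remaining tail bytes, one at a time. */
	while (len--) {
		if ((unsigned char)*name++ & 0x80)
			return false;
	}
	return true;
}
```

A name that passes can take the cheap masked path; anything else would
need the full (and much more expensive) UTF-8 handling.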

But doing ASCII ICASE compares wouldn't be that hard, and wouldn't
affect performance.

Btw, don't get me wrong. I'm not saying it's a great idea. I think
icase compares are stupid. Really really stupid. But samba might be
worth jumping through a few hoops for. The real problem is that even
with just ASCII, it does make it much easier to create nasty hash
collisions in the dentry hashes (same hash from 256 variations of
aAaAAaaA - just repeat the same letter in different variations of
lower/upper case).
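
To make the 256 figure concrete, here is a small sketch that walks all
2^8 upper/lower variants of an eight-letter name and counts how many
share one hash once bit 5 is masked. The hash is a toy stand-in, not
the kernel's, and both function names are invented for the example:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy case-folding hash of a single 8-byte word (an assumption). */
static uint64_t fold_word_hash(const char name[8])
{
	uint64_t w;

	memcpy(&w, name, sizeof(w));
	return (w & 0xdfdfdfdfdfdfdfdfULL) * 0x9e3779b97f4a7c15ULL;
}

/* Count the case variants of "AAAAAAAA" that hash like the original. */
static unsigned int count_colliding_variants(void)
{
	char base[8];
	uint64_t target;
	unsigned int same = 0;

	memset(base, 'A', sizeof(base));
	target = fold_word_hash(base);

	/* Bit i of 'bits' picks lower or upper case for letter i. */
	for (unsigned int bits = 0; bits < 256; bits++) {
		char name[8];

		for (int i = 0; i < 8; i++)
			name[i] = ((bits >> i) & 1) ? 'a' : 'A';
		if (fold_word_hash(name) == target)
			same++;
	}
	return same;
}
```

Every variant collides, by construction, so anyone who can create
files gets 256 distinct names in one hash chain for free.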

So even plain ASCII icase has some real problems. But it's
conceptually not that hard. True UTF-8 icase? That's an absolute
*nightmare*, and causes serious problems. OS X got it very very wrong,
for example, by messing up the normalization.

Linus