Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

From: Al Viro
Date: Sun May 17 2015 - 06:56:56 EST


On Sat, May 16, 2015 at 09:04:34PM -0700, Linus Torvalds wrote:

> It's now about things like overlayfs etc, all those things.

Er... Bad example, that - overlayfs is _not_ fs-agnostic.

> When somebody does a lookup of a filename, it is not a "pass this
> filename to the filesystem". It very much *is* a
> component-by-component lookup. And in the *vast* majority of the
> cases, the cached lookup when you don't even get asked is absolutely
> the right thing to do, and doing anything else wouldn't just be wrong,
> it would be completely and utterly stupid.
>
> And the fact that somebody doesn't understand that, and has designed
> bad extensions to do multi-component lookup, isn't actually an
> argument against the dcache. It's just an argument for "people make
> bad interfaces because they hack things up and don't understand
> things".

And that is complete crap. Multi-component lookups do make sense; once
we are at the edge of the area present in dcache, we _know_ there won't
be any existing mountpoints involved; parsing the components and feeding
them to the fs all at once, along with an array of dentries to fill, makes
perfect sense. Why bother with a bunch of roundtrips when we can have one?
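
To put rough numbers on the roundtrip argument, here is a toy userspace
sketch. It is not kernel code and not a proposed interface: toy_fs,
toy_dentry, toy_lookup_one() and toy_lookup_many() are all made-up names,
and a "roundtrip" is nothing more than a counter. It only assumes that the
uncached tail of the path has already been cut off from the cached part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMPONENTS 16
#define MAX_NAME 64

/* Hypothetical stand-in for a dentry: a name plus a "filled in" flag. */
struct toy_dentry {
    char name[MAX_NAME];
    int filled;
};

/* Hypothetical stand-in for a remote filesystem; it only counts requests. */
struct toy_fs {
    unsigned roundtrips;
};

/* Split "a/b/c" into components; returns how many were stored in out[]. */
size_t split_path(const char *path, struct toy_dentry *out, size_t max)
{
    size_t n = 0;

    assert(path != NULL && out != NULL);
    while (n < max) {
        size_t len, copy;

        path += strspn(path, "/");   /* skip any run of separators */
        len = strcspn(path, "/");    /* length of the next component */
        if (len == 0)
            break;                   /* nothing left */
        copy = len < (size_t)(MAX_NAME - 1) ? len : (size_t)(MAX_NAME - 1);
        memcpy(out[n].name, path, copy);
        out[n].name[copy] = '\0';
        out[n].filled = 0;
        n++;
        path += len;
    }
    return n;
}

/* One request per component. */
void toy_lookup_one(struct toy_fs *fs, struct toy_dentry *d)
{
    fs->roundtrips++;
    d->filled = 1;
}

/* One request for the whole tail, filling the array of dentries. */
void toy_lookup_many(struct toy_fs *fs, struct toy_dentry *d, size_t n)
{
    size_t i;

    if (n == 0)
        return;                      /* nothing to ask about */
    fs->roundtrips++;
    for (i = 0; i < n; i++)
        d[i].filled = 1;
}

/* Cost of walking the uncached tail one component at a time. */
unsigned walk_tail_stepwise(const char *tail)
{
    struct toy_dentry d[MAX_COMPONENTS];
    struct toy_fs fs = { 0 };
    size_t i, n = split_path(tail, d, MAX_COMPONENTS);

    for (i = 0; i < n; i++)
        toy_lookup_one(&fs, &d[i]);
    return fs.roundtrips;
}

/* Cost of handing the whole uncached tail to the fs at once. */
unsigned walk_tail_batched(const char *tail)
{
    struct toy_dentry d[MAX_COMPONENTS];
    struct toy_fs fs = { 0 };
    size_t n = split_path(tail, d, MAX_COMPONENTS);

    toy_lookup_many(&fs, d, n);
    return fs.roundtrips;
}
```

In this model a five-component tail costs five requests stepwise and one
when batched. A real filesystem would also have to report how far it got
and handle symlinks and errors partway through, which the toy ignores.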

Another thing is that the likely situation with revalidations is that the
damn thing had been sitting around long enough for us to want to check with
the server, only to get "it's still OK" in response. Again, when that
happens it makes perfect sense to do a speculative walk for more than one
component ("if everything's valid, the next 5 components would be these
and they all are on that filesystem; hey, server - are these 5 still valid?")
and avoid pointless roundtrips.
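
The same idea for revalidation, again as a toy userspace sketch under
loudly stated assumptions: toy_entry, toy_server and the two revalidate
helpers are invented for illustration, "validity" is a generation number
compare, and the server's view is stored inline in each entry so that the
example is self-contained. None of this is an existing kernel or NFS
interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cached component. */
struct toy_entry {
    int fs_id;       /* filesystem the cached dentry lives on */
    int cached_gen;  /* generation seen when it was cached */
    int server_gen;  /* what the server has now (toy: kept inline) */
};

/* Hypothetical server; it only counts requests. */
struct toy_server {
    unsigned roundtrips;
};

/* One request: is this single entry still valid? */
int toy_revalidate_one(struct toy_server *srv, const struct toy_entry *e)
{
    srv->roundtrips++;
    return e->cached_gen == e->server_gen;
}

/* One request covering n entries; returns how many leading ones are valid. */
size_t toy_revalidate_many(struct toy_server *srv,
                           const struct toy_entry *e, size_t n)
{
    size_t i;

    srv->roundtrips++;
    for (i = 0; i < n && e[i].cached_gen == e[i].server_gen; i++)
        ;
    return i;
}

/* Length of the run starting at e[0] that stays on one filesystem. */
size_t same_fs_run(const struct toy_entry *e, size_t n)
{
    size_t i;

    if (n == 0)
        return 0;
    for (i = 1; i < n && e[i].fs_id == e[0].fs_id; i++)
        ;
    return i;
}

/* Revalidate component by component; returns the valid prefix length. */
size_t walk_stepwise(const struct toy_entry *e, size_t n, unsigned *roundtrips)
{
    struct toy_server srv = { 0 };
    size_t i;

    assert(roundtrips != NULL);
    for (i = 0; i < n; i++)
        if (!toy_revalidate_one(&srv, &e[i]))
            break;
    *roundtrips = srv.roundtrips;
    return i;
}

/* Revalidate speculatively, one request per same-filesystem run. */
size_t walk_speculative(const struct toy_entry *e, size_t n,
                        unsigned *roundtrips)
{
    struct toy_server srv = { 0 };
    size_t done = 0;

    assert(roundtrips != NULL);
    while (done < n) {
        size_t run = same_fs_run(e + done, n - done);
        size_t ok = toy_revalidate_many(&srv, e + done, run);

        done += ok;
        if (ok < run)
            break;               /* stale entry found; stop the walk here */
    }
    *roundtrips = srv.roundtrips;
    return done;
}
```

With five still-valid components on one filesystem, the stepwise walk in
this model makes five requests and the speculative one makes a single
request. If the third component has gone stale, both stop after two valid
components, at a cost of three requests versus one.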

As for Neil's point re do_last() and friends being much too convoluted - yes,
they are. And it's not a result of trying to shoehorn everything in one
model. "Just let NFS have at it" as soon as we reach do_last() won't make
things any simpler, unfortunately - we'll end up with the same convoluted
mess copied in a bunch of filesystems, each with bugs of its own.

Don't get me wrong - do_last/lookup_open/atomic_open complex is not a pretty
sight and it does need cleaning up. Some of that got done lately, but that
was only a tangential hit. Untangling the whole damn thing won't be fun,
but we'll need to do it. Not this cycle, though - we already have more than
enough patches piled in fs/namei.c as it is...