Re: [RFC][PATCHSET v3] non-recursive pathname resolution & RCU symlinks

From: Linus Torvalds
Date: Sun May 17 2015 - 12:44:00 EST


On Sun, May 17, 2015 at 3:55 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> And that is complete crap. Multi-component lookups do make sense; once
> we are at the edge of the area present in dcache, we _know_ there won't
> be any existing mountpoints involved; parsing the components and feeding
> them to fs at once, along with an array of dentries to fill makes perfect
> sense. Why bother with a bunch of roundtrips when we can have one?

Yes, the edges are easier. And yes, it's fine to do components one by one.

Maybe I misunderstood, but I thought that was exactly what Neil
*didn't* want to do, though. It sounded like he wanted to do
path-based lookup, not a component-based one.

But yes, if it's purely about preloading the cache, then *that* should
be reasonably easy. In fact, it should work as-is today, if we just
added a "const char *hint" to the lookup callback which told the
filesystem what will come after this lookup. But it would be a hint
for pre-loading the dcache, nothing more.

So if we have a pathname like "a/b/c" that we don't have in the
dcache, and we're going to look up component "a", we could give "b/c"
as the hint, and a filesystem that currently populates the dcache with
"a" by doing

d_instantiate(dentry, inode);

could decide that *before* it does that "d_instantiate()", it could
pre-populate the child list of 'dentry' with the lookup information
for 'b' (and possibly recursively for 'c' too under 'b').

But you'd still have to do the components one by one, you couldn't
just do the "final" tip.

And no, I absolutely refuse to even entertain the thought of the
filesystem actually doing any of the do_last crap. It would be purely
about pre-populating the dcache deeper than the one single component,
and then the VFS layer would just find the pre-populated dentries and
do the normal thing.

Doing things that way means that not only does do_last() at the vfs
level already do the right thing, but we get all the per-component
semantics (with security checks etc) right, because we'd still be
traversing the pathname one component at a time. It's just the
filesystem that could prime the cache.
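
To make that concrete, here is a toy userspace sketch of the idea. It is
not kernel code: every name in it (toy_dentry, toy_fs_lookup, toy_walk,
fs_lookup_calls) is made up for illustration, and the real lookup
callback has no hint argument today. The "filesystem" gets the rest of
the path as a hint and pre-populates the child's subtree before
attaching it, while the walker still resolves one component at a time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a dentry: a name plus a singly linked child list. */
struct toy_dentry {
    char name[32];
    struct toy_dentry *children;   /* first child */
    struct toy_dentry *next;       /* next sibling */
};

static int fs_lookup_calls;        /* round trips made to the toy "filesystem" */

static struct toy_dentry *toy_alloc(const char *name, size_t len)
{
    struct toy_dentry *d = calloc(1, sizeof(*d));
    assert(d && len < sizeof(d->name));
    memcpy(d->name, name, len);    /* calloc already NUL-terminated it */
    return d;
}

static void toy_attach(struct toy_dentry *parent, struct toy_dentry *child)
{
    child->next = parent->children;
    parent->children = child;
}

static struct toy_dentry *toy_find_child(struct toy_dentry *parent,
                                         const char *name, size_t len)
{
    for (struct toy_dentry *c = parent->children; c; c = c->next)
        if (strlen(c->name) == len && memcmp(c->name, name, len) == 0)
            return c;
    return NULL;
}

/*
 * Hypothetical lookup callback with a hint. It creates the requested
 * component, pre-populates the components named in the hint below it,
 * and only then attaches it to the parent (the step that stands in for
 * d_instantiate() here).
 */
static struct toy_dentry *toy_fs_lookup(struct toy_dentry *parent,
                                        const char *name, size_t len,
                                        const char *hint)
{
    struct toy_dentry *first = toy_alloc(name, len);
    struct toy_dentry *cur = first;

    fs_lookup_calls++;
    while (hint && *hint) {
        size_t n = strcspn(hint, "/");
        if (n > 0) {
            struct toy_dentry *child = toy_alloc(hint, n);
            toy_attach(cur, child);
            cur = child;
        }
        hint += n;
        if (*hint == '/')
            hint++;
    }
    toy_attach(parent, first);
    return first;
}

/* The "VFS" side: still one component at a time, cache first. */
static struct toy_dentry *toy_walk(struct toy_dentry *root, const char *path)
{
    struct toy_dentry *cur = root;

    while (*path) {
        size_t n = strcspn(path, "/");
        const char *rest = path + n;
        if (*rest == '/')
            rest++;
        if (n > 0) {
            /* per-component checks (permissions etc) would go here */
            struct toy_dentry *next = toy_find_child(cur, path, n);
            if (!next)
                next = toy_fs_lookup(cur, path, n, rest);
            cur = next;
        }
        path = rest;
    }
    return cur;
}
```

Walking "a/b/c" from an empty root with toy_walk() calls the toy
filesystem once, for "a" with the hint "b/c". The walker then finds "b"
and "c" already in the cache and handles them like any other cached
component.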

If *that* was what Neil wanted to do (rather than do "a/b/c" as one
single lookup to the server), then I withdraw all my complaints and am
sorry for having misunderstood.

Linus