Re: [PATCH 14/20] VFS/namei: add 'inode' arg to put_link().
From: Al Viro
Date: Fri Apr 17 2015 - 15:09:19 EST
On Fri, Apr 17, 2015 at 05:25:36PM +0100, Al Viro wrote:
> On Mon, Mar 23, 2015 at 01:37:40PM +1100, NeilBrown wrote:
> > @@ -1669,13 +1669,14 @@ static inline int nested_symlink(struct path *path, struct nameidata *nd)
> >
> > do {
> > struct path link = *path;
> > + struct inode *inode = link.dentry->d_inode;
> > void *cookie;
> >
> > res = follow_link(&link, nd, &cookie);
> > if (res)
> > break;
> > res = walk_component(nd, path, LOOKUP_FOLLOW);
> > - put_link(nd, &link, cookie);
> > + put_link(nd, &link, inode, cookie);
> > } while (res > 0);
>
> That's really unpleasant - it means increased stack footprint in the
> recursion.
>
> Damn, maybe it's time to bite the bullet and kill the recursion completely...
>
> What do we really need to save across the recursive call?
> * how far did we get in the previous pathname
> * data needed for put_link:
> cookie
> link body
> dentry of link
> vfsmount (to pin containing fs; non-RCU) or inode (RCU)
>
> We are already saving link body in nameidata, so we could fatten that array.
> It would allow flattening link_path_walk() completely - instead of
> recursive call we would just save what needed saving and jump to the beginning
> and on exits we'd check the depth and either return or restore the saved state
> and jump back to just past the place where recursive call used to be.
> It would even save quite a bit of space in the worst case. However, it would
> blow the stack footprint in normal cases *and* blow it even worse for the
> things that need two struct nameidata instances at once (rename(), basically).
> 5 pointers instead of 1 pointer per level - extra 32 words on stack, i.e.
> extra 256 bytes on 64bit. Extra 0.5Kb of stack footprint on rename() is
> probably too much, especially since this "saved" stuff from its two nameidata
> instances will never be used at the same time...
>
> Alternatively, we could just allocate about a page worth of an array when
> the depth of nesting goes beyond 2 and put this saved stuff there - at
> 5 pointers per level it would completely dispose of the depth of nesting
> limit, giving us uniform "can't traverse more than 40 symlinks per pathname
> resolution". 40 * 5 * sizeof(pointer) is what, at most 1600 bytes? So
> even half a page would suffice for that quite comfortably...
>
> The question is whether we'll be able to avoid blowing the I-cache footprint
> of link_path_walk() to hell while doing that; it feels like we should be,
> but we'll have to see how well does that work in reality...
>
> I'll try to implement that (with your #3..#7 as the first steps) and see
> how well does it work; it's obviously the next cycle fodder, but hopefully
> in testable shape by -rc2...
Hmm... Actually, right now we have 192 bytes of stack footprint per
nesting level (amd64 allmodconfig). Which means that simply making the
array fatter would give a clean benefit at the 3rd level of recursion (symlink
encountered while traversing a symlink) for everything other than rename()...
allnoconfig+64bit gives 160 bytes per level, with the same breakeven point.
Interesting... It might even make sense to separate that array from
struct nameidata and solve the rename() problem that way (current->nameidata
would be replaced with pointer to that sucker in such variant, of course, and
->depth would move there). In that variant we do not get rid of nesting limit
completely, but it would probably be simpler than the one above...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/