Re: [PATCH 14/20] VFS/namei: add 'inode' arg to put_link().

From: Al Viro
Date: Fri Apr 17 2015 - 12:25:44 EST


On Mon, Mar 23, 2015 at 01:37:40PM +1100, NeilBrown wrote:
> @@ -1669,13 +1669,14 @@ static inline int nested_symlink(struct path *path, struct nameidata *nd)
>
> do {
> struct path link = *path;
> + struct inode *inode = link.dentry->d_inode;
> void *cookie;
>
> res = follow_link(&link, nd, &cookie);
> if (res)
> break;
> res = walk_component(nd, path, LOOKUP_FOLLOW);
> - put_link(nd, &link, cookie);
> + put_link(nd, &link, inode, cookie);
> } while (res > 0);

That's really unpleasant - it means increased stack footprint in the
recursion.

Damn, maybe it's time to bite the bullet and kill the recursion completely...

What do we really need to save across the recursive call?
* how far did we get in the previous pathname
* data needed for put_link:
cookie
link body
dentry of link
vfsmount (to pin containing fs; non-RCU) or inode (RCU)

We are already saving link body in nameidata, so we could fatten that array.
It would allow flattening link_path_walk() completely - instead of
recursive call we would just save what needed saving and jump to the beginning
and on exits we'd check the depth and either return or restore the saved state
and jump back to just past the place where recursive call used to be.
It would even save quite a bit of space in the worst case. However, it would
blow the stack footprint in normal cases *and* blow it even worse for the
things that need two struct nameidata instances at once (rename(), basically).
5 pointers instead of 1 pointer per level - extra 32 words on stack, i.e.
extra 256 bytes on 64bit. Extra 0.5Kb of stack footprint on rename() is
probably too much, especially since this "saved" stuff from its two nameidata
instances will never be used at the same time...

Alternatively, we could just allocate about a page worth of an array when
the depth of nesting goes beyond 2 and put this saved stuff there - at
5 pointers per level it would completely dispose of the depth of nesting
limit, giving us uniform "can't traverse more than 40 symlinks per pathname
resolution". 40 * 5 * sizeof(pointer) is what, at most 1600 bytes? So
even half a page would suffice for that quite comfortably...

The question is whether we'll be able to avoid blowing the I-cache footprint
of link_path_walk() to hell while doing that; it feels like we should be,
but we'll have to see how well does that work in reality...

I'll try to implement that (with your #3..#7 as the first steps) and see
how well does it work; it's obviously the next cycle fodder, but hopefully
in testable shape by -rc2...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/