Re: [PATCH] namei: results of d_is_negative() should be checked after dentry revalidation

From: Al Viro
Date: Sat Oct 10 2015 - 13:13:28 EST


On Sat, Oct 10, 2015 at 02:36:57AM +0100, Al Viro wrote:
> On Fri, Oct 09, 2015 at 05:19:02PM -0700, Linus Torvalds wrote:
>
> > So in general, we should always (a) either verify all sequence points
> > or (b) return -ECHILD to go into slow mode. The patch seems
> >
> > However, this thing was explicitly made to be this way by commit
> > 766c4cbfacd8 ("namei: d_is_negative() should be checked before ->d_seq
> > validation"), so while my gut feel is to consider this fix
> > ObviouslyCorrect(tm), I will delay it a bit in the hope to get an ACK
> > and comment from Al about the patch.
> >
> > Al?
>
> Umm... I agree that the current version is wrong and it looks like this
> patch is a complete fix. The only problem is the commit message -
> what really happens is that 766c4cbfacd8 got the things subtly wrong.
> We used to treat d_is_negative() after lookup_fast() as "fail with ENOENT".
> That was wrong - checking ->d_flags outside of ->d_seq protection is
> unreliable and failing with hard error on what should've fallen back to
> non-RCU pathname resolution is a bug.
>
> Unfortunately, we'd pulled the test too far up and ran afoul of another
> kind of staleness. Dentry might have been absolutely stable from the
> RCU point of view (and we might be on UP, etc.), but stale from the
> remote fs point of view. If ->d_revalidate() returns "it's actually
> stale", dentry gets thrown away and original code wouldn't even have looked
> at its ->d_flags. What we need is to check ->d_flags where 766c4cbfacd8 does
> (prior to ->d_seq validation) but only use the result in cases where we
> do not discard this dentry outright.
>
> With some explanation along the lines of the above added, consider the patch
> ACKed.
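
To spell out the ordering described above, here is a rough user-space sketch.
It is not the code in fs/namei.c: struct sketch_dentry, sketch_lookup_rcu()
and SKETCH_FALLBACK are hypothetical names made up for illustration, and
->d_revalidate() is reduced to a precomputed boolean. The only point is the
order of the three steps: sample negativity, validate the sequence count,
and act on the sampled value only if the dentry survived revalidation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dentry; NOT the kernel's struct dentry. */
struct sketch_dentry {
	unsigned seq;        /* models ->d_seq */
	bool negative;       /* models what d_is_negative() reads from ->d_flags */
	bool has_revalidate; /* models "filesystem supplies ->d_revalidate()" */
	bool revalidate_ok;  /* models the verdict of ->d_revalidate() */
};

/* Hypothetical result codes for this sketch only. */
enum {
	SKETCH_FOUND = 0,    /* dentry is usable */
	SKETCH_FALLBACK = 1, /* dentry discarded, retry via the slow path */
};

static int sketch_lookup_rcu(const struct sketch_dentry *d, unsigned seq_snapshot)
{
	/* 1. Sample negativity before the seq check, so the check covers it. */
	bool negative = d->negative;

	/* 2. If the dentry changed under us, the sample is unreliable. */
	if (d->seq != seq_snapshot)
		return -ECHILD;

	/* 3. A dentry the filesystem calls stale is thrown away; its
	 *    sampled flags must not turn into a hard error. */
	if (d->has_revalidate && !d->revalidate_ok)
		return SKETCH_FALLBACK;

	/* 4. Only now is it safe to act on the sampled value. */
	if (negative)
		return -ENOENT;

	return SKETCH_FOUND;
}
```

In this model a negative dentry that revalidation rejects yields
SKETCH_FALLBACK rather than -ENOENT, which is the case the earlier ordering
got wrong.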

OK, I've attempted to add an explanation of what's going on; please, pull from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Trond Myklebust (1):
namei: results of d_is_negative() should be acted upon only after dentry revalidation

Diffstat:
fs/namei.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
