Re: [PATCH] vfs: fix race in rcu lookup of pruned dentry

From: Linus Torvalds
Date: Mon Jul 18 2011 - 15:34:28 EST


On Mon, Jul 18, 2011 at 12:04 PM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> Al has commented on that.  I'd feel more confident with a patch like
> yours, and corrected final check in mine (if we used mine at all)
>        if (!read_seqcount_retry(&path->dentry->d_seq, nd->seq);
> but ignore me, I'm easily confused by mounts ;)

No, I do think I want to do it. So for post-3.0 (likely today ;), my
current plan is to have three commits in this area:

- the one-liner removal of d_inode = NULL in d_kill (but I'll add a comment)

- the patch to properly do that dentry sequence number switch-over
across the mount-point traversal

- And possibly *some* version of your patch.

However, I'm actually looking at __d_lookup_rcu(), and it does the proper

if (read_seqcount_retry(&dentry->d_seq, *seq))
goto seqretry;

*after* having pulled the inode pointer out of "dentry->d_inode". So
I'm not actually seeing how the inode pointer could be stale at all
right now, because we do seem to have the sequence number check there!

Which just makes me suspect that it's *another* case of bug in the
dentry_kill() sequence. I think the dentry_rcuwalk_barrier() thing is
crap: it doesn't *cover* the d_inode change with the write lock, it
just puts that "barrier" there, which is bogus. You can get a
successful read of the changed inode without ever failing the read
lock sequence!

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/