Re: [PATCH] vfs: fix race in rcu lookup of pruned dentry

From: Al Viro
Date: Sun Jul 17 2011 - 20:25:38 EST


On Sun, Jul 17, 2011 at 04:47:45PM -0700, Hugh Dickins wrote:
> On Sun, 17 Jul 2011, Linus Torvalds wrote:
> > On Sun, Jul 17, 2011 at 4:16 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > OR
> > >
> > > - keep part of the patch from Hugh, treating negative in RCU mode as
> > > "need to unlazy".
> >
> > No, urgh, that's horrible.
> >
> > Not being able to do an RCU lookup of negative dentries would be
> > really sad. There are some loads where a negative dentry is the
> > *common* case.
>
> Yes, that worried me too.

Grr... What do you think we need to do when we find a negative dentry in
RCU lookup? We can
* check its ->d_seq, to see if it's valid
* somehow (e.g. with what Linus suggested) try to walk further and
leave staleness checks for dentries we would encounter later in the walk.

If it's at the end of the pathname, there is no choice at all - we need to
check ->d_seq of this one. Right? And we do exactly that. If we see
that it's for real, fine, we've got ourselves a reference to a negative dentry
and can go ahead. If it's something like stat(2), we'll just do dput()
and bugger off. Refcounts of everything on the path to it are unaffected,
no global locks played with...
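
For illustration only, the ->d_seq validation described above can be
sketched in userspace C roughly as follows. This is not the kernel code:
struct toy_dentry and every toy_* function are hypothetical names, loosely
modelled on the usual sequence-counter read pattern (sample the counter,
read the protected field, then confirm the counter has not moved).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: a sequence counter plus the
 * inode pointer it protects. Not the kernel's struct dentry. */
struct toy_dentry {
	atomic_uint seq;	/* even: stable, odd: writer in progress */
	void *_Atomic inode;	/* NULL means "negative" */
};

/* Sample the counter, waiting out any writer that is mid-update. */
static unsigned toy_read_begin(struct toy_dentry *d)
{
	unsigned s;

	while ((s = atomic_load_explicit(&d->seq, memory_order_acquire)) & 1)
		;
	return s;
}

/* True if the counter moved since it was sampled, i.e. what we read
 * in between cannot be trusted. */
static bool toy_read_retry(struct toy_dentry *d, unsigned s)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&d->seq, memory_order_relaxed) != s;
}

/* Writer side: bump to odd, change the inode, bump back to even. */
static void toy_write_inode(struct toy_dentry *d, void *inode)
{
	atomic_fetch_add_explicit(&d->seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->inode, inode, memory_order_relaxed);
	atomic_fetch_add_explicit(&d->seq, 1, memory_order_release);
}

enum toy_result { TOY_NEGATIVE, TOY_POSITIVE, TOY_STALE };

/* Decide what a lockless walker may conclude about a dentry whose
 * counter was sampled earlier as 'seq'. */
static enum toy_result toy_classify(struct toy_dentry *d, unsigned seq)
{
	void *inode = atomic_load_explicit(&d->inode, memory_order_relaxed);

	if (toy_read_retry(d, seq))
		return TOY_STALE;	/* changed under us: fall back */
	return inode ? TOY_POSITIVE : TOY_NEGATIVE;
}
```

In this toy model, TOY_NEGATIVE corresponds to "it's for real" and
TOY_STALE to the case where the lockless walk has to be abandoned; no
reference counts or locks are touched on the read side.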

If it's in the middle of the pathname, we could try to delay the check. And
what would it buy us? If that dentry was stale, we'd just walk a bit
deeper, find a stale one at some later point and restart pathname
resolution in non-lazy mode from the very beginning. If it wasn't stale
(i.e. a real negative dentry), we *do* want -ENOENT and we'll get it as soon
as unlazy_walk() checks ->d_seq and we plod through the rest of
do_lookup() to the check in walk_component(). We won't redo d_lookup(), lock
i_mutex, etc.