Re: [PATCH] Fix dcache race during umount

From: Neil Brown
Date: Mon Apr 03 2006 - 21:04:11 EST


On Monday April 3, balbir@xxxxxxxxxx wrote:
> Hi, Neil,
>
> >
> > Cc: Jan Blunck <jblunck@xxxxxxx>
> > Cc: Kirill Korotaev <dev@xxxxxxxxxx>
> > Cc: olh@xxxxxxx
> > Cc: bsingharora@xxxxxxxxx
>
> Could you please make this balbir@xxxxxxxxxx

Sure.

>
> >
> > Signed-off-by: Neil Brown <neilb@xxxxxxx>
> >
> <snip>
> - /* If the dentry was recently referenced, don't free it. */
> - if (dentry->d_flags & DCACHE_REFERENCED) {
> - dentry->d_flags &= ~DCACHE_REFERENCED;
> - list_add(&dentry->d_lru, &dentry_unused);
> - dentry_stat.nr_unused++;
> - spin_unlock(&dentry->d_lock);
> - continue;
> + if (!(dentry->d_flags & DCACHE_REFERENCED) &&
> + (!sb || dentry->d_sb == sb)) {
>
> Comments for the condition please. Something like
>
> /*
> * If the dentry is not DCACHED_REFERENCED, it is time to move it to LRU list,
> * provided the super block is NULL (which means we are trying to reclaim memory
> * or this dentry belongs to the same super block that we want to shrink.
> */

Ok, thanks. However it isn't time to "move it to the LRU list" but
rather time to "move it from the LRU list, out of the cache all
together, and through it away".

>
> One side-effect of this check I see is
>
> Earlier, all prune_dcache() calls would prune the dentry cache. This
> condition will cause dentries belonging only those super blocks being
> shrink'ed to be freed up. shrink_dcache_memory() will have to do the
> additional work of freeing dentries (especially for file systems like
> sysfs, procfs, etc). But the good thing is it should make the per
> super block operations faster (like unmount). IMO, this is the correct
> behaviour, but I am not sure of the side-effects.
>

Hmm... yes, but there is a worse side-effect I think. If
shrink_dcache_memory finds a dentry that it cannot free, it will move
it to the head of the LRU, so unmount will not be able to find it so
easily, and will end up moving it back down to the tail. I don't
think this can livelock, but it is unpleasant.

Rather than move these entries to the head of the list, I'd like to
leave them at the tail, and try to skip over entries that we might not
be able to free.



> + if (sb) {
> + prune_one_dentry(dentry);
> + continue;
> + }
> > + /* Need to avoid race with generic_shutdown_super */
> > + if (down_read_trylock(&dentry->d_sb->s_umount) &&
> > + dentry->d_sb->s_root != NULL) {
>
> There is a probable bug here. What if down_read_trylock() succeeds and
> dentry->d_sb->s_root == NULL? We still need to do an up_read before we
> move on.

Oops, yes. Thanks for that!

> The comment would be better put as
>
> /*
> * If we are able to acquire the umount semaphore, then the super
> block cannot be unmounted
> * while we are pruning this dentry. This helps avoid a race condition
> that is caused due to
> * intermediate reference counts held by the children of the dentry in
> prune_one_dentry().
> * This leads to select_dcache_parent() ignoring those dentries,
> leaving behind non-dput
> * dentries. The unmount happens before prune_one_dentry() can dput
> the dentries.
> */

Well, I've beefed up the comment, but I would rather focus on the
prune_one_dentry holding an inode rather than holding a dentry. I
think that problem is clearer and it comes to the same thing.

Patch to follow.

Thanks,
NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/