Re: fs/dcache.c - BUG: soft lockup - CPU#5 stuck for 22s! [systemd-udevd:1667]

From: Al Viro
Date: Thu May 29 2014 - 11:45:04 EST


On Thu, May 29, 2014 at 08:10:57AM -0700, Linus Torvalds wrote:
> If so, though, that brings up two questions:
>
> (a) do we really want to be that aggressive? Can we ever traverse
> _past_ the point we're actually trying to shrink in
> shrink_dcache_parent()?

The caller of shrink_dcache_parent() had better hold a reference to the
argument, or it might get freed right under us ;-) So no, we can't
go past that point - the subtree root will stay busy.

The reason we want to be aggressive there is to avoid excessive iterations -
think what happens e.g. if we have a chain of N dentries, with nothing pinning
them (i.e. the last one has refcount 0, the first has 2, everything else has 1).
Simply doing dput() would result in O(N^2) vs. O(N)...
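
To put rough numbers on that, here is a toy userspace model of the chain
described above. It is only a sketch and not kernel code: struct toy_dentry,
build_chain(), shrink_one_per_pass() and shrink_with_ancestors() are all
made-up names, and "visited" is just a stand-in for work done. It assumes
that a plain dput() on the parent leaves it cached for a later pass to find,
so each pass has to walk down from the subtree root again.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHAIN 1024

/* Hypothetical stand-in for a dentry: a parent pointer and a refcount. */
struct toy_dentry {
	struct toy_dentry *parent;
	int count;
	int dead;
};

static struct toy_dentry chain[MAX_CHAIN];

/*
 * chain[0] is the subtree root: count 2 (the caller's reference plus its
 * child).  Everything in the middle has count 1 (its child), the last
 * one has count 0.
 */
static void build_chain(size_t n)
{
	size_t i;

	assert(n >= 2 && n <= MAX_CHAIN);
	for (i = 0; i < n; i++) {
		chain[i].parent = i ? &chain[i - 1] : NULL;
		chain[i].count = 1;
		chain[i].dead = 0;
	}
	chain[0].count = 2;
	chain[n - 1].count = 0;
}

/* One victim per pass: walk down from the root, kill the leaf, repeat. */
static unsigned long shrink_one_per_pass(size_t n)
{
	unsigned long visited = 0;
	size_t live = n;

	for (;;) {
		struct toy_dentry *d = &chain[live - 1];

		visited += live;	/* cost of walking root -> deepest live */
		if (d->count != 0)	/* only the pinned root is left */
			break;
		d->dead = 1;
		d->parent->count--;	/* parent is now unused, but still cached */
		live--;
	}
	return visited;
}

/* Walk down once, then keep killing ancestors while they drop to zero. */
static unsigned long shrink_with_ancestors(size_t n)
{
	unsigned long visited = n;	/* the single walk down to the leaf */
	struct toy_dentry *d = &chain[n - 1];

	while (d && d->count == 0) {
		struct toy_dentry *parent = d->parent;

		d->dead = 1;
		if (parent)
			parent->count--;
		visited++;
		d = parent;
	}
	return visited;
}
```

In this model shrink_one_per_pass() touches N(N+1)/2 entries, while
shrink_with_ancestors() touches 2N-1, and in both cases the root survives
with the caller's reference still on it.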

> (b) why does the "dput()" (or rather, the dentry_kill()) locking
> logic have to retain the old trylock case rather than share the parent
> locking logic?
>
> I'm assuming the answer to (b) is that we can't afford to drop the
> dentry lock in dentry_kill(), but I'd like that answer to the "Why" to
> be documented somewhere.

We actually might be able to do it that way (rechecking ->d_count after
lock_parent()), but I would really prefer to leave that until after -final.
I want to get profiling data from that first - dput() is a much hotter path
than shrink_dcache_parent() and friends...