Re: fs/dcache.c - BUG: soft lockup - CPU#5 stuck for 22s! [systemd-udevd:1667]

From: Al Viro
Date: Mon May 26 2014 - 14:26:51 EST


On Mon, May 26, 2014 at 11:17:42AM -0700, Linus Torvalds wrote:
> On Mon, May 26, 2014 at 8:27 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > That's the livelock. OK.
>
> Hmm. Is there any reason we don't have some exclusion around
> "check_submounts_and_drop()"?
>
> That would seem to be the simplest way to avoid any livelock: just
> don't allow concurrent calls (we could make the lock per-filesystem or
> whatever). This whole case should all be for just exceptional cases
> anyway.
>
> We already sleep in that thing (well, "cond_resched()"), so taking a
> mutex should be fine.

What makes you think that it's another check_submounts_and_drop()?
And not, e.g., shrink_dcache_parent(). Or memory shrinkers. Or
some twit sitting in a subdirectory and doing stat(2) in a loop, for
that matter...

I really, really wonder WTF is causing that - we have spent 20-odd
seconds spinning while dentries in there were being evicted by
something. And that is on sysfs, where dentry_kill() should be non-blocking
and very fast. Something very fishy is going on and I'd really like
to understand the use pattern we are seeing there.