Re: frequent softlockups with 3.10rc6.

From: Dave Chinner
Date: Sat Jun 29 2013 - 22:06:01 EST


On Sat, Jun 29, 2013 at 03:23:48PM -0700, Linus Torvalds wrote:
> On Sat, Jun 29, 2013 at 1:13 PM, Dave Jones <davej@xxxxxxxxxx> wrote:
> >
> > So with that patch, those two boxes have now been fuzzing away for
> > over 24hrs without seeing that specific sync related bug.
>
> Ok, so at least that confirms that yes, the problem is the excessive
> contention on inode_sb_list_lock.
>
> Ugh. There's no way we can do that patch by DaveC for 3.10. Not only
> is it scary, Andi pointed out that it's actively buggy and will miss
> inodes that need writeback due to moving things to private lists.

Right - it was just a quick hack for proof of concept... :)

> So I suspect we'll have to do 3.10 with this starvation issue in
> place, and mark for stable backporting whatever eventual fix we find.

I can reproduce the contention problem on both 3.8 and 3.9 kernels,
so this isn't a recent regression, and as such it's likely I'll be
able to reproduce it on any kernel since the global inode_lock
breakup was done back in 2.6.38.

Hence I don't think there is significant urgency to fix it for 3.10.
I'll have a bit more of a think about how to address this, because
we really need to make the inode_sb_list_lock disappear from the
create/unlink paths as well.

There are several "walk all cached inodes on the superblock"
algorithms in the kernel that need fixing, too. Hence
I'm tempted just to turn this list into another list_lru (even
though we wouldn't use the LRU capabilities of the interface) and use
the list walk interface it has to hide the fact it is actually
using per-node lists and locks...
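
As a rough userspace illustration of that idea (this is not the kernel's
list_lru API; every name below is hypothetical, pthread mutexes stand in
for spinlocks, and a fixed shard count stands in for NUMA nodes), a list
split into per-shard lists and locks behind a single walk function might
look something like this:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_SHARDS 4	/* hypothetical; stands in for the number of nodes */

struct item {
	struct item *next;
	int id;
};

/* One list and one lock per shard, instead of one global list and lock. */
struct shard {
	pthread_mutex_t lock;
	struct item *head;
};

struct sharded_list {
	struct shard shards[NR_SHARDS];
};

static void sharded_list_init(struct sharded_list *sl)
{
	for (size_t i = 0; i < NR_SHARDS; i++) {
		pthread_mutex_init(&sl->shards[i].lock, NULL);
		sl->shards[i].head = NULL;
	}
}

/* Adding takes only one shard's lock, so adds to different shards
 * do not contend with each other. */
static void sharded_list_add(struct sharded_list *sl, size_t shard,
			     struct item *it)
{
	assert(shard < NR_SHARDS);
	struct shard *s = &sl->shards[shard];

	pthread_mutex_lock(&s->lock);
	it->next = s->head;
	s->head = it;
	pthread_mutex_unlock(&s->lock);
}

/* Visit every item, one shard at a time. The caller only supplies a
 * callback and never sees the shards. Returns the number visited. */
static size_t sharded_list_walk(struct sharded_list *sl,
				void (*fn)(struct item *, void *), void *arg)
{
	size_t visited = 0;

	for (size_t i = 0; i < NR_SHARDS; i++) {
		struct shard *s = &sl->shards[i];

		pthread_mutex_lock(&s->lock);
		for (struct item *it = s->head; it; it = it->next) {
			fn(it, arg);
			visited++;
		}
		pthread_mutex_unlock(&s->lock);
	}
	return visited;
}

/* Example callback: accumulate item ids into the int behind arg. */
static void sum_ids(struct item *it, void *arg)
{
	*(int *)arg += it->id;
}
```

The point of the sketch is only that "walk everything" callers go through
one function while add (and, by extension, remove) take a single shard's
lock. It ignores removal, callbacks that need to drop the lock, and
everything else a real implementation would have to handle.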

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx