Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()

From: Chris Mason
Date: Fri Sep 11 2015 - 19:16:52 EST


On Fri, Sep 11, 2015 at 03:06:18PM -0700, Linus Torvalds wrote:
> On Fri, Sep 11, 2015 at 2:04 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Of course, actual numbers would be the deciding factor if this really
> > is noticeable. But "cleaner code and saner locking" is definitely an
> > issue at least for me.
>
> Anyway, I'll hold off pushing out the revert too for a while, in the
> hope that we'll have actual numbers for or against whatever the
> particular solution should be.
>
> I do get the feeling that the whole wb->list_lock needs more loving.
> For example, there's that locked_inode_to_wb_and_lock_list() thing
> (which might be better off trying to just do "trylock" on it, but
> that's a separate issue) showing lock inversion worries.

At the very least, it's kind of sad how many of us were surprised to
find the lock held when Dave's patch was unplugging. If this bug
slipped through, more are going to. It's also true N-1 of those people
were really surprised about scheduling unplug functions, so maybe we
can't put all the blame on wb->list_lock.

>
> Maybe we should *not* get that wb->list_lock early at all, and nest it
> inside the inode spinlocks, and just move the list_lock locking down a
> lot (ie not even try to hold it over big functions that then are
> forced to releasing it anyway).
>
> For example, realistically, it looks like the longest we ever *really*
> hold that lock is at the top of the loop of writeback_sb_inodes() -
> maybe we could just explicitly have a function that does "find the
> first inode that matches this sb and needs writeout activity", and
> literally only take hold the lock over that function. And *not* take
> the lock in the caller at all?
>
> The callers seem to want that lock mainly because they do that
> "list_empty(&wb->b_io)" test. But the way our doubly linked lists
> work, "list_empty()" is actually something that can be done racily
> without the lock held...
>
> That said, I doubt anybody really wants to touch this, so at least for
> now we're stuck with either the "plug outside the lock" or the "drop
> and retake lock" options. It really would be loveyl to haev numbers
> either way.

For 4.3 timeframes, what runs do you want to see numbers for:

1) revert
2) my hack
3) plug over multiple sbs (on different devices)
4) ?

-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/