Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()

From: Linus Torvalds
Date: Fri Sep 11 2015 - 18:06:24 EST


On Fri, Sep 11, 2015 at 2:04 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Of course, actual numbers would be the deciding factor if this really
> is noticeable. But "cleaner code and saner locking" is definitely an
> issue at least for me.

Anyway, I'll hold off pushing out the revert too for a while, in the
hope that we'll have actual numbers for or against whatever the
particular solution should be.

I do get the feeling that the whole wb->list_lock needs more loving.
For example, there's that locked_inode_to_wb_and_lock_list() thing
(which might be better off trying to just do "trylock" on it, but
that's a separate issue) showing lock inversion worries.
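
To make the trylock idea concrete, something along these lines might do
it (totally untested sketch - the helper name is made up, and the
fallback is just the existing slow path):

/*
 * Sketch: opportunistically grab wb->list_lock while still holding
 * inode->i_lock.  The inode<->wb association can't change while
 * i_lock is held, so if the trylock succeeds we're done without the
 * unlock/relock dance.  And trylock never blocks, so nesting it
 * "backwards" like this can't deadlock.
 */
static struct bdi_writeback *
inode_to_wb_and_try_lock_list(struct inode *inode)
	__releases(&inode->i_lock)
{
	struct bdi_writeback *wb = inode_to_wb(inode);

	if (spin_trylock(&wb->list_lock)) {
		spin_unlock(&inode->i_lock);
		return wb;
	}

	/* Contended: fall back to the current drop-and-retry dance. */
	return locked_inode_to_wb_and_lock_list(inode);
}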

Maybe we should *not* get that wb->list_lock early at all, and nest it
inside the inode spinlocks, and just move the list_lock locking down a
lot (ie not even try to hold it over big functions that then are
forced to release it anyway).
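
Ie something of this shape (purely illustrative, field names
approximate):

/* today: list_lock on the outside, i_lock nested inside it */
spin_lock(&wb->list_lock);
spin_lock(&inode->i_lock);

/* proposed: i_lock on the outside, list_lock taken only around the
 * actual list manipulation (the b_dirty/b_io/b_more_io moves) */
spin_lock(&inode->i_lock);
spin_lock(&wb->list_lock);
list_move(&inode->i_io_list, &wb->b_dirty);
spin_unlock(&wb->list_lock);
spin_unlock(&inode->i_lock);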

For example, realistically, it looks like the longest we ever *really*
hold that lock is at the top of the loop of writeback_sb_inodes() -
maybe we could just explicitly have a function that does "find the
first inode that matches this sb and needs writeout activity", and
literally only hold the lock over that function. And *not* take
the lock in the caller at all?
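
Something like this, say (name invented, untested, and the I_SYNC /
pinning handling for the returned inode is left out entirely):

/*
 * Sketch: hold wb->list_lock only across the search for the next
 * b_io inode belonging to this sb, and return with the lock already
 * dropped again.  The caller then does the actual writeback with no
 * list_lock held at all.
 */
static struct inode *next_sb_inode_to_write(struct bdi_writeback *wb,
					    struct super_block *sb)
{
	struct inode *inode = NULL;

	spin_lock(&wb->list_lock);
	while (!list_empty(&wb->b_io)) {
		inode = wb_inode(wb->b_io.prev);
		if (inode->i_sb == sb)
			break;
		/* not our sb: put it back on the dirty list for later */
		redirty_tail(inode, wb);
		inode = NULL;
	}
	spin_unlock(&wb->list_lock);

	return inode;
}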

The callers seem to want that lock mainly because they do that
"list_empty(&wb->b_io)" test. But the way our doubly linked lists
work, "list_empty()" is actually something that can be done racily
without the lock held...
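
Ie the "anything to do?" check could be as dumb as this (sketch, name
made up):

/*
 * list_empty() is a single pointer comparison, so doing it without
 * wb->list_lock just means the answer can be stale.  Missing newly
 * queued work here is fine - the next writeback pass will see it.
 */
static bool wb_has_io_work(struct bdi_writeback *wb)
{
	return !list_empty(&wb->b_io) || !list_empty(&wb->b_more_io);
}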

That said, I doubt anybody really wants to touch this, so at least for
now we're stuck with either the "plug outside the lock" or the "drop
and retake lock" options. It really would be lovely to have numbers
either way.

<insert princess leia gif>
"Dave, you're our only hope"

Linus
--