Re: [PATCH] fs-writeback: drop wb->list_lock during blk_finish_plug()

From: Chris Mason
Date: Thu Sep 17 2015 - 18:42:48 EST


On Thu, Sep 17, 2015 at 12:39:51PM -0700, Linus Torvalds wrote:
> On Wed, Sep 16, 2015 at 7:14 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> >>
> >> Dave, if you're testing my current -git, the other performance issue
> >> might still be the spinlock thing.
> >
> > I have the fix as the first commit in my local tree - it'll remain
> > there until I get a conflict after an update. :)
>
> Ok. I'm happy to report that you should get a conflict now, and that
> the spinlock code should work well for your virtualized case again.
>
> No updates on the plugging thing yet, I'll wait a bit and follow this
> thread and see if somebody comes up with any explanations or theories
> in the hope that we might not need to revert (or at least have a more
> targeted change).

Playing around with the plug a little, most of the unplugs are coming
from the cond_resched_lock(). I'm not really sure why we are doing the
cond_resched() there; we should be doing it before we retake the lock
instead.
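
To make the ordering concrete, here is a rough userspace sketch. None of
it is kernel code: toy_plug, toy_flush_plug() and toy_schedule() are
made-up names, and the model simply assumes that scheduling away with
requests still plugged leaves them for a helper thread to issue, which
is the hand-off the comment in the patch wants to avoid.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-task plug: it only counts requests. */
struct toy_plug {
	int pending;		/* batched, not yet issued */
	int issued_inline;	/* issued by the task itself */
	int handed_off;		/* left for a helper thread to issue */
};

/* The task flushes its own plug, so the requests are issued inline. */
static void toy_flush_plug(struct toy_plug *plug)
{
	plug->issued_inline += plug->pending;
	plug->pending = 0;
}

/*
 * Toy model assumption: whatever is still plugged when the task
 * schedules away gets handed off instead of being issued inline.
 */
static void toy_schedule(struct toy_plug *plug)
{
	plug->handed_off += plug->pending;
	plug->pending = 0;
}

/* Reschedule while nr requests are still sitting in the plug. */
static struct toy_plug resched_while_plugged(int nr)
{
	struct toy_plug plug = { .pending = nr };

	assert(nr >= 0);
	toy_schedule(&plug);
	return plug;
}

/* Flush first, then reschedule with an empty plug. */
static struct toy_plug flush_then_resched(int nr)
{
	struct toy_plug plug = { .pending = nr };

	assert(nr >= 0);
	toy_flush_plug(&plug);
	toy_schedule(&plug);
	return plug;
}
```

With 8 plugged requests, resched_while_plugged(8) ends with all 8 handed
off, while flush_then_resched(8) issues all 8 inline and hands off none.
It says nothing about real latencies or IO sizes, only about who ends up
doing the unplug.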

This patch takes my box (with dirty thresholds at 1.5GB/3GB) from 195K
files/sec up to 213K. Average IO size is the same as 4.3-rc1.

It probably won't help Dave, since most of his unplugs should have been
from the cond_resched_lock() too.

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 587ac08..05ed541 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1481,6 +1481,19 @@ static long writeback_sb_inodes(struct super_block *sb,
 		wbc_detach_inode(&wbc);
 		work->nr_pages -= write_chunk - wbc.nr_to_write;
 		wrote += write_chunk - wbc.nr_to_write;
+
+		if (need_resched()) {
+			/*
+			 * we're plugged and don't want to hand off to kblockd
+			 * for the actual unplug work. But we do want to
+			 * reschedule. So flush our plug and then
+			 * schedule away
+			 */
+			blk_flush_plug(current);
+			cond_resched();
+		}
+
+
 		spin_lock(&wb->list_lock);
 		spin_lock(&inode->i_lock);
 		if (!(inode->i_state & I_DIRTY_ALL))
@@ -1488,7 +1501,7 @@ static long writeback_sb_inodes(struct super_block *sb,
 		requeue_inode(inode, wb, &wbc);
 		inode_sync_complete(inode);
 		spin_unlock(&inode->i_lock);
-		cond_resched_lock(&wb->list_lock);
+
 		/*
 		 * bail out to wb_writeback() often enough to check
 		 * background threshold and other termination conditions.