Re: [PATCH 6/7] writeback: separate starting of sync vsopportunistic writeback

From: Jens Axboe
Date: Tue Sep 15 2009 - 09:04:30 EST


On Tue, Sep 15 2009, Jan Kara wrote:
> On Tue 15-09-09 13:44:26, Jens Axboe wrote:
> > On Tue, Sep 15 2009, Jens Axboe wrote:
> > > On Tue, Sep 15 2009, Jan Kara wrote:
> > > > On Mon 14-09-09 21:42:43, Jens Axboe wrote:
> > > > > On Mon, Sep 14 2009, Jens Axboe wrote:
> > > > > > On Mon, Sep 14 2009, Christoph Hellwig wrote:
> > > > > > > On Mon, Sep 14, 2009 at 03:33:07PM +0200, Jan Kara wrote:
> > > > > > > > On Mon 14-09-09 11:36:33, Jens Axboe wrote:
> > > > > > > > > bdi_start_writeback() is currently split into two paths, one for
> > > > > > > > > WB_SYNC_NONE and one for WB_SYNC_ALL. Add bdi_sync_writeback()
> > > > > > > > > for WB_SYNC_ALL writeback and let bdi_start_writeback() handle
> > > > > > > > > only WB_SYNC_NONE.
> > > > > > > > What I don't like about this patch is that if somebody sets up
> > > > > > > > writeback_control with WB_SYNC_ALL mode set and then submits it to disk via
> > > > > > > > bdi_start_writeback() it will just silently convert his writeback to an
> > > > > > > > asynchronous one.
> > > > > > > > So I'd maybe leave setting of sync_mode to the caller and just WARN_ON if
> > > > > > > > it does not match the purpose of the function...
> > > > > > >
> > > > > > > Or initialize the wb entirely inside these functions. For the sync case
> > > > > > > we really only need a superblock as argument, and for writeback it's
> > > > > > > bdi + nr_pages. And also make sure they consistenly return void as
> > > > > > > no one cares about the return value.
> > > > > >
> > > > > > Yes, I thought about doing that and like that better than the warning.
> > > > > > Just pass in the needed args and allocate+fill the wbc on stack. I'll
> > > > > > make that change.
> > > > >
> > > > > That works out much better, imho:
> > > > >
> > > > > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=270c12655d7d11e234d335a8ab0540c02c034b66
> > > > Yeah, the code looks better. BTW, how about converting also
> > > > bdi_writeback_all() to get superblock and nr_pages as an argument?
> > > > Currently it seems to be the only place "above" flusher thread which uses
> > > > wbc and it's just constructed in the callers of bdi_writeback_all() and
> > > > then disassembled inside the function...
> > >
> > > Yes good point, I'll include that too. Thanks!
> >
> > One small problem there, though... Currently all queued writeback is
> > range cyclic, however with this change then we drop that bit from
> > sync_inodes_sb() which isn't range_cyclic and instead just specifies the
> > whole range.
> I'm not sure I understand your comment but I see a problem that even
> writeback queued from sync_inodes_sb() is processed by wb_writeback() which
> sets range_cyclic. That's a bug even in your old patchset.

Indeed, looks like I have to double check all that again.

> Let's have a look at the flags in wbc:
> nonblocking - Currently only set by direct callers of ->writepage() BUT
> originally wb_kupdate() and background_writeout() also
> set this flag. Since filesystems and write_cache_pages()
> use the flag we should set it for equivalent writeouts as
> well. This should be fixed...

Since this is all handled by the dedicated thread now, dropping the
nonblocking bit was on purpose. What would the point be, except for
stopping pdflush being blocked on request allocation?

> encountered_congestion - Checked only by AFS, probably we can get rid of
> it now.
> for_kupdate - Used, we set it properly in wb_writeback() so that is fine.
> for_reclaim - Used by direct callers of ->writepage(). OK.
> for_writepages - Only set. Get rid of it.

Hmm indeed, that predates the patchset though. But I can queue up a
removal.

> range_cyclic - Used. Set also when a caller didn't want it - should be
> fixed.

Yes, that one wants fixing. Will probably have to pass that in through
the 'work' structure.

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/