Re: [PATCH] vmscan: remove wait_on_page_writeback() from pageout()

From: Wu Fengguang
Date: Wed Jul 28 2010 - 05:30:44 EST

Next message: Richard Kennedy: "[PATCH] random: reorder struct entropy_store to remove padding on 64bits"
Previous message: Benjamin Herrenschmidt: "Re: [PATCH 28/31] memblock: Export MEMBLOCK_ERROR again"
In reply to: Mel Gorman: "Re: [PATCH] vmscan: remove wait_on_page_writeback() from pageout()"
Next in thread: Mel Gorman: "Re: [PATCH] vmscan: remove wait_on_page_writeback() from pageout()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Jul 28, 2010 at 05:10:33PM +0800, Mel Gorman wrote:
> On Wed, Jul 28, 2010 at 04:46:54PM +0800, Wu Fengguang wrote:
> > The wait_on_page_writeback() call inside pageout() is virtually dead code.
> >
> >         shrink_inactive_list()
> >           shrink_page_list(PAGEOUT_IO_ASYNC)
> >             pageout(PAGEOUT_IO_ASYNC)
> >           shrink_page_list(PAGEOUT_IO_SYNC)
> >             pageout(PAGEOUT_IO_SYNC)
> >
> > Because shrink_page_list/pageout(PAGEOUT_IO_SYNC) is always called after
> > a preceding shrink_page_list/pageout(PAGEOUT_IO_ASYNC), the first
> > pageout(ASYNC) converts dirty pages into writeback pages, and the second
> > shrink_page_list(SYNC) waits for those writeback pages to become clean
> > before calling pageout(SYNC). The second shrink_page_list(SYNC) can hardly
> > run into dirty pages for pageout(SYNC) except under some race conditions.
> >
>
> It's possible for the second call to run into dirty pages as there is a
> congestion_wait() call between the first shrink_page_list() call and the
> second. That's a big window.

OK, there is a <=0.1s time window. But what about the data set size?
After the first shrink_page_list(ASYNC), hardly any pages will be left
in page_list except for pages already under writeback and other
unreclaimable pages. So hitting the second pageout(SYNC) still requires
a race condition: some unreclaimable pages must become
reclaimable+dirty within the 0.1s time window.
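
To make the argument concrete, here is a rough user-space toy model of
the two passes. It is only a sketch and not kernel code: every toy_*
name is made up for illustration, the page states are simplified
stand-ins for the real page flags, and "waiting" is modelled as an
instant state change. It counts how many dirty pages each pass hands to
the toy pageout().

```c
#include <assert.h>
#include <stddef.h>

/* Toy page states; hypothetical stand-ins, NOT the kernel's page flags. */
enum toy_state { TOY_CLEAN, TOY_DIRTY, TOY_WRITEBACK, TOY_UNRECLAIMABLE };

enum toy_mode { TOY_IO_ASYNC, TOY_IO_SYNC };

struct toy_stats {
    size_t reclaimed;     /* pages freed by this pass */
    size_t pageout_calls; /* dirty pages handed to toy_pageout() */
};

/* Toy pageout(): start writeback; in SYNC mode also wait for it to finish. */
static enum toy_state toy_pageout(enum toy_mode mode)
{
    return mode == TOY_IO_SYNC ? TOY_CLEAN : TOY_WRITEBACK;
}

/*
 * Toy shrink_page_list(): walk the list once. Pages that end up clean
 * are reclaimed; all others are compacted to the front of pages[].
 * Returns the number of pages left on the list.
 */
static size_t toy_shrink_page_list(enum toy_state *pages, size_t n,
                                   enum toy_mode mode, struct toy_stats *st)
{
    size_t kept = 0;

    assert(pages != NULL && st != NULL);

    for (size_t i = 0; i < n; i++) {
        enum toy_state s = pages[i];

        if (s == TOY_WRITEBACK) {
            /* SYNC waits for writeback to complete; ASYNC skips the page. */
            if (mode == TOY_IO_SYNC)
                s = TOY_CLEAN;
        } else if (s == TOY_DIRTY) {
            st->pageout_calls++;
            s = toy_pageout(mode);
        }

        if (s == TOY_CLEAN)
            st->reclaimed++;
        else
            pages[kept++] = s;
    }
    return kept;
}
```

In this model, running an ASYNC pass and then a SYNC pass over the
leftovers gives pageout_calls == 0 for the SYNC pass: everything that
was dirty is already under writeback by then. The SYNC pass only reaches
toy_pageout() if some leftover entry is flipped to TOY_DIRTY between the
two calls, which corresponds to the race in the congestion_wait()
window.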

> > And the wait page-by-page behavior of pageout(SYNC) will lead to very
> > long stall time if running into some range of dirty pages.
>
> True, but this is also lumpy reclaim which is depending on a contiguous
> range of pages. It's better for it to wait on the selected range of pages
> which is known to contain at least one old page than excessively scan and
> reclaim newer pages.
>
> > So it's a bad
> > idea anyway to call wait_on_page_writeback() inside pageout().
> >
>
> I recognise that you are probably thinking of the stall-due-to-fork problem
> but I'd expect the patch that raises the bar for <= PAGE_ALLOC_COSTLY_ORDER
> to be sufficient. If not, I think it still makes sense to call
> wait_on_page_writeback() for > PAGE_ALLOC_COSTLY_ORDER.

The main intention of this patch is to remove semi-dead code.
With the previous patch applied, I'm less worried about the long stall time ;)

Thanks,
Fengguang
