Re: [PATCH] mm: vmscan: count only dirty pages as congested

From: Jamie Liu
Date: Wed Oct 15 2014 - 19:07:47 EST


wait_iff_congested() only waits if ZONE_CONGESTED is set (and at least
one BDI is still congested). Modulo concurrent changes to BDI
congestion status:

After this change, the probability that a given shrink_inactive_list()
sets ZONE_CONGESTED increases monotonically with the fraction of dirty
pages on the LRU, to 100% if all dirty pages are backed by a
write-congested BDI. This is in line with what appears to be intended,
judging by the comment:

	/*
	 * Tag a zone as congested if all the dirty pages scanned were
	 * backed by a congested BDI and wait_iff_congested will stall.
	 */
	if (nr_dirty && nr_dirty == nr_congested)
		set_bit(ZONE_CONGESTED, &zone->flags);

Before this change, the probability that a given
shrink_inactive_list() sets ZONE_CONGESTED varies erratically. Because
the ZONE_CONGESTED condition is nr_dirty && nr_dirty == nr_congested,
the probability peaks when the fraction of dirty pages is equal to the
fraction of file pages backed by congested BDIs. So under some
circumstances, an increase in the fraction of dirty pages or in the
fraction of congested pages can actually result in a *decreased*
probability that reclaim will stall for writeback congestion, and vice
versa, which is both counterintuitive and counterproductive.
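
To make the mismatch concrete, here is a rough userspace model of the two
counting rules. It is only a sketch: struct model_page, batch_sets_congested()
and the two sample batches are hypothetical names made up for illustration,
not kernel code, and it assumes nr_dirty counts every page that is dirty or
under writeback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page state shrink_page_list() inspects. */
struct model_page {
	bool dirty;
	bool writeback;
	bool reclaim;		/* models PageReclaim(page) */
	bool has_mapping;	/* models page_mapping(page) != NULL */
	bool bdi_congested;	/* models bdi_write_congested(...) for the page's BDI */
};

/*
 * Returns true if this batch would satisfy
 * "nr_dirty && nr_dirty == nr_congested".
 * patched == false models the old counting, true the counting in the patch.
 */
static bool batch_sets_congested(const struct model_page *pages, size_t n,
				 bool patched)
{
	unsigned long nr_dirty = 0, nr_congested = 0;

	for (size_t i = 0; i < n; i++) {
		const struct model_page *p = &pages[i];
		bool dirty_or_wb = p->dirty || p->writeback;
		bool on_congested_bdi = p->has_mapping && p->bdi_congested;

		/* Assumption: nr_dirty counts dirty or writeback pages. */
		if (dirty_or_wb)
			nr_dirty++;

		/* The patch only lets pages that count for nr_dirty qualify. */
		if (patched)
			on_congested_bdi = on_congested_bdi && dirty_or_wb;

		if (on_congested_bdi || (p->writeback && p->reclaim))
			nr_congested++;
	}

	return nr_dirty && nr_dirty == nr_congested;
}

/* Two dirty and two clean pages, all on a write-congested BDI. */
static const struct model_page all_congested_bdi[4] = {
	{ .dirty = true, .has_mapping = true, .bdi_congested = true },
	{ .dirty = true, .has_mapping = true, .bdi_congested = true },
	{ .has_mapping = true, .bdi_congested = true },
	{ .has_mapping = true, .bdi_congested = true },
};

/* Two dirty pages on an idle BDI, two clean pages on a congested BDI. */
static const struct model_page mismatched[4] = {
	{ .dirty = true, .has_mapping = true },
	{ .dirty = true, .has_mapping = true },
	{ .has_mapping = true, .bdi_congested = true },
	{ .has_mapping = true, .bdi_congested = true },
};
```

In this model, the first batch does not trip the condition under the old
counting (nr_dirty is 2, nr_congested is 4) even though every dirty page is
on a congested BDI, while the second batch does trip it (2 == 2) even though
no dirty page is on a congested BDI. With the patched counting the two
results are reversed.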

On Wed, Oct 15, 2014 at 1:05 PM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, 15 Oct 2014 12:58:35 -0700 Jamie Liu <jamieliu@xxxxxxxxxx> wrote:
>
>> shrink_page_list() counts all pages with a mapping, including clean
>> pages, toward nr_congested if they're on a write-congested BDI.
>> shrink_inactive_list() then sets ZONE_CONGESTED if nr_dirty ==
>> nr_congested. Fix this apples-to-oranges comparison by only counting
>> pages for nr_congested if they count for nr_dirty.
>>
>> ...
>>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -875,7 +875,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>  		 * end of the LRU a second time.
>>  		 */
>>  		mapping = page_mapping(page);
>> -		if ((mapping && bdi_write_congested(mapping->backing_dev_info)) ||
>> +		if (((dirty || writeback) && mapping &&
>> +		     bdi_write_congested(mapping->backing_dev_info)) ||
>>  		    (writeback && PageReclaim(page)))
>>  			nr_congested++;
>
> What are the observed runtime effects of this change?