Re: [PATCH] mm: page_alloc: avoid wakeup kswapd on the unintended node

From: Johannes Weiner
Date: Fri Aug 29 2014 - 09:09:44 EST

On Fri, Aug 29, 2014 at 03:03:19PM +0800, Weijie Yang wrote:
> When entering the page_alloc slowpath, we wake up kswapd on every pgdat
> according to the zonelist and high_zoneidx. However, this doesn't
> take nodemask into account, and could prematurely wake up kswapd on
> some unintended nodes.
> This patch uses for_each_zone_zonelist_nodemask() instead of
> for_each_zone_zonelist() in wake_all_kswapds() to avoid the above situation.
> Signed-off-by: Weijie Yang <weijie.yang@xxxxxxxxxxx>

Wow, we have never respected nodemask when waking kswapd, but your
change does make sense to me.
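
To illustrate the difference, here is a rough userspace sketch, not the
kernel code. The struct zone_ref_model type and wake_all_kswapds_model()
are made up for this example, and nodes are modelled as bits in a plain
unsigned int rather than a nodemask_t. A NULL nodemask stands for "no
restriction", which is how for_each_zone_zonelist() behaves.

```c
#include <stddef.h>

/* Hypothetical stand-in for one zonelist entry (not a kernel type). */
struct zone_ref_model {
	int node;	/* node that owns the zone */
	int zone_idx;	/* index of the zone within its node */
};

/*
 * Model of the wakeup walk: return a bitmask of the nodes whose kswapd
 * would be woken. Zones above high_zoneidx are skipped. If nodemask is
 * non-NULL, zones on nodes outside the mask are skipped as well.
 */
static unsigned int wake_all_kswapds_model(const struct zone_ref_model *zl,
					   size_t nr, int high_zoneidx,
					   const unsigned int *nodemask)
{
	unsigned int woken = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (zl[i].zone_idx > high_zoneidx)
			continue;	/* zone not usable for this allocation */
		if (nodemask && !(*nodemask & (1u << zl[i].node)))
			continue;	/* node excluded by the caller's mask */
		woken |= 1u << zl[i].node;
	}
	return woken;
}
```

With a zonelist spanning nodes 0 and 1 and a mask that allows only
node 0, passing NULL marks both nodes for wakeup, while passing the
mask marks node 0 alone.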

As far as impact goes, this has the chance of reducing reclaim/swapping
for certain configurations. Higher-order wakeups on an ineligible
zone are more obviously undesirable, but even order-0 rebalancing is
not necessarily a future investment for other allocations on that
node, as other allocations may have access to the free pages of a
third node and overall demand might drop before these are exhausted.
This reminds me of the issue fixed in 3a025760fc15 ("mm: page_alloc:
spill to remote nodes before waking kswapd"), where accidental eager
order-0 rebalancing turned out to be a true waste.

Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx>
