Re: [PATCH 08/31] mm, vmscan: simplify the logic deciding whether kswapd sleeps

From: Mel Gorman
Date: Thu Jul 14 2016 - 05:05:15 EST


On Thu, Jul 14, 2016 at 02:23:32PM +0900, Joonsoo Kim wrote:
> >
> > > > > And, I'd like to know why max() is used for classzone_idx rather than
> > > > > min()? I think that kswapd should balance the lowest zone requested.
> > > > >
> > > >
> > > > If there are two allocation requests -- one zone-constrained and the other
> > > > zone-unconstrained, it does not make sense to have kswapd skip the pages
> > > > usable for the zone-unconstrained and waste a load of CPU. You could
> > >
> > > I agree that, in this case, it's not good to skip the pages usable
> > > for the zone-unconstrained request. But, what I am concerned is that
> > > kswapd stop reclaim prematurely in the view of zone-constrained
> > > requestor.
> >
> > It doesn't stop reclaiming for the lower zones. It's reclaiming the LRU
> > for the whole node that may or may not have lower zone pages at the end
> > of the LRU. If it does, then the allocation request will be satisfied.
> > If it does not, then kswapd will think the node is balanced and get
> > rewoken to do a zone-constrained reclaim pass.
>
> If zone-constrained request could go direct reclaim pass, there would
> be no problem. But, please assume that request is zone-constrained
> without __GFP_DIRECT_RECLAIM which is common for some device driver
> implementation.

Then it's likely GFP_ATOMIC and it'll wake kswapd on each failure. If
kswapd is constantly awake for highmem requests then we're reclaiming
everything anyway. Remember that if kswapd is reclaiming for higher zones,
it'll still cover the lower zones eventually. There is no guarantee that
skipping the highmem pages will satisfy the atomic allocations any faster
but consuming the CPU to skip the pages is a definite cost.

Even worse, skipping highmem pages when highmem pages are required may
make lowmem pressure worse because those pages are freed faster and can
be consumed by zone-unconstrained requests.

If this really is a problem in practice then we can consider having
allocation requests that are zone-constrained and !__GFP_DIRECT_RECLAIM
set a flag and use the min classzone for the wakeup. That flag remains
set until kswapd takes at least one pass using the lower classzone and
clears it. The classzone will not be adjusted higher until that flag is
cleared. I don't think we should do it without evidence that it's a real
problem, because kswapd would potentially burn CPU uselessly and there is
the potential for higher lowmem pressure.
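
To make the flag idea concrete, here is a rough userspace model of the
bookkeeping, not a patch. Everything in it is hypothetical and for
illustration only: struct node_model, low_zone_pending, model_wakeup()
and model_kswapd_pass() are made-up names, and the caller is assumed to
already know whether a request is zone-constrained and whether it may
enter direct reclaim.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_CLASSZONE (-1)

/* Hypothetical stand-in for per-node kswapd state; not pg_data_t. */
struct node_model {
	int classzone_idx;	/* classzone for the next kswapd pass */
	bool low_zone_pending;	/* hypothetical "use min classzone" flag */
};

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Record a wakeup request against the node. */
static void model_wakeup(struct node_model *n, int requested_idx,
			 bool zone_constrained, bool direct_reclaim)
{
	bool needs_low = zone_constrained && !direct_reclaim;

	assert(requested_idx >= 0);

	if (needs_low)
		n->low_zone_pending = true;

	if (n->classzone_idx == NO_CLASSZONE) {
		/* First request since the last pass: take it as-is. */
		n->classzone_idx = requested_idx;
	} else if (needs_low) {
		/* Constrained request that cannot direct reclaim: use min. */
		n->classzone_idx = min_int(n->classzone_idx, requested_idx);
	} else if (!n->low_zone_pending) {
		/* Default behaviour: use max. */
		n->classzone_idx = max_int(n->classzone_idx, requested_idx);
	}
	/* Otherwise the flag is set and the classzone is not raised. */
}

/* One kswapd pass: consume the classzone and clear the flag. */
static int model_kswapd_pass(struct node_model *n)
{
	int used = n->classzone_idx;

	n->classzone_idx = NO_CLASSZONE;
	n->low_zone_pending = false;
	return used;
}
```

In this model a highmem wakeup followed by a constrained atomic wakeup
leaves the classzone at the lower index until model_kswapd_pass() runs,
after which max() applies again. It says nothing about whether the extra
state is worth carrying.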

--
Mel Gorman
SUSE Labs