Re: [RFC PATCH] mm: check global free_list if there is ongoing reclaiming when pcp fail

From: Mel Gorman
Date: Tue Sep 20 2022 - 04:46:23 EST


On Tue, Sep 20, 2022 at 09:45:35AM +0800, Zhaoyang Huang wrote:
> On Mon, Sep 19, 2022 at 6:22 PM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Sep 16, 2022 at 06:58:12PM +0800, zhaoyang.huang wrote:
> > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > >
> > > Check the global free list again if rmqueue_bulk fails to refill the pcp
> > > list while reclaim is ongoing, which could, by chance, avoid a potential
> > > direct reclaim.
> > >
> > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> >
> > Patch does not apply and may be based on a custom kernel that introduced
> > a problem. There is no description of what problem this is trying to
> > fix. Checking the status of reclaim for a specific zone in this path would
> > be a little unexpected. If allocation pressure is exceeding the ability
> > of reclaim to make progress then the caller likely needs to take action
> > like direct reclaim. If the allocation failure is due to a high-order
> > failure then it may need to enter direct compaction etc.
>
> Agreed with the above comment. This proposal aims to avoid direct
> reclaim at minimal cost: about 5 CPU instructions, in exchange for
> avoiding the overhead of function calls that contain several loops and
> may sleep when throttled by IO congestion etc.

If the refill fails and kswapd is failing to keep up then actions like
direct reclaim or compaction are inevitable. At best, this patch would
race to allocate pages in one context that are being freed in parallel by
another context.

Nak.

--
Mel Gorman
SUSE Labs