Re: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()

From: Hugh Dickins
Date: Wed Nov 06 2019 - 13:59:39 EST


On Wed, 6 Nov 2019, Yang Shi wrote:
> On 11/6/19 7:18 AM, Michal Hocko wrote:
> > On Wed 06-11-19 06:02:31, Yang Shi wrote:
> > > The shmem_writepage() uses GFP_ATOMIC to allocate swap cache.
> > > GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH |
> > > __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should
> > > write out to swap only in response to memory pressure, so
> > > __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself
> > > or in direct reclaim already.
> > What kind of problem are you trying to fix here?
>
> I didn't run into any visible problem. I just happened to find this
> inconsistency when I was looking into the other problem.

Yes, I don't think it fixes any actual problem: just a cleanup to
make the two calls look the same when they don't need to be different
(whereas the call from __read_swap_cache_async() rightly uses a
lower priority gfp).

If it does fix a problem, then you need to worry also about the
* TODO: this could cause a theoretical memory reclaim
* deadlock in the swap out path.
comment still against the call in add_to_swap(): but I think that
is equally theoretical, demanding no attention since 2.6.12.

>
> The add_to_swap() does:
>
> int add_to_swap(struct page *page)
> {
> ...
> err = add_to_swap_cache(page, entry,
> ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
> ...
> }
>
> Actually, shmem_writepage() does almost the same thing and both of them are
> called in reclaim context, so I didn't see why they should use different gfp
> flag. And, GFP_ATOMIC is also different from the old definition as I
> mentioned in the commit log.
>
> >
> > > In addition, XArray node allocations from PF_MEMALLOC contexts could
> > > completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency
> > > reserves from being allocated.
> > I am not really familiar with XArray much, could you be more specific
> > please?
>
> It comes from the comments of add_to_swap(), says:
>
> /*
> ÂÂÂÂÂÂÂÂ * XArray node allocations from PF_MEMALLOC contexts could
> ÂÂÂÂÂÂÂÂ * completely exhaust the page allocator. __GFP_NOMEMALLOC
> ÂÂÂÂÂÂÂÂ * stops emergency reserves from being allocated.
>
> And, it looks the original comment came from pre-git time, TBH I'm not quite
> sure about the specific problem which incurred this. I suspect it may be
> because PF_MEMALLOC context allows ALLOC_NO_WATERMARK.
>
> >
> > > Here just copy the gfp flags used by add_to_swap().
> > >
> > > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > > Signed-off-by: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx>
> > > ---
> > > mm/shmem.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/shmem.c b/mm/shmem.c
> > > index 220be9f..9691dec 100644
> > > --- a/mm/shmem.c
> > > +++ b/mm/shmem.c
> > > @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page,
> > > struct writeback_control *wbc)
> > > if (list_empty(&info->swaplist))
> > > list_add(&info->swaplist, &shmem_swaplist);
> > > - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
> > > + if (add_to_swap_cache(page, swap,
> > > + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
> > > spin_lock_irq(&info->lock);
> > > shmem_recalc_inode(inode);
> > > info->swapped++;
> > > --
> > > 1.8.3.1