Re: [PATCH 3/6] EXT4: Remove ENOMEM/congestion_wait() loops.

From: Michal Hocko
Date: Wed Sep 15 2021 - 08:20:32 EST


On Wed 15-09-21 09:59:04, Mel Gorman wrote:
> On Wed, Sep 15, 2021 at 09:55:35AM +1000, Dave Chinner wrote:

> > That way "GFP_RETRY_FOREVER" allocation contexts don't have to jump
> > through an ever changing tangle of hoops to make basic "never-fail"
> > allocation semantics behave correctly.
> >
>
> True and I can see why that is desirable. What I'm saying is that right
> now, increasing the use of __GFP_NOFAIL may cause a different set of
> problems (unbounded retries combined with ATOMIC allocation failures) as
> they compete for similar resources.

I have commented on the reasoning behind the above code in another reply.
Let me just comment on this particular concern. I completely agree that
any use of __GFP_NOFAIL should be carefully evaluated. This is a very
strong requirement and it should be used only as a last resort.
On the other hand, existing open coded nofail code that _doesn't_ really
do any clever tricks to allow forward progress (e.g. dropping locks,
kicking some internal caching mechanisms etc.) should just be converted
to __GFP_NOFAIL. Not only does that make the code easier to spot, it
also allows the page allocator to behave consistently and predictably.
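
To illustrate the kind of conversion meant here, below is a minimal
userspace sketch, not the actual ext4 or page allocator code. All mock_*
names and MOCK_GFP_* values are hypothetical stand-ins. The first helper
retries in the caller (where the kernel code would sit in
congestion_wait()), the second one passes a nofail flag and lets the
allocator own the retry policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel GFP flags; values are illustrative. */
#define MOCK_GFP_NOFS   0x1u
#define MOCK_GFP_NOFAIL 0x2u

/* Knob: how many upcoming allocation attempts should fail. */
static int mock_failures_left;
/* Retries performed by an open coded loop in the caller. */
static int caller_retries;
/* Retries performed inside the (mock) allocator. */
static int allocator_retries;

/* A single allocation attempt that is allowed to fail. */
static void *mock_try_alloc(size_t size)
{
	if (mock_failures_left > 0) {
		mock_failures_left--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Mock allocator entry point. With MOCK_GFP_NOFAIL it keeps retrying
 * internally and never returns NULL; without it, one failed attempt is
 * reported to the caller.
 */
static void *mock_kmalloc(size_t size, unsigned int flags)
{
	void *p = mock_try_alloc(size);

	while (!p && (flags & MOCK_GFP_NOFAIL)) {
		allocator_retries++;
		p = mock_try_alloc(size);
	}
	return p;
}

/* Before: open coded nofail loop with no clever forward progress tricks. */
static void *alloc_open_coded(size_t size)
{
	void *p;

	while (!(p = mock_kmalloc(size, MOCK_GFP_NOFS)))
		caller_retries++; /* kernel code would congestion_wait() here */
	return p;
}

/* After: the requirement is stated in the flags, the allocator retries. */
static void *alloc_nofail(size_t size)
{
	void *p = mock_kmalloc(size, MOCK_GFP_NOFS | MOCK_GFP_NOFAIL);

	assert(p != NULL); /* nofail semantics: NULL is never returned */
	return p;
}
```

Both helpers return a valid pointer eventually, but only the second one
is visible to the allocator as a nofail request, so any change to the
retry or reserve policy applies to it without touching the caller.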

If the existing heuristic wrt. memory reserves for __GFP_NOFAIL turns out
to be suboptimal, we can fix it for all those users.

Dropping the rest of the email which talks about reclaim changes because
I will need much more time to digest that.
[...]
--
Michal Hocko
SUSE Labs