Re: [PATCH] MM: discard __GFP_ATOMIC

From: Michal Hocko
Date: Mon Nov 22 2021 - 11:54:30 EST

On Sat 20-11-21 21:51:20, Neil Brown wrote:
> On Sat, 20 Nov 2021, Matthew Wilcox wrote:
> > On Fri, Nov 19, 2021 at 10:14:38AM +1100, NeilBrown wrote:
> > > On Thu, 18 Nov 2021, Matthew Wilcox wrote:
> > > > Surely this should be gfpflags_allow_blocking() instead of poking about
> > > > in the innards of gfp flags?
> > >
> > > Possibly. Didn't know about gfpflags_allow_blocking(). From a quick
> > > grep in the kernel, a whole lot of other people don't know about it
> > > either, though clearly some do.
> > >
> > > Maybe we should reaname "__GFP_DIRECT_RECLAIM" to
> > > "__GFP_ALLOW_BLOCKING", because that is what most users seems to care
> > > about.
> >
> > I tend towards the school of thought that the __GFP flags should make
> > sense to the implementation and users should use either GFP_ or functions.
> > When we see users adding or subtracting __GFP flags, that's a problem.
> Except __GFP_NOWARN of course, or __GFP_ZERO, or __GFP_NOFAIL.
> What about __GFP_HIGHMEM? __GFP_DMA? __GFP_HIGH?
> They all seem to be quite meaningful to the caller - explicitly
> specifying properties of the memory or properties of the service.
> (But maybe you would prefer __GFP_HIGH be spelled "__GFP_LOW_WATERMARK"
> so it would make more sense to the implementation).
> __GFP_DIRECTRECLAIM seems to me to be more the exception than the rule -
> specifying internal implementation details.

I do not think it is viable to fix up gfp flags to be consistent :/
Both __GFP_DIRECT_RECLAIM and __GFP_KSWAPD_RECLAIM are way too lowlevel
but historically we've had requests to inhibit kswapd for a particular
requests because that has led to problems - fun reading caf491916b1c1.
__GFP_ALLOW_BLOCKING would make a lot of sense but I am not sure it
would be a good match to __GFP_KSWAPD_RECLAIM.

> Actually ... I take it back about __GFP_NOWARN. That probably shouldn't
> exist at all. Warnings should be based on how stressed the mm system is,
> not on whether the caller wants thinks failure is manageable.

Unless we change the way when allocation warnings are triggered then we
really need this. There are many opportunistic allocations with a
fallback behavior which do not want to swamp kernel logs with failures
that are of no use. Think of a THP allocation that really want to be
just very quick and falls back to normal base pages otherwise. Deducing
context which is just fine to not report failures is quite tricky and it
can get wrong easily. Callers should know whether warning can be of any
use in many cases.
Michal Hocko