Re: [RFC PATCH 3/4] xfs: map KM_MAYFAIL to __GFP_RETRY_MAYFAIL
From: Michal Hocko
Date: Wed Mar 08 2017 - 08:14:57 EST
On Wed 08-03-17 20:23:37, Tetsuo Handa wrote:
> On 2017/03/08 0:48, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@xxxxxxxx>
> > KM_MAYFAIL didn't have any suitable GFP_FOO counterpart until recently
> > so it relied on the default page allocator behavior for the given set
> > of flags. This means that small allocations actually never failed.
> > Now that we have the __GFP_RETRY_MAYFAIL flag, which works independently of
> > the allocation request size, we can map KM_MAYFAIL to it. The allocator will
> > try as hard as it can to fulfill the request but eventually fails if
> > no progress can be made.
> > Cc: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> > fs/xfs/kmem.h | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> > diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> > index ae08cfd9552a..ac80a4855c83 100644
> > --- a/fs/xfs/kmem.h
> > +++ b/fs/xfs/kmem.h
> > @@ -54,6 +54,16 @@ kmem_flags_convert(xfs_km_flags_t flags)
> >  			lflags &= ~__GFP_FS;
> >  	}
> > +	/*
> > +	 * Default page/slab allocator behavior is to retry for ever
> > +	 * for small allocations. We can override this behavior by using
> > +	 * __GFP_RETRY_MAYFAIL which will tell the allocator to retry as long
> > +	 * as it is feasible but rather fail than retry for ever for all
> > +	 * request sizes.
> > +	 */
> > +	if (flags & KM_MAYFAIL)
> > +		lflags |= __GFP_RETRY_MAYFAIL;
> I don't see the advantage of supporting both __GFP_NORETRY and
> __GFP_RETRY_MAYFAIL. kmem_flags_convert() can always set __GFP_NORETRY because
> the callers use an open-coded __GFP_NOFAIL loop (with a possible allocation
> lockup warning) unless KM_MAYFAIL is set.
The behavior would be different (e.g. the OOM killer handling).
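
For readers who want to see the mapping in isolation, here is a minimal
userspace sketch of what the hunk above does. It is not the XFS code: the
MODEL_* names and bit values are invented for illustration, only the two
flags visible in the quoted hunk are modelled, and the assert() merely
stands in for whatever sanity checking the real helper does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values for illustration only; the kernel's differ. */
#define MODEL_KM_NOFS            0x1u
#define MODEL_KM_MAYFAIL         0x2u

#define MODEL_GFP_FS             0x10u
#define MODEL_GFP_RETRY_MAYFAIL  0x20u
#define MODEL_GFP_KERNEL         (0x40u | MODEL_GFP_FS)

/*
 * Model of the flag conversion: start from a GFP_KERNEL-like mask,
 * drop the FS bit for NOFS callers, and add the "retry but may fail"
 * bit for MAYFAIL callers.
 */
static uint32_t model_flags_convert(uint32_t km_flags)
{
	uint32_t lflags = MODEL_GFP_KERNEL;

	/* Reject flags this model does not know about. */
	assert((km_flags & ~(MODEL_KM_NOFS | MODEL_KM_MAYFAIL)) == 0);

	if (km_flags & MODEL_KM_NOFS)
		lflags &= ~MODEL_GFP_FS;

	/* The new mapping: MAYFAIL callers accept an eventual failure. */
	if (km_flags & MODEL_KM_MAYFAIL)
		lflags |= MODEL_GFP_RETRY_MAYFAIL;

	return lflags;
}
```

Callers that do not pass the MAYFAIL flag get the same mask as before, so
only callers that already handle failure see the new behavior.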
> line, which is likely always true); but this is off-topic for this thread.
> where both __GFP_NORETRY and __GFP_RETRY_MAYFAIL are checked after
> direct reclaim and compaction failed. __GFP_RETRY_MAYFAIL optimistically
> retries when any of should_reclaim_retry(), should_compact_retry(), or
> read_mems_allowed_retry() returns true, or when mutex_trylock(&oom_lock) in
> __alloc_pages_may_oom() returns 0. If !__GFP_FS allocation requests keep
> holding oom_lock in turn, __GFP_RETRY_MAYFAIL allocation requests (which
> are likely !__GFP_FS allocation requests, because __GFP_FS allocation requests
> are blocked on direct reclaim) can be blocked for an uncontrollable duration
> without making progress. It seems to me that the difference between
> __GFP_NORETRY and __GFP_RETRY_MAYFAIL is not useful. Rather, the caller can
> set __GFP_NORETRY and retry under its own control (e.g. set __GFP_HIGH upon
> the first timeout, give up upon the second timeout).
You are drowning in implementation details here. Try to step back and think
about the high-level semantic I would like to achieve, which is
essentially a middle ground between __GFP_NORETRY, which doesn't retry,
and __GFP_NOFAIL, which retries forever. I believe there are users who could
benefit from such a semantic (the most prominent example is kvmalloc,
which has different modes of how hard to try kmalloc before giving up
and falling back to vmalloc).
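
To make the three levels concrete, here is a rough userspace model of the
semantic described above. It is only a sketch under stated assumptions:
the model_* names are hypothetical, memory pressure is simulated with two
counters, "progress" is a stand-in for whatever the reclaim and compaction
feedback reports, and malloc() stands in for both the successful
allocation and the vmalloc fallback. It says nothing about how the page
allocator implements this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum model_policy {
	MODEL_NORETRY,		/* one attempt, then fail */
	MODEL_RETRY_MAYFAIL,	/* retry while progress is made, then fail */
	MODEL_NOFAIL,		/* retry until an attempt succeeds */
};

/* Simulated memory pressure (all fields are assumptions of the model). */
struct model_pressure {
	int succeed_after;	/* number of attempts that fail first */
	int progress_rounds;	/* failed attempts that still count as progress */
	int attempts;		/* attempts made so far */
};

/* One allocation attempt against the simulated pressure. */
static void *model_try_once(struct model_pressure *p, size_t size)
{
	p->attempts++;
	if (p->attempts <= p->succeed_after)
		return NULL;
	return malloc(size);
}

/* Did the last failed attempt still make progress? */
static bool model_progress(const struct model_pressure *p)
{
	return p->attempts <= p->progress_rounds;
}

static void *model_alloc(struct model_pressure *p, size_t size,
			 enum model_policy policy)
{
	assert(p != NULL);

	for (;;) {
		void *ptr = model_try_once(p, size);

		if (ptr)
			return ptr;
		if (policy == MODEL_NORETRY)
			return NULL;
		if (policy == MODEL_RETRY_MAYFAIL && !model_progress(p))
			return NULL;
		/* MODEL_NOFAIL, or MAYFAIL with progress: try again. */
	}
}

/* kvmalloc-like caller: try hard first, then fall back. */
static void *model_kvmalloc(struct model_pressure *p, size_t size,
			    bool *used_fallback)
{
	void *ptr = model_alloc(p, size, MODEL_RETRY_MAYFAIL);

	*used_fallback = (ptr == NULL);
	if (!ptr)
		ptr = malloc(size);	/* stand-in for the vmalloc fallback */
	return ptr;
}
```

With succeed_after = 10 and progress_rounds = 2, the NORETRY policy gives
up after one attempt, the MAYFAIL policy after three, and the NOFAIL policy
keeps going until the eleventh attempt succeeds. The kvmalloc-like caller
benefits from the middle case: it tries harder than a single attempt but
still gets a failure it can react to.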