Re: [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY_HARD

From: Michal Hocko
Date: Thu Jun 16 2016 - 07:26:17 EST


On Thu 16-06-16 10:03:55, Michal Hocko wrote:
> On Thu 16-06-16 10:23:02, Dave Chinner wrote:
> > On Mon, Jun 06, 2016 at 01:32:16PM +0200, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@xxxxxxxx>
> > >
> > > KM_MAYFAIL didn't have any suitable GFP_FOO counterpart until recently
> > > so it relied on the default page allocator behavior for the given set
> > > of flags. This means that small allocations actually never failed.
> > >
> > > Now that we have the __GFP_RETRY_HARD flag, which works independently of
> > > the allocation request size, we can map KM_MAYFAIL to it. The allocator
> > > will try as hard as it can to fulfill the request but eventually fails
> > > if no progress can be made.
> > >
> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > > ---
> > > fs/xfs/kmem.h | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> > > index 689f746224e7..34e6b062ce0e 100644
> > > --- a/fs/xfs/kmem.h
> > > +++ b/fs/xfs/kmem.h
> > > @@ -54,6 +54,9 @@ kmem_flags_convert(xfs_km_flags_t flags)
> > >  			lflags &= ~__GFP_FS;
> > >  	}
> > >
> > > +	if (flags & KM_MAYFAIL)
> > > +		lflags |= __GFP_RETRY_HARD;
> > > +
> >
> > I don't understand. KM_MAYFAIL means "caller handles
> > allocation failure, so retry on failure is not required." To then
> > map KM_MAYFAIL to a flag that implies the allocation will internally
> > retry to try exceptionally hard to prevent failure seems wrong.
>
> The primary point, which I've tried to describe in the changelog, is
> that the default allocator behavior is to retry endlessly for small
> orders. You can override this by using __GFP_NORETRY which doesn't retry
> at all and fails quite early. My understanding of KM_MAYFAIL is that
> the caller can cope with allocation failures. The lack of __GFP_NORETRY
> made me think that the failure should be prevented as much as possible.
> __GFP_RETRY_HARD is semantically somewhere in the middle between
> __GFP_NORETRY and __GFP_NOFAIL, regardless of the allocation size.
>
> Does that make more sense now?

I would add the following explanation into the code:
diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
index 34e6b062ce0e..10708f065191 100644
--- a/fs/xfs/kmem.h
+++ b/fs/xfs/kmem.h
@@ -54,6 +54,13 @@ kmem_flags_convert(xfs_km_flags_t flags)
 			lflags &= ~__GFP_FS;
 	}

+	/*
+	 * The default page/slab allocator behavior is to retry forever
+	 * for small allocations. We can override this behavior by using
+	 * __GFP_RETRY_HARD, which tells the allocator to retry as long
+	 * as it is feasible but to fail rather than retry forever,
+	 * regardless of the request size.
+	 */
 	if (flags & KM_MAYFAIL)
 		lflags |= __GFP_RETRY_HARD;
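
To make the intended semantics concrete, here is a rough userspace model of the three retry behaviors discussed above. It is only a sketch and not kernel code: the MODEL_* flag values, model_flags_convert(), model_alloc() and the "budget" parameter are all made up for illustration, and the real allocator decides what is "feasible" from reclaim and compaction progress rather than from a fixed attempt count.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical flag bits for this model only; not the kernel's values. */
#define MODEL_KM_MAYFAIL	0x1u
#define MODEL_GFP_NORETRY	0x1u
#define MODEL_GFP_RETRY_HARD	0x2u
#define MODEL_GFP_NOFAIL	0x4u

/* Mirrors the shape of the patched kmem_flags_convert(): MAYFAIL -> RETRY_HARD. */
static unsigned int model_flags_convert(unsigned int km_flags)
{
	unsigned int lflags = 0;

	if (km_flags & MODEL_KM_MAYFAIL)
		lflags |= MODEL_GFP_RETRY_HARD;
	return lflags;
}

/*
 * Model one allocation request. The request would succeed on attempt
 * number 'succeeds_on' (1-based). 'budget' is the number of attempts this
 * model treats as "feasible" for RETRY_HARD (an assumption of the model).
 * Returns the attempt number that succeeded, or 0 if the request failed.
 */
static int model_alloc(unsigned int lflags, int succeeds_on, int budget)
{
	int limit;

	assert(succeeds_on >= 1);

	if (lflags & MODEL_GFP_NORETRY)
		limit = 1;		/* fail early, no retry */
	else if (lflags & MODEL_GFP_RETRY_HARD)
		limit = budget;		/* retry while feasible, then fail */
	else
		limit = INT_MAX;	/* default small order or NOFAIL: keep retrying */

	for (int attempt = 1; attempt <= limit; attempt++) {
		if (attempt >= succeeds_on)
			return attempt;
	}
	return 0;
}
```

In this model a request that needs three attempts fails with MODEL_GFP_NORETRY, succeeds with MODEL_GFP_RETRY_HARD when the budget is five, and a request that needs eight attempts fails with MODEL_GFP_RETRY_HARD but still succeeds with the default behavior. That middle ground is what KM_MAYFAIL callers would get after the mapping.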


--
Michal Hocko
SUSE Labs