Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range

From: Michal Hocko
Date: Tue Jan 26 2021 - 12:38:04 EST


On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > Contiguous memory allocation can be stalled due to waiting
> > > on page writeback and/or page lock, which causes unpredictable
> > > delay. It's an unavoidable cost for the requestor to get *big*
> > > contiguous memory, but it's expensive for *small* contiguous
> > > memory (e.g., order-4) because the caller could retry the request
> > > in a different range that has easily migratable pages
> > > without stalling.
> > >
> > > This patch introduces __GFP_NORETRY as the compaction gfp_mask in
> > > alloc_contig_range so it will fail fast without blocking
> > > when it encounters pages that need waiting.
> >
> > I am not against controlling how hard this allocator tries with gfp mask
> > but this changelog is rather void on any data and any user.
> >
> > It is also rather dubious to have retries when the caller says to not
> > retry.
>
> Since max_tries is 1 with ++tries, it shouldn't retry.

OK, I have missed that. This is tricky code. ASYNC mode should be
completely orthogonal to the retries count. Those are different things.
Page allocator does an explicit bail out based on __GFP_NORETRY. You
should be doing the same.
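
To illustrate the point, here is a minimal user-space sketch, not kernel
code and not the patch under review. Every model_* and MODEL_* name is
made up for the example, and the flag value is arbitrary. It only shows
the retry decision being taken from the NORETRY bit directly, with the
migration mode playing no part in how many passes are made:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real kernel types and values differ. */
typedef unsigned int gfp_t;
#define MODEL_GFP_NORETRY 0x1u

enum model_migrate_mode { MODEL_MIGRATE_ASYNC, MODEL_MIGRATE_SYNC };

struct model_ctx {
	gfp_t gfp_mask;
	enum model_migrate_mode mode;	/* how one pass behaves */
	int max_tries;			/* how many passes are allowed */
	int attempts;			/* filled in by model_migrate_range() */
};

/* Pretend migration pass: always fails so the retry policy is visible. */
static bool model_migrate_once(struct model_ctx *ctx)
{
	(void)ctx->mode;	/* mode would only change this pass */
	ctx->attempts++;
	return false;
}

static int model_migrate_range(struct model_ctx *ctx)
{
	int tries = 0;

	ctx->attempts = 0;
	for (;;) {
		if (model_migrate_once(ctx))
			return 0;
		/* Explicit bail out: the caller asked for no retries. */
		if (ctx->gfp_mask & MODEL_GFP_NORETRY)
			return -EBUSY;
		if (++tries >= ctx->max_tries)
			return -EBUSY;
	}
}
```

With this shape, a reader does not have to work out that some tries
counter happens to hit its limit on the first pass; the NORETRY check
says so directly.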

> >
> > Also why didn't you consider GFP_NOWAIT semantic for non blocking mode?
>
> GFP_NOWAIT seems to be a low-level (specific) flag rather than the one I
> want to express. Even though I said only page writeback/lock in the
> description, the goal is to avoid costly operations we might find later,
> so for such "failfast" I thought GFP_NORETRY would be a good fit.

I suspect you are too focused on implementation details here. Think
about the intended semantic. Callers of this functionality will not
think about those (I hope because if they rely on these details then the
whole thing will become unmaintainable because any change would require
an audit of all existing users). All you should be caring about is to
control how expensive the call can be. GFP_NOWAIT is not really low
level from that POV. It gives you a very lightweight non-sleeping
attempt to allocate. GFP_NORETRY will give you a potentially sleeping but
opportunistic, easy-to-fail attempt. And so on. See how that is
absolutely free of any page writeback or any specific locking.
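
As a rough sketch of that caller-facing view (user-space model only; the
MODEL_* bits and model_effort names are invented here and do not match
the kernel's real flag values), the mask can be read as nothing more
than "may this call sleep" and "may it keep trying":

```c
#include <stdbool.h>

/* Hypothetical bits for the model; not the kernel's definitions. */
#define MODEL_DIRECT_RECLAIM	0x1u	/* caller allows sleeping */
#define MODEL_NORETRY		0x2u	/* caller wants an easy failure */

#define MODEL_GFP_NOWAIT	0x0u
#define MODEL_GFP_KERNEL	MODEL_DIRECT_RECLAIM

struct model_effort {
	bool may_sleep;
	bool may_retry;
};

/* Translate a mask into how expensive the call is allowed to be. */
static struct model_effort model_effort_from_gfp(unsigned int gfp)
{
	struct model_effort e;

	e.may_sleep = (gfp & MODEL_DIRECT_RECLAIM) != 0;
	/* A non-sleeping attempt is a single lightweight try. */
	e.may_retry = e.may_sleep && !(gfp & MODEL_NORETRY);
	return e;
}
```

Nothing in this mapping mentions writeback or page locks. What the
implementation skips in each class is free to change without auditing
the callers.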
--
Michal Hocko
SUSE Labs