Re: [PATCH 1/2] mm, oom: Give __GFP_NOFAIL allocations access to memory reserves

From: Michal Hocko
Date: Wed Nov 25 2015 - 06:18:10 EST


On Wed 25-11-15 02:51:38, David Rientjes wrote:
> On Wed, 25 Nov 2015, Michal Hocko wrote:
>
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > __GFP_NOFAIL is a big hammer used to ensure that the allocation
> > request can never fail. This is a strong requirement and as such
> > it also deserves special treatment when the system is OOM. The
> > primary problem here is that the allocation request might have
> > come with some locks held and the oom victim might be blocked
> > on the same locks. This is basically an OOM deadlock situation.
> >
> > This patch tries to reduce the risk of such deadlocks by giving
> > __GFP_NOFAIL allocations special treatment and letting them dive into
> > memory reserves after oom killer invocation. This should help them
> > to make progress and release the resources they are holding. The OOM
> > victim should compensate for the reserves consumption.
> >
> > Suggested-by: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > ---
> > mm/page_alloc.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 8034909faad2..70db11c27046 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2766,8 +2766,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
> >  			goto out;
> >  	}
> >  	/* Exhausted what can be done so it's blamo time */
> > -	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
> > +	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
> >  		*did_some_progress = 1;
> > +
> > +		if (gfp_mask & __GFP_NOFAIL)
> > +			page = get_page_from_freelist(gfp_mask, order,
> > +					ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
> > +	}
> >  out:
> >  	mutex_unlock(&oom_lock);
> >  	return page;
>
> I don't understand why you're setting ALLOC_CPUSET if you're giving them
> "special treatment". If you want to allow access to memory reserves to
> prevent an oom livelock, then why not also allow it access to allocate
> outside its cpuset?

Good question. My thinking was that __GFP_NOFAIL allocations might be
done on behalf of a process, so they are not necessarily system wide. We
apply the same cpuset restriction before we actually go to out_of_memory.
On the other hand, __GFP_NOFAIL should be used really rarely, so breaking
the cpuset restriction shouldn't be a big deal if that helps to break out
of the potential OOM deadlock. I will drop it.
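
To make the difference between the two flag combinations concrete, here
is a toy userspace model. It is only a sketch and not kernel code:
struct toy_zone, toy_get_page() and the flag values are made up for
illustration, and the real get_page_from_freelist() does considerably
more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values only; do not rely on them matching the kernel. */
#define ALLOC_NO_WATERMARKS	0x04	/* ignore watermarks, dip into reserves */
#define ALLOC_CPUSET		0x40	/* honour the caller's cpuset */

/* Hypothetical, heavily simplified stand-in for a memory zone. */
struct toy_zone {
	long free_pages;	/* pages currently free in this zone */
	long min_watermark;	/* reserve normal allocations must leave alone */
	bool in_cpuset;		/* is the zone allowed by the caller's cpuset? */
};

/*
 * Take a single page from the first eligible zone.
 * Returns the index of the zone used, or -1 if no zone qualifies.
 */
static int toy_get_page(struct toy_zone *zones, size_t nr, int alloc_flags)
{
	assert(zones != NULL);

	for (size_t i = 0; i < nr; i++) {
		struct toy_zone *z = &zones[i];

		/* With ALLOC_CPUSET set, zones outside the cpuset are skipped. */
		if ((alloc_flags & ALLOC_CPUSET) && !z->in_cpuset)
			continue;

		/* Without ALLOC_NO_WATERMARKS the reserve must stay intact. */
		if (!(alloc_flags & ALLOC_NO_WATERMARKS) &&
		    z->free_pages - 1 < z->min_watermark)
			continue;

		/* Even the reserves cannot help if the zone is empty. */
		if (z->free_pages < 1)
			continue;

		z->free_pages--;
		return (int)i;
	}
	return -1;
}
```

In this model, a caller whose cpuset zones are completely exhausted
still fails with ALLOC_NO_WATERMARKS|ALLOC_CPUSET, while plain
ALLOC_NO_WATERMARKS lets it take a page from the reserves of a zone
outside its cpuset.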

Thanks!
---