Re: [PATCH 2/2] mm, oom: do not enforce OOM killer for __GFP_NOFAIL automatically

From: Michal Hocko
Date: Tue Dec 06 2016 - 14:25:52 EST


On Tue 06-12-16 12:03:02, Vlastimil Babka wrote:
> On 12/06/2016 11:38 AM, Tetsuo Handa wrote:
> >>
> >> So we are somewhere in the middle between premature and pointless
> >> system disruption (GFP_NOFS with lots of metadata, or lowmem requests),
> >> where the OOM killer might not even help, and a potential lockup, which
> >> is inevitable with the current design. Dunno about you but I would rather
> >> go with the first option. To be honest I really fail to understand your
> >> line of argumentation. We have this
> >> 	do {
> >> 		cond_resched();
> >> 	} while (!(page = alloc_page(GFP_NOFS)));
> >> vs.
> >> page = alloc_page(GFP_NOFS | __GFP_NOFAIL);
> >>
> >> the first one doesn't invoke the OOM killer while the latter does. This
> >> discrepancy just cannot make any sense... The same is true for
> >>
> >> alloc_page(GFP_DMA) vs alloc_page(GFP_DMA|__GFP_NOFAIL)
> >>
> >> Now we can discuss whether it is a _good_ idea to not invoke the OOM
> >> killer for those exceptions, but whatever we do, __GFP_NOFAIL is not the
> >> way to introduce such a subtle side effect. Or do you disagree even with
> >> that?
> >
> > "[PATCH 1/2] mm: consolidate GFP_NOFAIL checks in the allocator slowpath"
> > silently changes __GFP_NOFAIL vs. __GFP_NORETRY priority.
>
> I guess that wasn't intended?

I didn't even think about that possibility because it just doesn't make
any sense.

> > Currently, __GFP_NORETRY is stronger than __GFP_NOFAIL; __GFP_NOFAIL
> > allocation requests fail without invoking the OOM killer when both
> > __GFP_NORETRY and __GFP_NOFAIL are given.
> >
> > With [PATCH 1/2], __GFP_NOFAIL becomes stronger than __GFP_NORETRY;
> > __GFP_NOFAIL allocation requests will loop forever without invoking
> > the OOM killer when both __GFP_NORETRY and __GFP_NOFAIL are given.
>
> Does such a combination of flags make sense? Should we warn about it, or
> even silently remove __GFP_NORETRY in such a case?

No, this combination doesn't make any sense. I seriously doubt we should
even care about it; simply following the stronger requirement
(__GFP_NOFAIL) makes more sense from a semantic point of view.
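
To make the two orderings concrete, here is a rough userspace sketch of
the priority change Tetsuo describes. It is not the allocator slowpath:
the SKETCH_* names, the flag values and both functions are made up for
illustration, and the sketch only models the final "give up or keep
looping" decision, ignoring reclaim, compaction and the OOM killer.

```c
/* Hypothetical flag bits for illustration; NOT the kernel's real values. */
#define SKETCH_NORETRY 0x1u
#define SKETCH_NOFAIL  0x2u

/* What the (modelled) slowpath does once an attempt has failed. */
enum sketch_outcome {
	SKETCH_FAIL,	/* give up and return NULL to the caller */
	SKETCH_RETRY	/* keep looping until the allocation succeeds */
};

/*
 * Ordering described as the current behaviour: the NORETRY check comes
 * first, so a request carrying both flags is allowed to fail.
 */
enum sketch_outcome sketch_before(unsigned int flags)
{
	if (flags & SKETCH_NORETRY)
		return SKETCH_FAIL;
	if (flags & SKETCH_NOFAIL)
		return SKETCH_RETRY;
	return SKETCH_FAIL;	/* simplification: no other retry logic modelled */
}

/*
 * Ordering described for [PATCH 1/2]: the NOFAIL check comes first, so
 * a request carrying both flags loops instead of failing.
 */
enum sketch_outcome sketch_after(unsigned int flags)
{
	if (flags & SKETCH_NOFAIL)
		return SKETCH_RETRY;
	return SKETCH_FAIL;	/* NORETRY, or neither flag, in this toy model */
}
```

The two functions differ only for SKETCH_NORETRY | SKETCH_NOFAIL, which
is exactly the combination that doesn't make sense to pass in the first
place.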

--
Michal Hocko
SUSE Labs