Re: [RFC PATCH] mm, oom: move GFP_NOFS check to out_of_memory

From: Michal Hocko
Date: Wed Mar 30 2016 - 05:47:57 EST


On Tue 29-03-16 15:13:54, David Rientjes wrote:
> On Tue, 29 Mar 2016, Michal Hocko wrote:
>
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 86349586eacb..1c2b7a82f0c4 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -876,6 +876,10 @@ bool out_of_memory(struct oom_control *oc)
> >  		return true;
> >  	}
> >
> > +	/* The OOM killer does not compensate for IO-less reclaim. */
> > +	if (!(oc->gfp_mask & __GFP_FS))
> > +		return true;
> > +
> >  	/*
> >  	 * Check if there were limitations on the allocation (only relevant for
> >  	 * NUMA) that may require different handling.
>
> I don't object to this necessarily, but I think we need input from those
> that have taken the time to implement their own oom notifier to see if
> they agree. In the past, they would only be called if reclaim has
> completely failed; now, they can be called in low memory situations when
> reclaim has had very little chance to be successful. Getting an ack from
> them would be helpful.

I will make sure to put them on CC and mention this in the changelog
when I post this next time. Personally, I think this shouldn't make
much difference in real life, because GFP_NOFS-only loads are rare,
and we should help by releasing memory when it is available rather
than rely on something else to do it for us. Waiting for Godot is
never a good strategy.
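
To make the ordering under discussion concrete, here is a minimal
userspace sketch. It is not kernel code: MODEL_GFP_FS, struct
model_oom_control, model_call_notifiers() and the two counters are
made-up stand-ins, and it assumes (as the comment above implies) that
the notifier chain runs before the new __GFP_FS check.

```c
/* Userspace model only; names and flag values are illustrative assumptions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_GFP_FS 0x1u	/* stand-in for __GFP_FS; value is arbitrary */

struct model_oom_control {
	unsigned int gfp_mask;
};

/* Counters so a caller can observe which steps actually ran. */
static unsigned long notifier_calls;
static unsigned long victims_killed;

/* Stand-in for the OOM notifier chain: reports how many pages were freed. */
static unsigned long model_call_notifiers(unsigned long freeable)
{
	notifier_calls++;
	return freeable;
}

static bool model_out_of_memory(const struct model_oom_control *oc,
				unsigned long freeable)
{
	assert(oc != NULL);

	/* Notifiers are consulted for every request, !__GFP_FS ones included. */
	if (model_call_notifiers(freeable) > 0)
		return true;

	/* The check from the patch: no kill on behalf of IO-less reclaim. */
	if (!(oc->gfp_mask & MODEL_GFP_FS))
		return true;

	/* Only now would a victim be selected and killed. */
	victims_killed++;
	return true;
}
```

In this model, a request with gfp_mask == 0 and nothing freeable bumps
notifier_calls but leaves victims_killed untouched, which is the
behavior change for notifier users raised above.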

> I also think we have discussed this before, but I think the oom notifier
> handling should be done in the page allocator proper, i.e. in
> __alloc_pages_may_oom(). We can leave out_of_memory() for a clearly defined
> purpose: to kill a process when all reclaim has failed.

I vaguely remember there was some issue with that the last time we
discussed it. AFAIR it was the duplication between the page fault and
allocator paths. Nothing that cannot be handled, but the OOM notifier
API is just too ugly to spread outside the OOM code proper, I guess.
Why can't we move those users to the proper shrinker interface (after
it gets extended with a priority of some sort, so that some objects are
released only when we are really in trouble)? Something for a separate
discussion, though...

--
Michal Hocko
SUSE Labs