Re: [RFC 1/6] mm, page_alloc: fix more premature OOM due to race with cpuset update

From: Christoph Lameter
Date: Sun Apr 30 2017 - 17:33:37 EST


On Wed, 26 Apr 2017, Vlastimil Babka wrote:

> > Such an application typically already has such logic and executes a
> > binding after discovering its numa node configuration on startup. It would
> > have to be modified to redo that action when it gets some sort of a signal
> > from the script telling it that the node config would be changed.
> >
> > Having this logic in the application instead of the kernel avoids all the
> > kernel messes that we keep on trying to deal with and IMHO is much
> > cleaner.
>
> That would be much simpler for us indeed. But we still IMHO can't
> abruptly start denying page fault allocations for existing applications
> that don't have the necessary awareness.

We certainly can do that. The failure of the page faults are due to the
admin trying to move an application that is not aware of this and is using
mempols. That could be an error. Trying to move an application that
contains both absolute and relative node numbers is definitely something
that is potentiall so screwed up that the kernel should not muck around
with such an app.

Also user space can determine if the application is using memory policies
and can then take appropriate measures (message to the sysadmin to eval
tge situation f.e.) or mess aroud with the processes memory policies on
its own.

So this is certainly a way out of this mess.