Re: [patch -mm] mm, oom: add global access to memory reserves on livelock

From: David Rientjes
Date: Thu Aug 27 2015 - 16:52:53 EST


On Thu, 27 Aug 2015, Michal Hocko wrote:

> > If Andrew would prefer moving in a direction where all Linux users are
> > required to have their admin use sysrq+f to manually trigger an oom kill,
> > which may or may not resolve the livelock since there's no way to
> > determine which process is holding the common mutex (or even which
> > processes are currently allocating), in such situations, then we can carry
> > this patch internally. I disagree with that solution for upstream Linux.
>
> There are other possibilities than the manual sysrq intervention. E.g.
> the already mentioned oom_{panic,reboot}_timeout, which has the small
> advantage that it allows the admin to opt in to the policy rather than
> having it hard coded into the kernel.
>

This falls under my scenario (2) from Tuesday's message:

(2) depletion of memory reserves, which can also happen today without
this patchset and we have fixed in the past.

You can deplete memory reserves today without access to global reserves on
oom livelock. I'm indifferent to whether the machine panics as soon as
memory reserves are fully depleted, independent of oom livelock and this
patch to address it, or whether there is a configurable timeout. It's an
independent issue, though, since the oom killer is not the only way for
this to happen and it seems there will be additional possibilities in the
future (the __GFP_NOFAIL case you bring up).

> > My patch has defined that by OOM_EXPIRE_MSECS. The premise is that an oom
> > victim with full access to memory reserves should never take more than 5s
> > to exit, which I consider a very long time. If it's increased, we see
> > userspace responsiveness issues with our processes that monitor system
> > health which timeout.
>
> Yes but it sounds very much like a policy which should better be defined
> from the userspace because different users might have different
> preferences.
>

The version of this patch that we carry internally actually does make this
configurable through yet another VM sysctl, and it defaults to the value
of OOM_EXPIRE_MSECS in my patch. We would probably never increase it, but
may decrease it from the default of 5000 ms. I was concerned about adding
another sysctl that doesn't
have a clear user. If you feel that OOM_EXPIRE_MSECS is too small and
believe there would be a user who desires their system to be livelocked
for 10s, 5m, 1h, etc, then I can add the sysctl upstream as well, even if
it's unjustified as far as I'm concerned.
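
For illustration only, the expiry check discussed above can be sketched in
plain userspace C. This is not the code from the patch: the struct, the
function name, and the idea of passing the timeout as a parameter (standing
in for a sysctl) are all hypothetical, and the only values taken from this
thread are the name OOM_EXPIRE_MSECS and its default of 5000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default taken from the message above: 5000 ms. */
#define OOM_EXPIRE_MSECS 5000u

/* Hypothetical record of when the current oom victim was selected. */
struct oom_victim {
	uint64_t selected_at_ms;
};

/*
 * Hypothetical helper: returns true once the victim has had at least
 * expire_ms to exit. A caller could pass OOM_EXPIRE_MSECS or a smaller
 * tunable value. If the clock reads earlier than the selection time,
 * the victim is treated as not expired.
 */
bool oom_victim_expired(const struct oom_victim *victim,
			uint64_t now_ms, uint64_t expire_ms)
{
	if (now_ms < victim->selected_at_ms)
		return false;
	return now_ms - victim->selected_at_ms >= expire_ms;
}
```

With a victim selected at t = 1000 ms and the default timeout, the check
stays false through t = 5999 ms and becomes true at t = 6000 ms. What
happens after expiry (here, access to memory reserves for allocators) is a
separate policy decision and is not modeled in this sketch.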