Re: [PATCH] oom: split out forced OOM killer

From: Michal Hocko
Date: Tue Jun 09 2015 - 05:37:11 EST


On Mon 08-06-15 16:06:07, David Rientjes wrote:
> On Mon, 8 Jun 2015, Michal Hocko wrote:
>
> > > This patch is not a functional change, so I don't interpret your feedback
> > > as any support of it being merged.
> >
> > David, have you actually read the patch? The changelog is mentioning this:
> > "
> > check_panic_on_oom on the other hand will work and that is kind of
> > unexpected because sysrq+f should be usable to kill a mem hog whether
> > the global OOM policy is to panic or not.
> > It also doesn't make much sense to panic the system when no task cannot
> > be killed because admin has a separate sysrq for that purpose.
> > "
> > and the patch exludes panic_on_oom from the sysrq path.
> >
>
> Yes, and that's why I believe we should pursue that direction without the
> associated "cleanup" that adds 35 lines of code to supress a panic. In
> other words, there's no reason to combine a patch that suppresses the
> panic even with panic_on_oom, which I support, and a "cleanup" that I
> believe just obfuscates the code.
>
> It's a one-liner change: just test for force_kill and suppress the panic;
> we don't need 35 new lines that create even more unique entry paths.

I completely detest yet another check in out_of_memory. And there is
even no reason to do that. Forced kill and genuine oom have different
objectives and combining those two just makes the code harder to read
(one has to go to check the syrq callback to realize that the forced
path is triggered from the workqueue context and that current->mm !=
NULL check will prevent some heuristics. This is just too ugly to
live). So why the heck are you pushing for keeping everything in a
single path?

That being said, I have no problem to do 3 patches, where two of them
would add force check for check_panic_on_oom and panic on no killable
task and only then pull out force_out_of_memory to make it readable
again and drop force checks but I do not see much point in this
juggling.

> > > That said, you raise an interesting point of whether sysrq+f should ever
> > > trigger a panic due to panic_on_oom. The case can be made that it should
> > > ignore panic_on_oom and require the use of another sysrq to panic the
> > > machine instead. Sysrq+f could then be used to oom kill a process,
> > > regardless of panic_on_oom, and the panic only occurs if userspace did not
> > > trigger the kill or the kill itself will fail.
> >
> > Why would it panic the system if there is no killable task? Shoudln't
> > be admin able to do additional steps after the explicit oom killer failed
> > and only then panic by sysrq?
> >
>
> Today it panics, I don't think it should panic when there are no killable
> processes because it's inherently racy with userspace. It's similar to
> suppressing panic_on_oom for sysrq+f, but for a different reason, so it
> should probably be a separate patch with its own changelog (and update to
> documentation for both patches to make this explicit).

I have no problem to be more explicit about the behavior of course. I
can fold it to the original patch.
---