Re: [PATCH] oom: always panic on OOM when panic_on_oom is configured
From: Michal Hocko
Date: Fri Jun 05 2015 - 07:13:12 EST
On Thu 04-06-15 16:12:27, David Rientjes wrote:
> On Mon, 1 Jun 2015, Michal Hocko wrote:
>
> > panic_on_oom allows the administrator to set the OOM policy to panic
> > the system when it is out of memory, to reduce failover time, e.g. when
> > resolving the OOM condition would take much more time than rebooting
> > the system.
> >
> > out_of_memory tries to be clever and prevent premature panics
> > by checking the current task and skipping the panic when the task
> > has a fatal signal pending, so it should die shortly and release some
> > memory. This is fair enough but Tetsuo Handa has noted that this might
> > lead to a silent deadlock when current cannot exit because of
> > dependencies invisible to the OOM killer.
> >
> > panic_on_oom is disabled by default and if somebody enables it then any
> > risk of potential deadlock is certainly unwelcome. The risk is really
> > low because there are usually more sources of allocation requests and
> > one of them would eventually trigger the panic but it is better to
> > reduce the risk as much as possible.
> >
> > Let's move check_panic_on_oom up before the current task is
> > checked so that the knob value is honored. Do the same for the memcg
> > in mem_cgroup_out_of_memory.
> >
> > Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
>
> Nack, this is not the appropriate response to exit path livelocks. By
> doing this, you are going to start unnecessarily panicking machines that
> have panic_on_oom set when it would not have triggered before. If there
> is no reclaimable memory and a process that has already been signaled to
> die and is in the process of exiting has to allocate memory, it is
> perfectly acceptable to give them access to memory reserves so they can
> allocate and exit. Under normal circumstances, that allows the process to
> naturally exit. With your patch, it will cause the machine to panic.
Isn't that what the administrator of the system wants? The system
is _clearly_ out of memory at this point. A coincidentally exiting task
doesn't change a lot in that regard. Moreover, it increases the risk of
an unnecessarily unresponsive system, which is exactly what panic_on_oom
tries to prevent. So from my POV this is a clear violation of the user
policy.
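For illustration only, the ordering change under discussion can be
modelled in userspace roughly as follows. This is a simplified sketch
and not the code from the patch: struct task_model, enum oom_outcome and
both functions are hypothetical names, and panic_on_oom is reduced to a
boolean, so the constraint handling of the real knob is not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state of "current"; not a kernel type. */
struct task_model {
	bool fatal_signal_pending;	/* task has been killed but not yet exited */
	bool exiting;			/* task is already on its exit path */
};

/* Hypothetical outcomes of one OOM invocation in this model. */
enum oom_outcome {
	OOM_PANIC,		/* panic_on_oom policy enforced */
	OOM_GRANT_RESERVES,	/* let current use memory reserves and exit */
	OOM_KILL_VICTIM,	/* fall through to regular victim selection */
};

/*
 * Ordering before the patch: a dying current short-circuits the path,
 * so the panic_on_oom check is never reached for it.
 */
static enum oom_outcome oom_before_patch(const struct task_model *cur,
					 bool panic_on_oom)
{
	if (cur->fatal_signal_pending || cur->exiting)
		return OOM_GRANT_RESERVES;
	if (panic_on_oom)
		return OOM_PANIC;
	return OOM_KILL_VICTIM;
}

/*
 * Ordering after the patch: the policy check comes first, so the knob
 * is honored regardless of the state of current.
 */
static enum oom_outcome oom_after_patch(const struct task_model *cur,
					bool panic_on_oom)
{
	if (panic_on_oom)
		return OOM_PANIC;
	if (cur->fatal_signal_pending || cur->exiting)
		return OOM_GRANT_RESERVES;
	return OOM_KILL_VICTIM;
}
```

In this model the two orderings differ only when panic_on_oom is set and
current is already dying: the old ordering hands out reserves and waits
for the exit, the new one panics.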
> It's this simple: panic_on_oom is not a solution to work around oom killer
> livelocks and shouldn't be suggested as the canonical way that such
> possibilities should be addressed.
I wasn't suggesting that at all.
--
Michal Hocko
SUSE Labs