Re: [RFC] panic_on_oom_timeout

From: Michal Hocko
Date: Wed Jun 10 2015 - 12:02:45 EST


On Wed 10-06-15 16:28:01, Michal Hocko wrote:
> On Wed 10-06-15 21:20:58, Tetsuo Handa wrote:
[...]
> > Since my version uses a per-"struct task_struct" variable (memdie_start),
> > the 5 second timeout is checked for each memory cgroup individually. It
> > can avoid unnecessary panic() calls if nobody needs to call
> > out_of_memory() again (probably because somebody freed memory
> > voluntarily) when the OOM victim cannot be terminated for some reason.
> > If we want a distinction between "the entire system is under OOM" and
> > "some memory cgroup is under OOM" because the former is urgent while the
> > latter is less urgent, it can be modified to allow different timeout
> > periods for system-wide OOM and cgroup OOM. Finally, it can give a hint
> > about "in what sequence threads got stuck" and "which thread took 5
> > seconds" when analyzing a vmcore.
>
> I will have a look at how you have implemented that, but separate
> timeouts sound like major overengineering. Also note that global vs.
> memcg OOM is not sufficient because there are other oom domains as
> mentioned above.
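
For reference, the way I read your description, the core of the timeout
part would boil down to something like the following (my own rough
paraphrase, not your patch; apart from memdie_start the names are made
up):

/* record when a task becomes an OOM victim */
static void mark_memdie(struct task_struct *tsk)
{
	set_tsk_thread_flag(tsk, TIF_MEMDIE);
	tsk->memdie_start = jiffies;	/* new member of task_struct */
}

/* later, when out_of_memory() runs again in the same oom domain */
if (time_after(jiffies, victim->memdie_start + 5 * HZ))
	panic("OOM victim stuck with TIF_MEMDIE for more than 5 seconds");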

Your patch is doing way too many things at once :/ So let me just focus
on the "panic if a task is stuck with TIF_MEMDIE for too long" part. It
looks like an alternative to the approach I've chosen. It doesn't
consider the allocation restriction, so a locked up cpuset or NUMA
node(s) might panic the whole system, which doesn't sound like a good
idea, but that is easily fixable. Could you tear just this part out and
repost it so that we can compare the two approaches?
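
To be explicit about the fixup I have in mind: the panic should only
trigger for the global (unconstrained) case, e.g. something along these
lines, assuming the oom constraint is already known at that point, as it
is in check_panic_on_oom():

/*
 * A stuck cpuset/mempolicy/memcg OOM is bad but it is not a reason
 * to take the whole machine down.
 */
if (constraint == CONSTRAINT_NONE &&
    time_after(jiffies, victim->memdie_start + 5 * HZ))
	panic("global OOM: victim stuck with TIF_MEMDIE for too long");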

panic_on_oom=2 would still be weird because some nodes might stay in an
OOM condition without triggering the panic, but maybe that is acceptable.

Thanks!
--
Michal Hocko
SUSE Labs