Re: bug in memcg oom-killer results in a hung syscall in another process in the same cgroup

From: Oleg Nesterov
Date: Mon Jul 18 2016 - 09:53:00 EST


On 07/15, Shayan Pooya wrote:
>
> >> --- x/kernel/sched/core.c
> >> +++ x/kernel/sched/core.c
> >> @@ -2793,8 +2793,11 @@ asmlinkage __visible void schedule_tail(struct task_struct *prev)
> >>  	balance_callback(rq);
> >>  	preempt_enable();
> >>
> >> -	if (current->set_child_tid)
> >> +	if (current->set_child_tid) {
> >> +		mem_cgroup_oom_enable();
> >>  		put_user(task_pid_vnr(current), current->set_child_tid);
> >> +		mem_cgroup_oom_disable();
> >> +	}
> >>  }
> >>
> >>  /*
>
> I tried this patch and I still see the same stuck processes (assuming
> that's what you were curious about).

Of course. Because I am stupid. Firstly, I forgot to include another
change in fault.c. And now I see that change was wrong anyway.

I'll try to make another debugging patch later today, but let me repeat
that it won't fix the real problem anyway.

Thanks, and sorry for wasting your time.

Oleg.