Re: Please backport commit 3812c8c8f39 to stable

From: Michal Hocko
Date: Tue Oct 07 2014 - 08:23:41 EST


On Fri 03-10-14 11:03:30, Cong Wang wrote:
> On Fri, Oct 3, 2014 at 8:13 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> >
> > That commit fixes an OOM deadlock. Not a soft lockup. Do you have the
> > OOM killer report from the log? This would tell us that the killed task
> > was indeed sleeping on the lock which is held by the charger which
> > triggered the OOM. I am a little bit surprised that I do not see any OOM
> > related functions on the stacks (maybe the code is inlined...).
>
>
> Oh, did you see __mem_cgroup_try_charge() calls
> schedule_timeout_uninterruptible() in stack trace? Yes, they are inlined
> and I don't see any other possibilities for calling it.

Yes, the only place we call schedule_timeout_uninterruptible() from is
mem_cgroup_handle_oom(). And it happens only for a task which hasn't been
killed by the OOM killer.

> > It would be better to know what exactly is going on before backporting
> > this change because it is quite large.
> >
>
> I thought the stack trace I showed is obvious. :) I am very happy
> to investigate if you see any other path calling
> schedule_timeout_uninterruptible()
> in __mem_cgroup_try_charge().

I was expecting an OOM report which kills a task which is sleeping on a
lock which is held on the way up to the charge function. Your report
mentioned a task waiting for i_mutex for too long. It is true that the
charging path is holding an i_mutex as well, so it might be the same
situation handled by the said patch. But it is not 100% clear this is
the case without an OOM report which would point to the waiting task.
The memcg might be thrashing on the hard limit and reclaim might take a
long time.
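
To make the difference between the two situations concrete, here is a toy
userspace model in Python. It is only a sketch, not kernel code: the task
names and the find_wait_cycle() helper are made up for illustration, and
the "after" case assumes the charger no longer waits for the OOM victim
while still holding i_mutex. A real deadlock shows up as a cycle in the
"who waits for whom" graph; a memcg that is merely slow in reclaim has no
such cycle.

```python
# Toy model of the wait-for relation between tasks. Illustrative only:
# "charger" and "victim" are hypothetical labels, not kernel identifiers.

def find_wait_cycle(waits_for, start):
    """Follow 'task -> task it waits on' edges beginning at start.

    Returns the list of tasks forming a cycle, or None if the chain ends.
    """
    path = []
    seen = {}  # task -> index in path where it first appeared
    task = start
    while task is not None:
        if task in seen:
            return path[seen[task]:]
        seen[task] = len(path)
        path.append(task)
        task = waits_for.get(task)
    return None

# Deadlock case: the charger sleeps in the OOM path with i_mutex held and
# waits for the victim to exit; the victim sleeps on that same i_mutex.
before = {"charger": "victim", "victim": "charger"}

# Assumed behaviour without the trap: the charger has dropped i_mutex
# before waiting, so the victim waits on nobody and can exit.
after = {"charger": "victim"}

cycle_before = find_wait_cycle(before, "charger")
cycle_after = find_wait_cycle(after, "charger")
print(cycle_before)
print(cycle_after)
```

An OOM report naming the killed task is what would tell us whether the
first graph actually describes your system, i.e. whether the victim is
the task stuck on i_mutex.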

--
Michal Hocko
SUSE Labs