Re: [BUG] cgroup/workqueues/fork: deadlock when moving cgroups

From: Tejun Heo
Date: Wed Apr 13 2016 - 14:33:17 EST


Hello, Petr.

(cc'ing Johannes)

On Wed, Apr 13, 2016 at 11:42:16AM +0200, Petr Mladek wrote:
...
> In other words, "memcg_move_char/2860" flushes a work item. But it cannot
> be flushed because one worker is blocked and another one cannot be
> created. All these operations are blocked by the very same
> "memcg_move_char/2860".
>
> Note that "systemd/1" is also waiting for "cgroup_mutex" in
> proc_cgroup_show(). But it seems that it is not in the main
> cycle causing the deadlock.
>
> I am able to reproduce this problem quite easily (within a few minutes).
> There are often even more tasks waiting for the cgroups-related locks
> but they are not causing the deadlock.
>
>
> The question is how to solve this problem. I see several possibilities:
>
> + avoid using workqueues in lru_add_drain_all()
>
> + make lru_add_drain_all() killable and restartable
>
> + do not block fork() when lru_add_drain_all() is running,
> e.g. using some lazy techniques like RCU, workqueues
>
> + at least do not block fork of workers; AFAIK, they have a limited
> cgroups usage anyway because they are marked with PF_NO_SETAFFINITY
>
>
> I am willing to test any potential fix or even work on the fix.
> But I do not have that big insight into the problem, so I would
> need some pointers.

An easy solution would be to make lru_add_drain_all() use a
WQ_MEM_RECLAIM workqueue, whose dedicated rescuer can execute the
work items even when no new worker can be forked. A better way would
be to make charge moving asynchronous, similar to cpuset node
migration, but I don't know whether that's realistic. I'll prep a
patch to add a rescuer to lru_add_drain_all().
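
A rough sketch of the WQ_MEM_RECLAIM approach follows. It is untested
and is not the actual patch. It is a kernel-side fragment, not a
standalone program. The names lru_drain_wq, lru_drain_work,
lru_drain_lock, lru_drain_wq_init() and lru_drain_all_sketch() are
made up for illustration, and the real lru_add_drain_all() only queues
work on CPUs that actually have pages pending, which is left out here.

```c
/* Sketch only: simplified stand-in for lru_add_drain_all(). */
#include <linux/cpu.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/mutex.h>
#include <linux/percpu.h>
#include <linux/swap.h>
#include <linux/workqueue.h>

/* Hypothetical names, not taken from the tree. */
static struct workqueue_struct *lru_drain_wq;
static DEFINE_PER_CPU(struct work_struct, lru_drain_work);
static DEFINE_MUTEX(lru_drain_lock);

static void lru_drain_per_cpu(struct work_struct *unused)
{
	lru_add_drain();	/* drain the local CPU's pending pages */
}

static int __init lru_drain_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM gives the workqueue a rescuer thread that is
	 * created here, up front, so no fork is needed later when the
	 * regular worker pool cannot make progress.
	 */
	lru_drain_wq = alloc_workqueue("lru-add-drain", WQ_MEM_RECLAIM, 0);
	return lru_drain_wq ? 0 : -ENOMEM;
}
early_initcall(lru_drain_wq_init);

static void lru_drain_all_sketch(void)
{
	int cpu;

	mutex_lock(&lru_drain_lock);	/* one drain-all at a time */
	get_online_cpus();

	for_each_online_cpu(cpu) {
		struct work_struct *work = &per_cpu(lru_drain_work, cpu);

		INIT_WORK(work, lru_drain_per_cpu);
		/* Previously schedule_work_on(), i.e. system_wq, no rescuer. */
		queue_work_on(cpu, lru_drain_wq, work);
	}

	for_each_online_cpu(cpu)
		flush_work(&per_cpu(lru_drain_work, cpu));

	put_online_cpus();
	mutex_unlock(&lru_drain_lock);
}
```

The only change that matters for the deadlock is the target of the
queueing call: the flush then waits on a workqueue that is guaranteed
an execution context, instead of one that depends on forking a new
worker while fork is blocked.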

Thanks.

--
tejun