Re: [PATCH v2] cgroup: Wait for dying tasks to leave on rmdir
From: Tejun Heo
Date: Tue Mar 24 2026 - 16:18:11 EST
Hello,
On Tue, Mar 24, 2026 at 10:04:02AM +0100, Sebastian Andrzej Siewior wrote:
...
> As mentioned in the other email, if I
> - irq_work_queue(this_cpu_ptr(&cgrp_dead_tasks_iwork));
> + schedule_delayed_work(this_cpu_ptr(&cgrp_delayed_tasks_iwork), 1 * HZ);
>
> then I hang at boot because it rmdir()s a cgroup with a task in Z. It
> might suggest a race because systemd might have missed a task.
> But this fixes the other issue so.
Just ran 100 boot tests w/ the 1s delay added as above but the problem didn't
reproduce. Can't reproduce with cgroup create / populate / depopulate /
rmdir stress tests either. I did hit the 1s delay propagating through but
that wasn't a deadlock. The code is not great. It'd be better to just keep
css_set_lock held while iterating too.
I'll apply this for now. Can you please try to reproduce the problem with
the patches applied? How reliably does it reproduce? How is it stuck? Are
tasks waiting on the waitq indefinitely with populated stuck at 1?
Thanks.
--
tejun