Re: [PATCH v3 1/1] cgroup: fix deadlock caused by cgroup_mutex and cpu_hotplug_lock

From: Chen Ridong
Date: Wed Sep 11 2024 - 21:33:39 EST




On 2024/9/11 5:17, Tejun Heo wrote:
On Tue, Sep 10, 2024 at 09:02:59PM +0000, Roman Gushchin wrote:
...
By that reasoning any holder of cgroup_mutex on system_wq makes the system
susceptible to a deadlock (in the presence of cpu_hotplug_lock waiting
writers + cpuset operations). And the two work items must meet in the same
worker's processing, hence the probability is low (zero?) with fewer than
WQ_DFL_ACTIVE items.

Right, I'm on the same page. Should we document then somewhere that
the cgroup mutex can't be locked from a system wq context?

I think this will also make the Fixes tag more meaningful.

I think that's completely fine. What's not fine is saturating system_wq.
Anything which creates a large number of concurrent work items should be
using its own workqueue. If anything, workqueue needs to add a warning for
saturation conditions that reports who the offenders are.

Thanks.


I will add a patch to document that.
Should we modify WQ_DFL_ACTIVE (currently 256)? Maybe 1024 is acceptable?
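
To make the question about the limit concrete, here is a toy userspace model in Python, not kernel code. It assumes a single FIFO queue in which every blocked item keeps its active slot until a releasing item queued behind it has run. The function name `stalls` and that queueing model are illustrative assumptions; only the value 256 comes from this thread.

```python
from collections import deque

WQ_DFL_ACTIVE = 256  # value quoted in this thread


def stalls(max_active, n_blocked):
    """Return True if the releasing item never gets an active slot.

    Toy model (assumption, not the kernel's scheduling logic):
    n_blocked items are queued first, and each one keeps its slot
    until the releasing item, queued last, has run.
    """
    pending = deque(["blocked"] * n_blocked + ["release"])
    active = 0
    while pending and active < max_active:
        item = pending.popleft()
        if item == "release":
            return False  # the releasing item ran, so everything can drain
        active += 1  # a blocked item occupies a slot and waits
    return True  # every slot is held by a waiter; "release" is still pending


if __name__ == "__main__":
    for limit in (WQ_DFL_ACTIVE, 1024):
        for n in (limit - 1, limit):
            print(f"max_active={limit} blocked={n} stalls={stalls(limit, n)}")
```

Under these assumptions, 255 blocked items still let the releasing item run with a limit of 256, while 256 do not. A limit of 1024 moves the threshold to 1024 blocked items rather than removing it.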

Best regards,
Ridong