Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups
From: Michal Hocko
Date: Wed Apr 13 2016 - 15:48:32 EST

On Wed 13-04-16 15:37:34, Tejun Heo wrote:
> Hello, Michal.
>
> On Wed, Apr 13, 2016 at 09:23:14PM +0200, Michal Hocko wrote:
> > I think we can live without lru_add_drain_all() in the migration path.
> > We are talking about 4 pagevecs so 56 pages. The charge migration is
>
> Ah, nice.
>
> > racy anyway. What concerns me more is how all this is fragile. It sounds
> > just too easy to add a dependency on per-cpu sync work later and
> > reintroduce this issue which is quite hard to detect.
> > Cannot we come up with something more robust? Or at least warn when we
> > try to use per-cpu workers with problematic locks held?
>
> Yeah, workqueue does limited lockdep annotation but it doesn't
> integrate fully with the rest of dependency tracking. It'd be nice to
> have that. Don't know how to tho.

I was thinking about something like flush_per_cpu_work() which would
assert that cgroup_threadgroup_rwsem is not held for write. But this still sounds
suboptimal to me. Do we really have to do
cgroup_threadgroup_change_begin for kworker threads? They are
PF_NO_SETAFFINITY so they cannot be moved AFAIR. Or am I missing
something?
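
To make the flush_per_cpu_work() idea a bit more concrete, here is a
rough user-space model of the check. It is only a sketch and not kernel
code: apart from cgroup_threadgroup_rwsem and flush_per_cpu_work(), every
name below is hypothetical, the "lock" is just a per-thread flag, and the
real thing would presumably hook into lockdep rather than a bool.

```c
/*
 * User-space model only, not kernel code. Names prefixed with model_ are
 * hypothetical and exist purely for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

/* Models "the current task holds cgroup_threadgroup_rwsem for write". */
static _Thread_local bool threadgroup_rwsem_write_held;

static void model_threadgroup_write_lock(void)
{
	threadgroup_rwsem_write_held = true;
}

static void model_threadgroup_write_unlock(void)
{
	threadgroup_rwsem_write_held = false;
}

/* Stand-in for a per-cpu sync operation such as lru_add_drain_all(). */
static int model_flush_calls;

static void model_drain(void)
{
	model_flush_calls++;
}

/*
 * Hypothetical wrapper: complain and refuse to wait for per-cpu work
 * while the write lock is held. Returns true if the flush was run.
 */
static bool flush_per_cpu_work(void (*flush)(void))
{
	if (threadgroup_rwsem_write_held) {
		fprintf(stderr,
			"WARN: per-cpu flush with cgroup_threadgroup_rwsem held for write\n");
		return false;
	}
	flush();
	return true;
}
```

A caller in the migration path would then trip the warning instead of
silently depending on a per-cpu worker, which is the case that is hard to
detect today.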
--
Michal Hocko
SUSE Labs