Re: [PATCH cgroup/for-4.5-fixes] cpuset: make mm migration asynchronous

From: Tejun Heo
Date: Fri Jan 22 2016 - 10:23:19 EST


On Tue, Jan 19, 2016 at 12:18:41PM -0500, Tejun Heo wrote:
> If "cpuset.memory_migrate" is set, when a process is moved from one
> cpuset to another with a different memory node mask, pages in use by
> the process are migrated to the new set of nodes. This was performed
> synchronously in the ->attach() callback, which is synchronized
> against process management. Recently, the synchronization was changed
> from per-process rwsem to global percpu rwsem for simplicity and
> optimization.
>
> Combined with the synchronous mm migration, this led to deadlocks
> because mm migration could schedule a work item, which may in turn try
> to create a new worker, which then blocks on the process management
> lock held by the cgroup process migration path.
>
> This heavy an operation shouldn't be performed synchronously from that
> deep inside cgroup migration in the first place. This patch punts the
> actual migration to an ordered workqueue and updates cgroup process
> migration and cpuset config update paths to flush the workqueue after
> all locks are released. This way, the operations still seem
> synchronous to userland without entangling mm migration with process
> management synchronization. CPU hotplug can also invoke mm migration
> but has no reason to wait for mm migrations, and thus doesn't
> synchronize against their completion.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Reported-and-tested-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v4.4+
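
For anyone following along, the pattern described above (queue the heavy
work while holding the lock, flush only after the lock is dropped) can be
sketched in user space roughly as follows. This is an illustrative analogy
in Python, not the kernel code: proc_lock, migration_wq, migrate_mm() and
attach() are hypothetical names, and a single worker thread draining a FIFO
queue only approximates an ordered workqueue.

```python
import queue
import threading

# Hypothetical stand-in for the global process management lock.
proc_lock = threading.Lock()

# One worker thread draining a FIFO queue approximates an ordered workqueue.
migration_wq = queue.Queue()

# Record of completed migrations, in completion order.
log = []


def worker():
    while True:
        job = migration_wq.get()
        try:
            job()
        finally:
            migration_wq.task_done()


threading.Thread(target=worker, daemon=True).start()


def migrate_mm(pid, nodes):
    # Assumption for the sketch: the migration work needs the process
    # management lock at some point (in the reported deadlock this
    # happened indirectly, through worker creation).
    with proc_lock:
        log.append((pid, nodes))


def attach(pid, nodes):
    with proc_lock:
        # Only queue the work here. Calling migrate_mm() directly would
        # block forever on proc_lock, which this thread already holds.
        migration_wq.put(lambda: migrate_mm(pid, nodes))
    # All locks are released: flush so the caller still observes a
    # completed migration when attach() returns.
    migration_wq.join()


attach(1234, (0, 1))
attach(5678, (2,))
```

Because the flush happens after the lock is released, the worker can take
the lock and finish, and each attach() call still looks synchronous to its
caller.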

Applied to cgroup/for-4.5-fixes.

Thanks.

--
tejun