Re: [patch] cpusets: allow PF_THREAD_BOUND kworkers to escape froma cpuset

From: David Rientjes
Date: Fri Sep 23 2011 - 09:27:48 EST


On Fri, 23 Sep 2011, Mike Galbraith wrote:

> cpusets: allow PF_THREAD_BOUND kworkers to escape from a cpuset
>
> kworkers can be born in a cpuset, leaving them adrift on an unsinkable ship.
> Allow them to be moved to the root cpuset so the cpuset can be destroyed
>
> Signed-off-by: Mike Galbraith <efault@xxxxxx>
> Acked-by: Li Zefan <lizf@xxxxxxxxxxxxxx>

Did Li ack this version?

>
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 10131fd..3769f9e 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1384,7 +1384,7 @@ static int cpuset_can_attach(struct cgroup_subsys *ss, struct cgroup *cont,
> * set_cpus_allowed_ptr() on all attached tasks before cpus_allowed may
> * be changed.
> */
> - if (tsk->flags & PF_THREAD_BOUND)
> + if (tsk->flags & PF_THREAD_BOUND && cont != cont->top_cgroup)
> return -EINVAL;
>
> return 0;
> @@ -1426,9 +1426,14 @@ static void cpuset_attach_task(struct cgroup *cont, struct task_struct *tsk)
> /*
> * can_attach beforehand should guarantee that this doesn't fail.
> * TODO: have a better way to handle failure here
> + *
> + * Special case: bound kthreads born in a cpuset may be moved to
> + * the top level cpuset without attempting to diddle their mask.
> */
> - err = set_cpus_allowed_ptr(tsk, cpus_attach);
> - WARN_ON_ONCE(err);
> + if (!(tsk->flags & PF_THREAD_BOUND && cont == cont->top_cgroup)) {
> + err = set_cpus_allowed_ptr(tsk, cpus_attach);
> + WARN_ON_ONCE(err);
> + }
>
> cpuset_change_task_nodemask(tsk, &cpuset_attach_nodemask_to);
> cpuset_update_task_spread_flag(cs, tsk);

This doesn't make any sense, the user can now change the cpumask of the
kworker with cpusets but not with sched_setaffinity().

PF_THREAD_BOUND is set specifically so threads cannot move from the cpu
that they are bound to, that's why the cpuset code and sched_setaffinity()
reject such a configuration. So the problem isn't in the cpuset code or
scheduler at all, you would need to deal with this in the kworker code by
either not setting PF_THREAD_BOUND (which, according to the comment, Tejun
thinks is pretty important) or manage the worker threads in a way such
that the new cpumask (all cpus, since it's in the root cpuset) actually
make sense for that kworker. The cpuset code won't care if the kworker
has a cpumask of all online cpus.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/