Re: [RFC/PATCH 0/4] CPUSET driven CPU isolation

From: David Rientjes
Date: Thu Feb 28 2008 - 20:12:09 EST


On Thu, 28 Feb 2008, Paul Jackson wrote:

> > Move the watchdog/0 thread to a cpuset that doesn't have access to cpu 0.
>
> I still don't understand ... you must have some context in mind that
> I've spaced out ... I can't even tell if that is a statement or a
> question.
>

You said that you weren't aware of any problems that could arise that are
fixed with this added check in set_cpus_allowed(), so I asked that you
setup a cpuset that doesn't have access to cpu 0 and move the watchdog/0
thread to it.

> Backing up, hopefully unwrapping, you seemed to allow moving bound tasks
> only to cpusets with the same cpus (how come you didn't check for the
> same memory nodes too?). If you really needed to move bound tasks at all,
> then that seemed like an unnecessarily tight constraint. It wouldn't hurt
> the bound task to move to another cpuset that still allowed the CPUs it was
> bound to.
>
> ... but after an another iteration of that subthread ... I'm wondering
> why you have to move bound tasks at all. How about PF_THREAD_BIND just
> meaning (1) "can't be moved to any other cpuset", and (2) "always
> placed in the top cpuset," so we don't have to worry about being unable
> to move threads out of child cpusets.
>
> Do you have any situation in which pinned threads have to be moved?
>
> I don't. Can we just prohibit it?
>

The fix is more general than just the cpusets case, even though it is
probably the only user in the kernel of set_cpus_allowed() that allows
bound kthreads to be moved.

The idea is to expressly deny kthreads that have called kthread_bind()
from changing their cpus_allowed via set_cpus_allowed(). Cpusets should
handle that gracefully, but my fix is more general: it adds the check to
set_cpus_allowed() instead of in the cpusets code.

This is a little difficult with the current way that cpusets calls
set_cpus_allowed() since its attach function is void and does not return
error or success to the cgroup interface, yet my patch prevents the soft
lockups and kernel crashes that can occur when cpusets changes the
cpus_allowed of watchdog or migration threads that is obviously in error.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/