Re: [PATCH 2/2] kthread, cgroup: close race window where new kthreads can be migrated to non-root cgroups

From: Oleg Nesterov
Date: Thu Mar 16 2017 - 11:41:59 EST


On 03/16, Oleg Nesterov wrote:
>
> On 03/15, Tejun Heo wrote:
> >
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -2425,11 +2425,13 @@ ssize_t __cgroup_procs_write(struct kern
> > tsk = tsk->group_leader;
> >
> > /*
> > - * Workqueue threads may acquire PF_NO_SETAFFINITY and become
> > - * trapped in a cpuset, or RT worker may be born in a cgroup
> > - * with no rt_runtime allocated. Just say no.
> > + * kthreads may acquire PF_NO_SETAFFINITY during initialization.
> > + * If userland migrates such kthread to a non-root cgroup, it can
> > + * become trapped in a cpuset, or RT kthread may be born in a
> > + * cgroup with no rt_runtime allocated. Just say no.
> > */
> > - if (tsk == kthreadd_task || (tsk->flags & PF_NO_SETAFFINITY)) {
> > + if (tsk == kthreadd_task || (tsk->flags & PF_NO_SETAFFINITY) ||
> > + ((tsk->flags & PF_KTHREAD) && !kthread_initialized(tsk))) {
> > ret = -EINVAL;
>
> ...
>
> > +bool kthread_initialized(struct task_struct *k)
> > +{
> > + struct kthread *kthread = to_kthread(k);
> > +
> > + return kthread && test_bit(KTHREAD_INITIALIZED, &kthread->flags);
> > +}
>
> Not sure I understand...
>
> With this patch you can no longer migrate a kernel thread created by
> kernel_thread()? Note that to_kthread() is NULL unless it was created
> by kthread_create().

Either way, I am wondering if we can do something really trivial like
the patch below. This way we can also remove the "tsk == kthreadd_task"
check, and we do not need the barriers.

Oleg.

--- x/kernel/kthread.c
+++ x/kernel/kthread.c
@@ -226,6 +226,7 @@
 	ret = -EINTR;
 	if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
 		__kthread_parkme(self);
+		current->flags &= ~PF_IDONTLIKECGROUPS;
 		ret = threadfn(data);
 	}
 	do_exit(ret);
@@ -537,7 +538,7 @@
 	set_cpus_allowed_ptr(tsk, cpu_all_mask);
 	set_mems_allowed(node_states[N_MEMORY]);

-	current->flags |= PF_NOFREEZE;
+	current->flags |= (PF_NOFREEZE | PF_IDONTLIKECGROUPS);

 	for (;;) {
 		set_current_state(TASK_INTERRUPTIBLE);
--- x/kernel/cgroup/cgroup.c
+++ x/kernel/cgroup/cgroup.c
@@ -2429,7 +2429,7 @@
 	 * trapped in a cpuset, or RT worker may be born in a cgroup
 	 * with no rt_runtime allocated. Just say no.
 	 */
-	if (tsk == kthreadd_task || (tsk->flags & PF_NO_SETAFFINITY)) {
+	if (tsk->flags & (PF_NO_SETAFFINITY | PF_IDONTLIKECGROUPS)) {
 		ret = -EINVAL;
 		goto out_unlock_rcu;
 	}
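
For illustration only, the flag lifecycle the patch above relies on can be modeled in userspace roughly as follows. This is a sketch, not kernel code: struct task, the model_* helpers and both bit values are made up for the example, and it assumes a forked child starts with a copy of its parent's flags, which is what lets kthreadd's mark reach every new kthread. PF_IDONTLIKECGROUPS is just the placeholder name used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Bit values are arbitrary for this model, not the kernel's. */
#define PF_NO_SETAFFINITY	0x1u
#define PF_IDONTLIKECGROUPS	0x2u	/* placeholder name from the patch */

/* Hypothetical stand-in for task_struct; only the flags matter here. */
struct task {
	unsigned int flags;
};

/* kthreadd marks itself once at startup. */
void model_kthreadd_init(struct task *kthreadd)
{
	kthreadd->flags |= PF_IDONTLIKECGROUPS;
}

/* Assumption: a child begins life with its parent's flags. */
struct task model_fork(const struct task *parent)
{
	struct task child = { .flags = parent->flags };

	return child;
}

/* kthread() drops the mark right before it would call threadfn(). */
void model_kthread_ready(struct task *k)
{
	k->flags &= ~PF_IDONTLIKECGROUPS;
}

/* The single check in the cgroup migration path. */
int model_can_migrate(const struct task *tsk)
{
	if (tsk->flags & (PF_NO_SETAFFINITY | PF_IDONTLIKECGROUPS))
		return -EINVAL;
	return 0;
}

/* Walk one kthread through its life and check each step. */
void model_selfcheck(void)
{
	struct task kthreadd = { 0 };
	struct task plain, bound;

	model_kthreadd_init(&kthreadd);
	/* kthreadd itself is refused without a special case. */
	assert(model_can_migrate(&kthreadd) == -EINVAL);

	/* A fresh kthread is refused while it is still initializing. */
	plain = model_fork(&kthreadd);
	assert(model_can_migrate(&plain) == -EINVAL);

	/* An ordinary kthread becomes migratable once it is ready. */
	model_kthread_ready(&plain);
	assert(model_can_migrate(&plain) == 0);

	/* One that took PF_NO_SETAFFINITY during init stays refused. */
	bound = model_fork(&kthreadd);
	bound.flags |= PF_NO_SETAFFINITY;
	model_kthread_ready(&bound);
	assert(model_can_migrate(&bound) == -EINVAL);
}
```

In this model there is no window between "created" and "initialized" for userland to hit, since the child is never observable without the mark, and kthreadd is covered by the same mask test.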