Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup
From: Oleg Nesterov
Date: Wed Jun 06 2018 - 09:51:23 EST
On 06/05, Peter Zijlstra wrote:
>
> Also, I think we still need TASK_PARKED as a special state for that.
I think it would be nice to kill the TASK_PARKED state altogether. But I don't
know how. I'll try to look at this code later, but I am not sure I will find a
way to clean it up...
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -177,12 +177,24 @@ void *kthread_probe_data(struct task_struct *task)
> static void __kthread_parkme(struct kthread *self)
> {
> for (;;) {
> - set_current_state(TASK_PARKED);
> + /*
> + * TASK_PARKED is a special state; we must serialize against
> + * possible pending wakeups to avoid store-store collisions on
> + * task->state.
> + *
> + * Such a collision might possibly result in the task state
> + * changing from TASK_PARKED and us failing the
> + * wait_task_inactive() in kthread_park().
> + */
> + set_special_state(TASK_PARKED);
Agreed.
> if (!test_bit(KTHREAD_SHOULD_PARK, &self->flags))
> break;
> +
> + complete_all(&self->parked);
> schedule();
> }
> __set_current_state(TASK_RUNNING);
> + reinit_completion(&self->parked);
But how can we know that all the callers of kthread_park() have already returned
from wait_for_completion() ?
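To make the question concrete, here is a minimal single-threaded user-space model of the completion counter. It is not kernel code: every model_* name is hypothetical, and it assumes that complete_all() saturates ->done while reinit_completion() resets it to zero. The "late" caller is modelled as a wait attempt issued only after the reinit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct completion. */
struct model_completion {
	unsigned int done;
};

/* Assumed behaviour of complete_all(): saturate the counter. */
static void model_complete_all(struct model_completion *c)
{
	c->done = UINT_MAX;
}

/* Assumed behaviour of reinit_completion(): reset the counter. */
static void model_reinit(struct model_completion *c)
{
	c->done = 0;
}

/*
 * Returns true if a waiter arriving now would return immediately,
 * false if it would have to sleep.
 */
static bool model_try_wait(struct model_completion *c)
{
	if (!c->done)
		return false;
	if (c->done != UINT_MAX)
		c->done--;
	return true;
}

/* Returns true if a waiter that checks only after the reinit gets stuck. */
static bool model_late_waiter_blocks(void)
{
	struct model_completion parked = { 0 };

	model_complete_all(&parked);      /* what __kthread_parkme() does */
	assert(model_try_wait(&parked));  /* an early waiter gets through */
	model_reinit(&parked);            /* done after the thread is unparked */
	return !model_try_wait(&parked);  /* a late waiter sees done == 0 */
}
```

In this model the late waiter finds the counter already reset, which is the case the reinit_completion() in the patch would have to rule out.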
Oh. The very fact that __kthread_parkme() does complete_all() proves that we need
some serious cleanups. In particular, I think that kthread_park() on a parked kthread
must not be possible.
Just look at this code. It looks as if __kthread_parkme() can race with _unpark()
and thus we need this wait-event-like loop.
But if it can race with _unpark() then kthread_park() can block forever.
For a start, can't we change kthread_park() like this:
- set_bit(KTHREAD_SHOULD_PARK, &kthread->flags);
+ if (test_and_set_bit(...))
+ return -EAGAIN;
and s/complete_all/complete/ in __kthread_parkme() ?
IIUC, this will only affect smpboot_update_cpumask_percpu_thread() which can hit
an already parked thread, but it doesn't need to wait.
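A rough user-space sketch of the suggested kthread_park() behaviour, using C11 atomics in place of the kernel bitops. The model_* names and the MODEL_SHOULD_PARK flag are hypothetical stand-ins, and the wakeup and the wait for the completion are deliberately left out; the point is only that a second parker gets -EAGAIN instead of waiting.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the KTHREAD_SHOULD_PARK bit. */
#define MODEL_SHOULD_PARK (1u << 0)

/* Hypothetical stand-in for struct kthread; only the flags matter here. */
struct model_kthread {
	atomic_uint flags;
};

/* test_and_set_bit() analogue: returns nonzero if the bit was already set. */
static int model_test_and_set(atomic_uint *flags, unsigned int bit)
{
	return (atomic_fetch_or(flags, bit) & bit) != 0;
}

/* Returns 0 for the first parker, -EAGAIN if a park is already requested. */
static int model_kthread_park(struct model_kthread *k)
{
	assert(k);
	if (model_test_and_set(&k->flags, MODEL_SHOULD_PARK))
		return -EAGAIN;
	/* The real function would wake the thread and wait for it here. */
	return 0;
}

/* Clears the park request so that a later park can succeed again. */
static void model_kthread_unpark(struct model_kthread *k)
{
	assert(k);
	atomic_fetch_and(&k->flags, ~MODEL_SHOULD_PARK);
}
```

With at most one successful parker per park/unpark cycle there is at most one waiter on ->parked, which is why a plain complete() would be enough in __kthread_parkme().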
And it seems that smpboot_update_cpumask_percpu_thread() in turn needs some cleanups.
Hmm, and so does its single user, kernel/watchdog.c.
And speaking of watchdog.c, can't we simply kill the "watchdog/%u" threads? This is
off-topic, but can't watchdog_timer_fn() use stop_one_cpu_nowait(watchdog) ?
And I really think we should unexport kthread_park/unpark(), only smpboot_thread_fn()
should use them. kthread() should not play with __kthread_parkme(). And even
KTHREAD_SHOULD_PARK must die, I mean it should live in struct smp_hotplug_thread,
not in struct kthread.
OK, this is off-topic too.
In short, I think this patch is fine, but I haven't read it carefully yet; I will try tomorrow.
And, let me repeat, can't we avoid complete_all() ?
Oleg.