Re: [PATCH] kthread: Fix race condition between kthread_parkme() and kthread_unpark()

From: Peter Zijlstra
Date: Wed Sep 27 2017 - 08:17:18 EST


On Fri, Apr 28, 2017 at 07:32:36PM -0700, Junaid Shahid wrote:
> In general, if kthread_unpark() and kthread_parkme() execute together,
> the kthread is supposed to be in an unparked state. This is because
> kthread_unpark() either wakes up the thread if it already got parked,
> or it cancels a prior kthread_park() call and hence renders the
> kthread_parkme() moot.
>
> However, if kthread_unpark() happens to execute between the time that
> kthread_parkme() checks the KTHREAD_SHOULD_STOP flag and sets the
> KTHREAD_IS_PARKED flag, then kthread_unpark() will not wake up the
> thread and it will remain in a parked state.
>
> This is fixed by making the checking of KTHREAD_SHOULD_STOP and the
> setting of KTHREAD_IS_PARKED atomic via a cmpxchg inside kthread_parkme.
>
> Signed-off-by: Junaid Shahid <junaids@xxxxxxxxxx>
> ---
> kernel/kthread.c | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 26db528c1d88..651f03baf62f 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -169,12 +169,20 @@ void *kthread_probe_data(struct task_struct *task)
>
> static void __kthread_parkme(struct kthread *self)
> {
> + ulong flags;

That should be: "unsigned long".

> +
> __set_current_state(TASK_PARKED);
> - while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
> - if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
> - complete(&self->parked);
> - schedule();
> + flags = self->flags;

And at the very least, that should be:

flags = READ_ONCE(self->flags);

> +
> + while (test_bit(KTHREAD_SHOULD_PARK, &flags)) {

Otherwise there's very little that stops the compiler from doing a
reload here.

> + if (cmpxchg(&self->flags, flags,
> + flags | (1 << KTHREAD_IS_PARKED)) == flags) {
> + if (!test_bit(KTHREAD_IS_PARKED, &flags))
> + complete(&self->parked);
> + schedule();
> + }
> __set_current_state(TASK_PARKED);
> + flags = self->flags;
> }

That said, this is really quite horrible.

The alternative is of course changing kthread_unpark() to use a cmpxchg
to clear SHOULD_PARK and then check IS_PARKED on its return value, but
that's equally horrible.

How about something like the below? I still think its overly fragile,
but barring more invasive changes the below more or less does as you
suggest.

---

kernel/kthread.c | 40 ++++++++++++++++++++++++++++++++++------
1 file changed, 34 insertions(+), 6 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 1c19edf82427..f437f53f6fbc 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -169,14 +169,38 @@ void *kthread_probe_data(struct task_struct *task)

static void __kthread_parkme(struct kthread *self)
{
- __set_current_state(TASK_PARKED);
- while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
- if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
- complete(&self->parked);
- schedule();
+ unsigned long flags;
+
+ for (;;) {
__set_current_state(TASK_PARKED);
+
+ /*
+ * CPU0 CPU1
+ *
+ * [S] ->state = PARKED [S] ->flags &= ~SHOULD_PARK
+ * [L] ->flags [RmW] ->flags &= ~IS_PARKED
+ * MB MB
+ * [RmW] ->flags |= IS_PARKED wakeup()
+ *
+ * Thus, either CPU0 sets IS_PARKED and CPU1's wakeup must observe PARKED
+ * or we observe !SHOULD_PARK and terminate.
+ */
+
+ flags = READ_ONCE(self->flags);
+ if (!test_bit(KTHREAD_SHOULD_PARK, &flags))
+ break;
+
+ /*
+ * Ensure SHOULD_PARK remains set while we set IS_PARKED.
+ * Thereby we ensure kthread_unpark() cannot clear SHOULD_PARK
+ * without us noticing.
+ */
+
+ if (try_cmpxchg(&self->flags, &flags, flags | (1UL << KTHREAD_IS_PARKED))) {
+ complete(&self->parked);
+ schedule();
+ }
}
- clear_bit(KTHREAD_IS_PARKED, &self->flags);
__set_current_state(TASK_RUNNING);
}

@@ -454,6 +478,10 @@ void kthread_unpark(struct task_struct *k)
/*
* Newly created kthread was parked when the CPU was offline.
* The binding was lost and we need to set it again.
+ *
+ * Since we've observed IS_PARKED, the other thread will (have)
+ * called schedule() and thus __kthread_bind()'s
+ * wait_task_inactive() will succeed.
*/
if (test_bit(KTHREAD_IS_PER_CPU, &kthread->flags))
__kthread_bind(k, kthread->cpu, TASK_PARKED);