Re: [RESEND][v2][PATCH] Fix a race between try_to_wake_up() and a woken up task

From: Balbir Singh
Date: Mon Sep 05 2016 - 04:37:16 EST


On Mon, Sep 5, 2016 at 5:48 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, Sep 05, 2016 at 05:14:19PM +1000, Benjamin Herrenschmidt wrote:
>> On Mon, 2016-09-05 at 13:16 +1000, Balbir Singh wrote:
>>
>> .../...
>> >
>> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> > Cc: Nicholas Piggin <npiggin@xxxxxxxxx>
>>
>> Acked-by: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
>>
>> > Signed-off-by: Balbir Singh <bsingharora@xxxxxxxxx>
>> > ---
>> > kernel/sched/core.c | 11 +++++++++++
>> > 1 file changed, 11 insertions(+)
>> >
>> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> > index 2a906f2..582c684 100644
>> > --- a/kernel/sched/core.c
>> > +++ b/kernel/sched/core.c
>> > @@ -2016,6 +2016,17 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
>> > success = 1; /* we're going to change ->state */
>> > cpu = task_cpu(p);
>> >
>> > + /*
>> > + * Ensure we see p->on_rq and p->state consistently.
>> > + *
>> > + * For example in __rwsem_down_write_failed(), we have
>> > + * [S] ->on_rq = 1                       [L] ->state
>> > + *     MB                                    RMB
>> > + * [S] ->state = TASK_UNINTERRUPTIBLE    [L] ->on_rq
>> > + * In the absence of the RMB, p->on_rq can be observed to be 0
>> > + * and we end up spinning indefinitely in while (p->on_cpu)
>> > + */
>
> So I did replace that comment with the one I proposed earlier. I checked
> a fair number of architectures and many did not have an obvious barrier
> in switch_to(). So that is not something we can rely on, nor do we need
> to I think.
>

Thanks for editing the comment, and for letting us know that many
architectures have no obvious barrier in switch_to().
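
For anyone following the thread, here is a minimal userspace sketch of the
store/load pairing described in the patch comment. It is not kernel code:
struct demo_task, the DEMO_* constants and the demo_* functions are
hypothetical names, and C11 fences stand in for the kernel's MB and RMB
(an acquire fence is at least as strong as a read barrier for this purpose).
It models only the [S]/[L] pairing, not the rest of try_to_wake_up().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the two task_struct fields in the comment. */
enum { DEMO_RUNNING = 0, DEMO_UNINTERRUPTIBLE = 2 };

struct demo_task {
	atomic_int on_rq;
	atomic_int state;
};

/* Sleeper side: [S] ->on_rq = 1; MB; [S] ->state = TASK_UNINTERRUPTIBLE */
void demo_sleeper_publish(struct demo_task *p)
{
	atomic_store_explicit(&p->on_rq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* stands in for MB */
	atomic_store_explicit(&p->state, DEMO_UNINTERRUPTIBLE,
			      memory_order_relaxed);
}

/*
 * Waker side: [L] ->state; RMB; [L] ->on_rq
 * Returns -1 if the task is not seen as sleeping, otherwise the observed
 * on_rq value. Without the fence the on_rq load could be satisfied before
 * the state load, so a waker could see the new state with a stale on_rq.
 */
int demo_waker_observe(struct demo_task *p)
{
	int state = atomic_load_explicit(&p->state, memory_order_relaxed);

	if (state != DEMO_UNINTERRUPTIBLE)
		return -1;
	atomic_thread_fence(memory_order_acquire);	/* stands in for RMB */
	return atomic_load_explicit(&p->on_rq, memory_order_relaxed);
}

/* Single-threaded walk through both sides; returns 0 when the checks pass. */
int demo_single_threaded_check(void)
{
	struct demo_task t;

	atomic_init(&t.on_rq, 0);
	atomic_init(&t.state, DEMO_RUNNING);
	assert(demo_waker_observe(&t) == -1);	/* nothing published yet */
	demo_sleeper_publish(&t);
	assert(demo_waker_observe(&t) == 1);	/* state seen, so on_rq seen */
	return 0;
}
```

In real use the two sides run on different CPUs; the single-threaded check
only shows the call sequence, while the fences are what guarantee that a
waker observing the new state also observes on_rq == 1.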

Balbir Singh