Re: [PATCH linux-4.1.y] sched/wait: Fix signal handling in bit wait helpers

From: Florian Fainelli
Date: Tue Apr 18 2017 - 13:23:20 EST


On 04/18/2017 06:46 AM, Oleg Nesterov wrote:
> Just curious, any particular reason you want to backport this patch?
>
> because nobody understands why it actually helped.

While working with a customer we found this patch. It solved the
original reporter's problem and matched the issue reported by Vladimir.
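
For anyone following along, here is a rough userspace sketch of the
ordering difference described in the commit message. It is not kernel
code: struct fake_task, fake_schedule() and the two wait_*() helpers are
made up for illustration. It models a single call only and says nothing
about why the change would affect RCU stalls.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; this is not the kernel's task_struct. */
struct fake_task {
	bool signal_pending;      /* a signal is already queued */
	bool signal_during_sleep; /* a signal will arrive while "asleep" */
};

/* Pretend to sleep; the signal, if any, arrives before we wake up. */
static void fake_schedule(struct fake_task *t)
{
	if (t->signal_during_sleep)
		t->signal_pending = true;
}

/* Ordering before the patch: check, then sleep. */
static int wait_check_then_sleep(struct fake_task *t)
{
	if (t->signal_pending)
		return 1;
	fake_schedule(t);
	return 0;
}

/* Ordering after the patch: sleep, then check. */
static int wait_sleep_then_check(struct fake_task *t)
{
	fake_schedule(t);
	if (t->signal_pending)
		return -EINTR;
	return 0;
}

/* One call of each variant with a signal that arrives during the sleep. */
static void run_examples(void)
{
	struct fake_task a = { .signal_pending = false, .signal_during_sleep = true };
	struct fake_task b = a;

	assert(wait_check_then_sleep(&a) == 0);      /* signal goes unreported */
	assert(wait_sleep_then_check(&b) == -EINTR); /* signal is reported */
}
```

Calling run_examples() from a test main() exercises both orderings; the
asserts pass when the check-first variant returns 0 and the sleep-first
variant returns -EINTR for the same input.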

>
> and please note that this patch is buggy, fixed by dfd01f0260
> "sched/wait: Fix the signal handling fix"

Ah, thanks, I missed that one!
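
A rough sketch of the kind of problem the follow-up appears to address,
assuming the issue is that an unconditional signal_pending(current)
ignores the mode the task slept in, so an uninterruptible wait can
return -EINTR. The FAKE_* flags and helper names are hypothetical and
the check is simplified (fatal signals are not modelled); please refer
to the actual commit for the real change.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Made-up mode flags for illustration; not the kernel's TASK_* values. */
#define FAKE_INTERRUPTIBLE   0x1
#define FAKE_UNINTERRUPTIBLE 0x2

/*
 * Simplified stand-in for a mode-aware signal check: only an
 * interruptible sleep is allowed to notice a pending signal.
 */
static bool fake_signal_pending_state(int mode, bool pending)
{
	return (mode & FAKE_INTERRUPTIBLE) && pending;
}

/* Shape of this backport: the sleep mode is not consulted at all. */
static int wait_ignoring_mode(bool pending)
{
	return pending ? -EINTR : 0;
}

/* Shape of a mode-aware check: the caller passes the mode it slept in. */
static int wait_with_mode(int mode, bool pending)
{
	return fake_signal_pending_state(mode, pending) ? -EINTR : 0;
}

/* A pending signal seen by an uninterruptible and an interruptible wait. */
static void run_mode_examples(void)
{
	assert(wait_ignoring_mode(true) == -EINTR);               /* even if uninterruptible */
	assert(wait_with_mode(FAKE_UNINTERRUPTIBLE, true) == 0);  /* signal ignored */
	assert(wait_with_mode(FAKE_INTERRUPTIBLE, true) == -EINTR);
}
```

The point of the sketch is only that the mode has to be handed to the
check by the caller; it does not reproduce the kernel's actual
signal_pending_state() logic.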

Since Peter's fix is on top of this change, how should we proceed with
the stable-4.1 kernel? Alex, do you still actively maintain linux-4.1.y?

>
> Oleg.
>
> On 04/12, Florian Fainelli wrote:
>>
>> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>
>> [ Upstream commit 68985633bccb6066bf1803e316fbc6c1f5b796d6 ]
>>
>> Vladimir reported getting RCU stall warnings and bisected it back to
>> commit:
>>
>> 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions")
>>
>> That commit inadvertently reversed the calls to schedule() and signal_pending(),
>> thereby not handling the case where the signal is received while we sleep.
>>
>> Reported-by: Vladimir Murzin <vladimir.murzin@xxxxxxx>
>> Tested-by: Vladimir Murzin <vladimir.murzin@xxxxxxx>
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> Cc: Mike Galbraith <efault@xxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Cc: mark.rutland@xxxxxxx
>> Cc: neilb@xxxxxxx
>> Cc: oleg@xxxxxxxxxx
>> Fixes: 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions")
>> Fixes: cbbce8220949 ("SCHED: add some "wait..on_bit...timeout()" interfaces.")
>> Link: http://lkml.kernel.org/r/20151201130404.GL3816@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>> ---
>> kernel/sched/wait.c | 16 ++++++++--------
>> 1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
>> index 852143a79f36..e0bb7e6c4fb0 100644
>> --- a/kernel/sched/wait.c
>> +++ b/kernel/sched/wait.c
>> @@ -583,18 +583,18 @@ EXPORT_SYMBOL(wake_up_atomic_t);
>>
>>  __sched int bit_wait(struct wait_bit_key *word)
>>  {
>> -	if (signal_pending_state(current->state, current))
>> -		return 1;
>>  	schedule();
>> +	if (signal_pending(current))
>> +		return -EINTR;
>>  	return 0;
>>  }
>>  EXPORT_SYMBOL(bit_wait);
>>
>>  __sched int bit_wait_io(struct wait_bit_key *word)
>>  {
>> -	if (signal_pending_state(current->state, current))
>> -		return 1;
>>  	io_schedule();
>> +	if (signal_pending(current))
>> +		return -EINTR;
>>  	return 0;
>>  }
>>  EXPORT_SYMBOL(bit_wait_io);
>> @@ -602,11 +602,11 @@ EXPORT_SYMBOL(bit_wait_io);
>>  __sched int bit_wait_timeout(struct wait_bit_key *word)
>>  {
>>  	unsigned long now = ACCESS_ONCE(jiffies);
>> -	if (signal_pending_state(current->state, current))
>> -		return 1;
>>  	if (time_after_eq(now, word->timeout))
>>  		return -EAGAIN;
>>  	schedule_timeout(word->timeout - now);
>> +	if (signal_pending(current))
>> +		return -EINTR;
>>  	return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(bit_wait_timeout);
>> @@ -614,11 +614,11 @@ EXPORT_SYMBOL_GPL(bit_wait_timeout);
>>  __sched int bit_wait_io_timeout(struct wait_bit_key *word)
>>  {
>>  	unsigned long now = ACCESS_ONCE(jiffies);
>> -	if (signal_pending_state(current->state, current))
>> -		return 1;
>>  	if (time_after_eq(now, word->timeout))
>>  		return -EAGAIN;
>>  	io_schedule_timeout(word->timeout - now);
>> +	if (signal_pending(current))
>> +		return -EINTR;
>>  	return 0;
>>  }
>>  EXPORT_SYMBOL_GPL(bit_wait_io_timeout);
>> --
>> 2.12.1
>>
>


--
Florian