Re: Aw: Re: [RFC PATCH urcu on mips, parisc] Fix: compat_futex should work-around futex signal-restart kernel bug

From: Helge Deller
Date: Sun Dec 20 2015 - 10:38:08 EST


On 20.12.2015 15:11, Mathieu Desnoyers wrote:
> ----- On Dec 19, 2015, at 5:37 AM, Helge Deller deller@xxxxxx wrote:
>
>> Hi Mathieu,
>>
>> On 18.12.2015 21:42, Helge Deller wrote:
>>> On 18.12.2015 20:58, Mathieu Desnoyers wrote:
>>>>>>> When testing liburcu on a 3.18 Linux kernel, 2-core MIPS (cpu model :
>>>>>>> Ingenic JZRISC V4.15 FPU V0.0), we notice that a blocked sys_futex
>>>>>>> FUTEX_WAIT returns -1, errno=ENOSYS when interrupted by a SA_RESTART
>>>>>>> signal handler. This spurious ENOSYS behavior causes hangs in liburcu
>>>>>>> 0.9.x. Running a MIPS 3.18 kernel under a QEMU emulator exhibits the
>>>>>>> same behavior. This might affect earlier kernels.
>>>>>>>
>>>>>>> This issue appears to be fixed in 3.18.y stable kernels and 3.19, but
>>>>>>> nevertheless, we should try to handle this kernel bug more gracefully
>>>>>>> than a user-space hang due to unexpected spurious ENOSYS return value.
>>>>>>
>>>>>> It's actually fixed in 3.19, but not in 3.18.y stable kernels. The
>>>>>> Linux kernel upstream fix commit is:
>>>>>> e967ef02 "MIPS: Fix restart of indirect syscalls"
>>
>>>> Looks like parisc has an issue very similar to the one that
>>>> has been fixed on MIPS by e967ef02 "MIPS: Fix restart of indirect syscalls".
>>
>> Yes, parisc is affected the same way.
>> I've posted a patch to the parisc mailing list which fixes this issue for
>> parisc and which I plan to push into stable kernels:
>> http://thread.gmane.org/gmane.linux.ports.parisc/26243
>>
>> Regarding your patch for liburcu:
>>
>>>>>>> Therefore, fall back on the "async-safe" version of compat_futex in those
>>>>>>> situations where FUTEX_WAIT returns ENOSYS. This async-safe fallback has
>>>>>>> the nice property of being OK to use concurrently with other FUTEX_WAKE
>>>>>>> and FUTEX_WAIT futex() calls, because it's simply a busy-wait scheme.
>>
>> I've tested your patch. It does not produce any regressions on parisc, but I
>> can't say for sure if it really works. ENOSYS is returned randomly, so maybe I
>> didn't face a situation where your patch was actually used.
>
> If you ran make check and make regtest, and nothing
> fails/hangs, you should be OK.

Yes, I did run both.

> liburcu runs very heavy
> stress-tests which makes it likely to hit race conditions
> repeatedly.
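
For readers following the thread, here is a minimal sketch of the fallback
idea quoted above: retry FUTEX_WAIT on EINTR, and switch to a polling wait
when the kernel spuriously returns ENOSYS. It is an illustration under
assumptions, not the actual liburcu patch. The names futex_wait_compat() and
wait_poll_fallback() and the 10 ms polling interval are hypothetical, and the
code is Linux-only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/*
 * Hypothetical polling fallback: sleep in short intervals until *uaddr no
 * longer holds val. It does not use the futex syscall at all, so it is
 * unaffected by the signal-restart bug.
 */
static int wait_poll_fallback(int32_t *uaddr, int32_t val)
{
	const struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	while (__atomic_load_n(uaddr, __ATOMIC_SEQ_CST) == val)
		nanosleep(&delay, NULL);
	return 0;
}

/*
 * Hypothetical wrapper: block while *uaddr == val.
 * Returns 0 when the caller should re-check the futex word, -1 on an
 * unexpected error (errno is left as set by the syscall).
 */
static int futex_wait_compat(int32_t *uaddr, int32_t val)
{
	for (;;) {
		long ret = syscall(SYS_futex, uaddr, FUTEX_WAIT, val, NULL, NULL, 0);

		if (ret == 0)
			return 0;	/* woken up (possibly spuriously) */
		switch (errno) {
		case EAGAIN:
			return 0;	/* *uaddr already differs from val */
		case EINTR:
			continue;	/* interrupted by a signal: retry */
		case ENOSYS:
			/* Spurious ENOSYS (or no futex support): poll instead. */
			return wait_poll_fallback(uaddr, val);
		default:
			return -1;
		}
	}
}
```

A caller would normally invoke futex_wait_compat() in a loop that re-checks
the futex word, since a return value of 0 only means "look again".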

Helge
