Re: [PATCH] kernel/futex.c: notice the return value after rt_mutex_finish_proxy_lock()fails

From: Chen Gang
Date: Thu Sep 12 2013 - 21:29:13 EST


On 09/13/2013 06:37 AM, Darren Hart wrote:
> On Thu, 2013-09-12 at 16:32 +0200, Thomas Gleixner wrote:
>> On Tue, 20 Aug 2013, Chen Gang wrote:
>>
>>> rt_mutex_finish_proxy_lock() can return failure code (e.g. -EINTR,
>>> -ETIMEDOUT).
>>>
>>> Original implementation has already noticed about it, but not check it
>>> before next work.
>>>
>>> Also let coments within 80 columns to pass "./scripts/checkpatch.pl".
>>>
>>>
>>> Signed-off-by: Chen Gang <gang.chen@xxxxxxxxxxx>
>>> ---
>>> kernel/futex.c | 30 ++++++++++++++++--------------
>>> 1 files changed, 16 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/kernel/futex.c b/kernel/futex.c
>>> index c3a1a55..1a94e7d 100644
>>> --- a/kernel/futex.c
>>> +++ b/kernel/futex.c
>>> @@ -2373,21 +2373,23 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
>>> ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);
>>> debug_rt_mutex_free_waiter(&rt_waiter);
>>>
>>> - spin_lock(q.lock_ptr);
>>> - /*
>>> - * Fixup the pi_state owner and possibly acquire the lock if we
>>> - * haven't already.
>>> - */
>>> - res = fixup_owner(uaddr2, &q, !ret);
>>> - /*
>>> - * If fixup_owner() returned an error, proprogate that. If it
>>> - * acquired the lock, clear -ETIMEDOUT or -EINTR.
>>> - */
>>> - if (res)
>>> - ret = (res < 0) ? res : 0;
>>> + if (!ret) {
>>
>> Again. This is completely wrong!
>>
>> We MUST call fixup_owner even if finish_proxy_lock() returned with an
>> error code. Simply because finish_proxy_lock() is called outside of
>> the spin_lock(q.lock_ptr) region and another thread might have
>> modified the futex state. So we need to handle the corner cases
>> otherwise we might leave the futex in some undefined state.
>>
>> You're reintroducing a hard to decode bug, which got analyzed and
>> fixed in futex_lock_pi() years ago. See the history for the
>> explanation.
>>
>> Sigh.
>>
>> tglx
>
> Chen, perhaps you can let us know what the failure scenario is that you
> are trying to address with this patch. I only replied the once as I
> pointed out the corner-case and expected you to follow up with that.
> This region of code is very fragile to modifications as it has become
> more corner-cases than core logic in some places :-)
>

Oh, thanks, it is my fault:

the 'ret' which return from rt_mutex_finish_proxy_lock(), is used by
the next fixup_owner().

Hmm... excuse me, my English is not quite well, it seems you already
know about it, but not say straightly and directly?

next, when find/feel something wrong, can say directly, I can/should
understand it (and I need/should thank you, too), that will be more
efficient (can save both of us time resources).

:-)

> For starters, I'm not following your second sentence in the commit log.
> Can you elaborate on the following?
>
> "Original implementation has already noticed about it, but not check it
> before next work."
>
> Do you have a test-case that demonstrates a failure mode?
>

No, I just 'found' it, and give a simply 'fix' to let related experts
check (and now, we know it is just a spam).

Hmm... for 'test', it is really an 'important thing' to me (not 'urgent
thing'), I have plan to start to use LTP (Linux Test Project) in q4 of
2013 (start at 2013-10-01).


Thanks.
--
Chen Gang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/