Re: futex(2) man page update help request

From: Michael Kerrisk (man-pages)
Date: Fri Jan 16 2015 - 10:17:23 EST


Hello Thomas,

On 01/15/2015 11:23 PM, Thomas Gleixner wrote:
> On Thu, 15 Jan 2015, Michael Kerrisk (man-pages) wrote:
>>> [EINVAL] uaddr equals uaddr2. Requeue to same futex.
>>
>> ??? I added this, but does this error not occur only for PI requeues?
>
> It's equally wrong for normal futexes. And it's actually the same code
> that checks for this in all variants.

I don't understand "equally wrong" in your reply, I'm sorry. Do you
mean:

a) This error text should be there for both normal and PI requeues
OR
b) This error text should be there for neither normal nor PI requeues

>>> [EDEADLOCK] The futex is already locked by the caller or the kernel
>>> detected a deadlock scenario in a nested lock chain
>>
>> Added.
>
> It's actually EDEADLK

Yes, sorry -- I should have said that I already found and fixed
that problem.

>>> [EOWNERDIED] The owner of the futex died and the kernel made the
>>> caller the new owner. The kernel sets the FUTEX_OWNER_DIED bit in the
>>> futex userspace value. Caller is responsible for cleanup
>>
>> There is no such thing as an EOWNERDIED error. I had a look
>> through the kernel source for the FUTEX_OWNER_DIED cases and didn't
>> see an obvious error associated with them. Can you clarify? (I think
>> the point is that this condition, which is described in
>> Documentation/robust-futexes.txt, is not an error as such. However, I'm
>> not yet sure of how to describe it in the man page.)
>> I will add this point as a FIXME in the new draft man page.
>
> Oops. My bad. That's not what the kernel does. The kernel merely
> marks it in the futex itself with FUTEX_OWNER_DIED. User space needs
> to deal with that and the POSIX users return EOWNERDEAD (not
> EOWNERDIED), so it's not part of the futex call itself.
>
> We had discussions about returning EOWNERDEAD in that case, but then
> glibc with its sophisticated error handling prevented that ....

Okay. I'll add a FIXME to the draft page, to see if we get some good
text together to describe FUTEX_OWNER_DIED and how it is used.
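
A minimal sketch of the user-space side of this, for illustration only.
It assumes the FUTEX_OWNER_DIED and FUTEX_TID_MASK definitions from
<linux/futex.h>; the helper names owner_died_status() and
futex_owner_tid() are hypothetical and not taken from glibc or the
kernel. The point is that the condition is read from the futex word,
not from the return value of futex():

```c
#include <errno.h>
#include <stdint.h>
#include <linux/futex.h>

/*
 * Hypothetical helper: inspect a snapshot of the futex word after the
 * lock has been acquired. The kernel does not return an error for a
 * dead owner; it only sets FUTEX_OWNER_DIED in the word, so the
 * locking library has to translate that bit into EOWNERDEAD itself.
 */
static int owner_died_status(uint32_t futex_word)
{
    if (futex_word & FUTEX_OWNER_DIED)
        return EOWNERDEAD;  /* caller must clean up the protected state */
    return 0;
}

/* Hypothetical helper: extract the owner TID from the futex word. */
static uint32_t futex_owner_tid(uint32_t futex_word)
{
    return futex_word & FUTEX_TID_MASK;
}
```

A robust-mutex implementation would call something like
owner_died_status() on the value it observed when it took the lock and
pass a nonzero result on to its own caller.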

>>> FUTEX_TRYLOCK_PI
>>>
>>> This operation tries to acquire the futex at uaddr. It deals with the
>>> situation where the TID value at uaddr is 0, but the FUTEX_WAITERS
>>> bit is set. User space cannot handle this case in a race-free manner.
>>
>> Added.
>>
>>> The arguments uaddr2, val, timeout and val3 are ignored.
>>
>> ??? But the code reads:
>>
>>     case FUTEX_TRYLOCK_PI:
>>         return futex_lock_pi(uaddr, flags, 0, timeout, 1);
>>
>> which momentarily misleads one into thinking that 'timeout' is used.
>> And: it's not quite ignored, since in futex_lock_pi() a non-NULL
>> 'timeout' is unconditionally dereferenced (meaning you could get
>> an EFAULT error for a bad 'timeout' pointer).
>> I'm confused....
>
> Indeed. That's just wrong.
>
>> Maybe the above code should be
>>
>>     case FUTEX_TRYLOCK_PI:
>>         return futex_lock_pi(uaddr, flags, 0, NULL, 1);
>> ?
>
> Care to send a patch?

Will do.
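
Until the kernel side is settled, a caller can avoid the stray EFAULT
by always passing a NULL timeout. A minimal sketch, assuming the raw
syscall(2) interface (glibc provides no futex() wrapper); the wrapper
name futex_trylock_pi() is hypothetical:

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around FUTEX_TRYLOCK_PI. Only uaddr carries
 * meaning for this operation; the timeout argument is passed as NULL
 * so that a garbage pointer can never be dereferenced by the kernel.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static long futex_trylock_pi(uint32_t *uaddr)
{
    return syscall(SYS_futex, uaddr, FUTEX_TRYLOCK_PI,
                   0 /* val */, NULL /* timeout */,
                   NULL /* uaddr2 */, 0 /* val3 */);
}
```

On success the kernel stores the caller's TID in the futex word, so
the word must later be released with FUTEX_UNLOCK_PI or an atomic
compare-and-swap back to 0.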

[...]

>> ??? I don't believe this can happen. 'val3' is internally set to
>> FUTEX_BITSET_MATCH_ANY. Can you confirm?
>
> Right. We don't support that bitset stuff in requeue_pi ATM.

Thanks for the confirmation.

Cheers,

Michael



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/