Re: Change in functionality of futex() system call.
From: David Oliver
Date: Tue Jun 07 2011 - 16:04:35 EST
On Tue, Jun 7, 2011 at 2:53 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
> On Tue, Jun 7, 2011 at 3:33 PM, David Oliver <david@xxxxxxxxxxxxxxx> wrote:
>> On Tue, Jun 7, 2011 at 2:19 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>>> On Tue, Jun 7, 2011 at 3:10 PM, David Oliver <david@xxxxxxxxxxxxxxx> wrote:
>>>> On Tue, Jun 7, 2011 at 1:43 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>>>>> On Tue, Jun 7, 2011 at 11:58 AM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
>>>>>> On Tuesday, 07 June 2011 at 10:44 -0400, Andy Lutomirski wrote:
>>>>>>> On 06/06/2011 11:13 PM, Darren Hart wrote:
>>>>>>> >
>>>>>>> >
>>>>>>> > On 06/06/2011 11:11 AM, Eric Dumazet wrote:
>>>>>>> >> On Monday, 06 June 2011 at 10:53 -0700, Darren Hart wrote:
>>>>>>> >>>
>>>>>>> >>
>>>>>>> >>> If I understand the problem correctly, RO private mapping really doesn't
>>>>>>> >>> make any sense and we should probably explicitly not support it, while
>>>>>>> >>> special casing the RO shared mapping in support of David's scenario.
>>>>>>> >>>
>>>>>>> >>
>>>>>>> >> We supported them in 2.6.18 kernels, apparently. This might sound
>>>>>>> >> stupid, but who knows?
>>>>>>> >
>>>>>>> >
>>>>>>> > I guess this is actually the key point we need to agree on to provide a
>>>>>>> > solution. This particular case "worked" in 2.6.18 kernels, but that
>>>>>>> > doesn't necessarily mean it was supported, or even intentional.
>>>>>>> >
>>>>>>> > It sounds to me that we agree that we should support RO shared mappings.
>>>>>>> > The question remains about whether we should introduce deliberate
>>>>>>> > support of RO private mappings, and if so, if the forced COW approach is
>>>>>>> > appropriate or not.
>>>>>>> >
>>>>>>>
>>>>>>> I disagree.
>>>>>>>
>>>>>>> FUTEX_WAIT has side-effects. Specifically, it eats one wakeup sent by
>>>>>>> FUTEX_WAKE. So if something uses futexes on a file mapping, then a
>>>>>>> process with only read access could (if the semantics were changed) DoS
>>>>>>> the other processes by spawning a bunch of threads and FUTEX_WAITing
>>>>>>> from each of them.
>>>>>>>
>>>>>>> If there were a FUTEX_WAIT_NOCONSUME that did not consume a wakeup and
>>>>>>> worked on RO mappings, I would drop my objection.
>>>>>>
>>>>>> If a group of cooperating processes uses a memory segment to exchange
>>>>>> critical information, do you really think this memory segment will be
>>>>>> readable by other unrelated processes on the machine ?
>>>>>
>>>>> Depends on the design.
>>>>>
>>>>> I have some software I'm working on that uses shared files and could
>>>>> easily use futexes.
>>>>>
>>>> I have software which currently uses shared files for a one way
>>>> transfer of information, which is modeled precisely by the futex (as
>>>> contrasted to the mutex) model. In this case, the number of receivers
>>>> is undetermined, so the number of wakeups is set to maxint.
>>>>
>>>> The receivers are minimally trusted: they have read access to the
>>>> files, so they cannot accidentally affect other processes' use of the
>>>> data. Requiring my files to be writeable by all clients would require
>>>> a serious increase in the amount of software needing to be trusted.
>>>
>>> What's wrong with adding a FUTEX_WAIT_NOCONSUME flag then? Your
>>> program can use it to get exactly the semantics it wants and my
>>> program can use it or not depending on which semantics it wants.
>>>
>> 1. I would prefer not to require my programs to check for kernel
>> version (code named "working", "regressed", and "altered") to decide
>> which parameters need to be sent to the futex call.
>
> You don't have to check for kernel version. Just try
> FUTEX_WAIT_NOCONSUME first and retry with FUTEX_WAIT if it returns
> -EINVAL.
>
... and punt if that gives me an EFAULT. Possible but clumsy.
Fortunately, I'm not writing code for general consumption.
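For concreteness, a minimal sketch of that try-then-fall-back sequence.
FUTEX_WAIT_NOCONSUME is only a proposal in this thread, so the opcode
below is a made-up placeholder and wait_with_fallback() is a hypothetical
helper name. The sketch also treats ENOSYS like EINVAL, on the assumption
that a kernel may report an unknown futex operation either way.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* HYPOTHETICAL: no kernel defines this operation. The number is a
 * placeholder chosen only so that the sketch compiles. */
#define FUTEX_WAIT_NOCONSUME_HYPOTHETICAL 64

/* Thin wrapper: glibc does not export a futex() function. */
static long futex_op(uint32_t *uaddr, int op, uint32_t val,
                     const struct timespec *timeout)
{
    return syscall(SYS_futex, uaddr, op, val, timeout, NULL, 0);
}

/* Try the proposed op first, then plain FUTEX_WAIT.
 * Returns 0 on wakeup, -1 with errno set otherwise. */
static int wait_with_fallback(uint32_t *uaddr, uint32_t expected,
                              const struct timespec *timeout)
{
    long rc = futex_op(uaddr, FUTEX_WAIT_NOCONSUME_HYPOTHETICAL,
                       expected, timeout);

    if (rc == -1 && (errno == EINVAL || errno == ENOSYS)) {
        /* The kernel does not know the new op: retry the classic way. */
        rc = futex_op(uaddr, FUTEX_WAIT, expected, timeout);
    }

    /* errno == EFAULT at this point is the "punt" case: the kernel
     * refuses to wait on this (for example read-only) mapping. */
    return rc == -1 ? -1 : 0;
}
```

The caller still has to decide what "punt" means on EFAULT, for example
polling the word instead of sleeping on it.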
> I think you've already lost on regressed kernels regardless :-/
>
>> 2. Doing FUTEX_WAIT_NOCONSUME would change the semantics of
>> futex_wake() between the "working" and "altered" kernels, as it would
>> no longer return the number of processes woken.
>
> True, but that change couldn't affect old code because old code
> wouldn't use FUTEX_WAIT_NOCONSUME.
>
So, how would I find out the number of processes awakened by
futex_wake()? I only care about it for statistical purposes.
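For reference, this is roughly how I get the count today: FUTEX_WAKE
returns the number of waiters it woke. A sketch only; wake_all() and the
stats struct are illustrative names, not code from my application.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <limits.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Illustrative bookkeeping for the "statistical purposes" above. */
struct wake_stats {
    unsigned long calls;   /* number of successful FUTEX_WAKE calls */
    unsigned long woken;   /* total waiters reported woken */
};

/* Wake every waiter on uaddr (the receiver count is unknown, so ask
 * for INT_MAX). Returns the number of waiters woken, or -1 on error. */
static long wake_all(uint32_t *uaddr, struct wake_stats *stats)
{
    long n = syscall(SYS_futex, uaddr, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);

    if (n >= 0 && stats != NULL) {
        stats->calls++;
        stats->woken += (unsigned long)n;
    }
    return n;
}
```

With a non-consuming wait, that return value would stop covering those
waiters, which is the information I would be losing.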
>>
>> It seems that FUTEX_WAIT_NOCONSUME would be rather like a
>> non-consuming read on a pipe.
>
> More like a nonconsuming read on an eventfd, which sounds very useful.
> (Actually, I'm porting code from Windows to Linux right now that
> wants that feature...)
>
> The reason I bring this up now is that I've been annoyed that
> FUTEX_WAIT can be used on an R/O mapping to interfere with futexes in
> that mapping. Under the original semantics this would have been
> pretty much impossible to fix, but the regression has been there for
> long enough that we have the option right now to fix it better instead
> of restoring the original behavior.
>
I'm not a kernel developer, but to me the change seems very recent: it is
about when I started finding my code failing with EFAULTs.
From my perspective, that's a real case of my futexes being interfered with :).
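The receiver side of my setup boils down to something like the sketch
below: map the file read-only and shared, then FUTEX_WAIT on a word in it.
The function names and the helper that creates a test file are
illustrative assumptions, not my production code. Which errno comes back
depends on the kernel: EFAULT is what I see on the kernels that refuse
read-only mappings.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/futex.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Map `path` read-only and shared, then FUTEX_WAIT on its first 32-bit
 * word, sleeping only if that word still equals `expected`.
 * Returns 0 if woken, otherwise the errno of the failing call. */
static int ro_shared_wait(const char *path, uint32_t expected)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    void *p = mmap(NULL, sizeof(uint32_t), PROT_READ, MAP_SHARED, fd, 0);
    int saved = errno;
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return saved;

    long rc = syscall(SYS_futex, (uint32_t *)p, FUTEX_WAIT, expected,
                      NULL, NULL, 0);
    saved = (rc == -1) ? errno : 0;
    munmap(p, sizeof(uint32_t));
    return saved;
}

/* Illustrative helper: create a temp file holding a single 32-bit word.
 * `tmpl` must end in "XXXXXX", as mkstemp() requires. */
static int make_word_file(char *tmpl, uint32_t value)
{
    int fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, &value, sizeof value);
    close(fd);
    return n == (ssize_t)sizeof value ? 0 : -1;
}
```

Calling ro_shared_wait() with an `expected` value that differs from the
word in the file makes the call return at once rather than sleep, which
is a cheap way to probe how a given kernel treats the mapping.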
>
> --Andy
>
--
Cheers!
David.