Re: [PATCH] Futex Asynchronous Interface

From: Peter Wächtler (pwaechtler@loewe-komp.de)
Date: Wed Jun 12 2002 - 12:07:44 EST


Linus Torvalds wrote:
>
>
> On Wed, 12 Jun 2002, Peter Wächtler wrote:
>
>>For the uncontended case: their is no blocked process...
>>
>
> Wrong.
>
> The process that holds the lock can die _before_ it gets contended.
>
> When another thread comes in, it now is contended, but the kernel doesn't
> know about anything.
>
>
>>One (or more) process is blocked in a waitqueue in the kernel - waiting
>>for a futex to be released.
>>
>>The lock holder crashes - say with SIGSEGV.
>>
>
> The lock holder may have crashed long before the waiting process even
> started waiting.
>
> Besides, the kernel only knows about those processes that see contention.
> Even if the contention happened _before_ the lock holder crashed, the
> kernel doesn't know about the lock holder itself - it only knows about the
> process that caused the contention. The kernel will get to know about the
> lock holder only when it tris to resolve the contention, and since that
> one has crashed, that will never happen.
>
>
>>I know that the kernel can't do anything about the aborted critical section.
>>But the waiters should be "freed" - and now we can discuss if we kill them
>>or report an error and let them deal with that.
>>
>
> The waiters should absolutely _not_ be freed. There's nothign they can do
> about it. The data inside the critical region is no longer valid, and
>
>
>>Can't be done? I don't think that this would add a performance hit
>>since it's only done on exit (and especially "abnormal" exit).
>>
>
> But the point is not that it would be a performance hit on "exit()", but
> that WE DON'T TRACK THE LOCKS in the kernel in the first place.
>
> Right now the kernel does _zero_ work for a lock that isn't contended. It
> doesn't know _anything_ about the process that got the lock initially.
>
> Any amount of tracking would be _extremely_ expensive. Right now getting
> an uncontended lock is about 15 CPU cycles in user space.
>
> Tryin to tell the kernel about gettign that lock takes about 1us on a P4
> (system call overhead), ie we're talking 18000 cycles. 18 THOUSAND cycles
> minimum. Compared to the current 15 cycles. That's more than three orders
> of magnitude slower than the current code, and you just lost the whole
> point of doing this all in user space in the first place.
>

Thanks for this patient explanation. I see the problem now clearly.

To Frank: I will read the (already downloaded) paper ;-)

And to all: Did you notice the "nutex" approach
http://marc.theaimsgroup.com/?l=linux-kernel&m=102373047428621&w=2

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jun 15 2002 - 22:00:25 EST