On Mon, Nov 15, 2004 at 01:22:18PM +0000, Jamie Lokier wrote:
1. A lost wakeup.
wait_A is woken, but wait_B is not, even though the second
pthread_cond_signal is "observably" after wait_B.
The operation order is observable in the sense that wait_B could
update the data structure which is protected by cond+mutex, and
wake_Y could read that update prior to deciding to signal.
_Logically_, what happens is that wait_A is woken by wake_X, but
it does not immediately re-acquire the mutex. In this time
window, wait_B and wake_Y both run, and then wait_A acquires the
mutex. During this window, wait_A is able to absorb the second
signal.
It's not clear to me if POSIX requires wait_B to be signalled or
not in this case.
2. Future lost wakeups.
Future calls to pthread_cond_signal(cond) fail to wake wait_B,
even much later, because cond's NPTL data structure is
inconsistent. Its invariant is broken.
This is a bug in NPTL and it's easy to fix. Never increment wake
unconditionally. Instead, increment it conditionally when (a)
FUTEX_WAKE returns 1, and also (b) when FUTEX_WAIT returns -EAGAIN.
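
To make the "observably after" ordering from item 1 concrete, here is a
minimal sketch. It does not reproduce the race itself. All names
(struct shared, waiter, signal_after_seeing, run_two_waiters) are
hypothetical and are not taken from NPTL. Each waiter records its
arrival under the mutex, and the signaller reads that record under the
same mutex before it signals, so every signal is ordered after both
waits.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical shared state, all of it protected by 'mutex'. */
struct shared {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int registered;  /* waiters that announced themselves */
    int tokens;      /* wakeups issued but not yet consumed */
    int finished;    /* waiters that completed their wait */
};

static void *waiter(void *arg)
{
    struct shared *s = arg;

    pthread_mutex_lock(&s->mutex);
    s->registered++;                 /* the update a signaller can read */
    while (s->tokens == 0)           /* predicate loop: tolerates spurious wakeups */
        pthread_cond_wait(&s->cond, &s->mutex);
    s->tokens--;
    s->finished++;
    pthread_mutex_unlock(&s->mutex);
    return NULL;
}

/* Signal only after 'wanted' waiters are seen to have registered. */
static void signal_after_seeing(struct shared *s, int wanted)
{
    pthread_mutex_lock(&s->mutex);
    while (s->registered < wanted) { /* crude polling, enough for a sketch */
        pthread_mutex_unlock(&s->mutex);
        sched_yield();
        pthread_mutex_lock(&s->mutex);
    }
    /* A waiter holds the mutex from registered++ until pthread_cond_wait
     * releases it, so at this point 'wanted' waiters have entered the wait. */
    s->tokens++;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->mutex);
}

/* Returns the number of waiters that were woken and finished. */
static int run_two_waiters(void)
{
    struct shared s;
    pthread_t a, b;
    int rc;

    rc = pthread_mutex_init(&s.mutex, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&s.cond, NULL);
    assert(rc == 0);
    s.registered = s.tokens = s.finished = 0;

    rc = pthread_create(&a, NULL, waiter, &s);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, waiter, &s);
    assert(rc == 0);

    signal_after_seeing(&s, 2);      /* plays the role of wake_X */
    signal_after_seeing(&s, 2);      /* plays the role of wake_Y */

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.mutex);
    return s.finished;
}
```

If the second signal were absorbed by the waiter already woken by the
first, as described in item 1, the other waiter would stay asleep with
a token still available and the joins would not return. The predicate
loop guards against spurious wakeups only; it cannot compensate for a
signal that never reaches a sleeping waiter.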
If you think it is fixable in userland, please write at least the
pseudocode that you believe should work. We have spent quite a lot of
time on that code and don't believe this is solvable in userland.
E.g. the futex IMHO must be incremented before FUTEX_WAKE, as otherwise
the woken tasks wouldn't see the effect.
I believe the only place this is solvable is the kernel, by ensuring
atomicity (i.e. making the "queue the task iff curval == expected_val"
operation atomic wrt. futex_wake/futex_requeue in other tasks).
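
For reference, the compare-and-block behaviour and the return values
discussed above can be seen from userland with the raw futex system
call. This is only a sketch of the existing FUTEX_WAIT/FUTEX_WAKE
interface, not of any proposed fix. The wrapper names (futex_op,
futex_wait_if_equal, futex_wake_count) are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Raw system call; the wrappers below use only FUTEX_WAIT and FUTEX_WAKE. */
static long futex_op(uint32_t *uaddr, int op, uint32_t val)
{
    assert(uaddr != NULL);
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/*
 * Block only if *uaddr still equals 'expected'. The kernel does the
 * comparison and the queuing itself. Returns 0 after a wakeup, or a
 * negative errno value: -EAGAIN means the value no longer matched and
 * the caller was never queued. With a NULL timeout, a matching value
 * blocks until another task issues FUTEX_WAKE on the same address.
 */
static int futex_wait_if_equal(uint32_t *uaddr, uint32_t expected)
{
    if (futex_op(uaddr, FUTEX_WAIT, expected) == -1)
        return -errno;
    return 0;
}

/* Wake up to 'max_waiters' tasks; returns how many were actually woken. */
static int futex_wake_count(uint32_t *uaddr, int max_waiters)
{
    long n = futex_op(uaddr, FUTEX_WAKE, (uint32_t)max_waiters);

    return n == -1 ? -errno : (int)n;
}
```

With a futex word holding 1, futex_wait_if_equal(&word, 0) returns
-EAGAIN immediately, and futex_wake_count(&word, 1) returns 0 when
nobody is queued on that word.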