Re: [PATCH v3] wait: prevent waiter starvation in__wait_on_bit_lock

From: Oleg Nesterov
Date: Wed Jan 21 2009 - 09:41:01 EST


On 01/20, Johannes Weiner wrote:
>
> > But, more importantly, I'm afraid we can also have the false negative,
> > this "if (!test_bit())" test lacks the barriers. This can't happen with
> > sync_page_killable() because it always calls schedule(). But let's
> > suppose we modify it to check signal_pending() first:
> >
> > static int sync_page_killable(void *word)
> > {
> > if (fatal_signal_pending(current))
> > return -EINTR;
> > return sync_page(word);
> > }
> >
> > It is still correct, but unless I missed something now __wait_on_bit_lock()
> > has problems again.
>
> Hm, this would require the lock bit to be set without someone else
> doing the wakeup. How could this happen?
>
> I could think of wake_up_page() happening BEFORE clear_bit_unlock()
> and we have to be on the front of the waitqueue. Then we are already
> running, the wake up is a nop, the !test_bit() is false and noone
> wakes up the next real contender.
>
> But the wake up side uses a smp barrier after clearing the bit, so if
> the bit is not cleared we can expect a wake up, no?

Yes we have the barriers on the "wakeup", but this doesn't mean the
woken task must see the result of clear_bit() (unless it was really
unscheduled of course).

> Or do we still need a read-side barrier before the test bit?

Even this can't help afaics.

Because the the whole clear_bit + wakeup sequence can happen after
the "if (!test_bit()) check and before finish_wait(). Please note
that from the waker's pov we are sleeping in TASK_KILLABLE state,
it will wake up us if we are at the front of the waitqueue.

(to clarify, I am talking about the imaginary sync_page_killable()
above).

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/