Re: wait_on_page_bit_common(TASK_KILLABLE, EXCLUSIVE) can miss wakeup?

From: Oleg Nesterov
Date: Tue Jun 30 2020 - 02:17:28 EST


On 06/30, Nicholas Piggin wrote:
> Excerpts from Oleg Nesterov's message of June 30, 2020 12:02 am:
> > On 06/29, Nicholas Piggin wrote:
> >>
> >> prepare_to_wait_event() has a pretty good pattern (and comment), I would
> >> favour using that (test the signal when inserting on the waitqueue).
> >>
> >> @@ -1133,6 +1133,15 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
> >>  	for (;;) {
> >>  		spin_lock_irq(&q->lock);
> >>
> >> +		if (signal_pending_state(state, current)) {
> >> +			/* Must not lose an exclusive wake up, see
> >> +			 * prepare_to_wait_event comment */
> >> +			list_del_init(&wait->entry);
> >> +			spin_unlock_irq(&q->lock);
> >> +			ret = -EINTR;
> >
> > Basically this is what my patch in the 1st email does. But note that we can't
> > just set "ret = -EINTR" here, we will need to clear "ret" if test_and_set_bit()
> > below succeeds. That is why I used another "int intr" variable.
>
> You snipped off one more important line of context. No such games are
> required AFAIKS.

 	for (;;) {
 		spin_lock_irq(&q->lock);

+		if (signal_pending_state(state, current)) {
+			/* Must not lose an exclusive wake up, see
+			 * prepare_to_wait_event comment */
+			list_del_init(&wait->entry);
+			spin_unlock_irq(&q->lock);
+			ret = -EINTR;
+			break;
+		}


So with the "break" included, wait_on_page_bit_common() just returns -EINTR
whenever signal_pending_state() is true. And this is wrong if "current" was
already woken up by unlock_page(): that exclusive wakeup is then lost.

That is why ___wait_event() checks the condition even if prepare_to_wait_event()
returns -EINTR. The comment in prepare_to_wait_event() tries to explain this.
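
To make the ordering concrete, here is a minimal user-space sketch. It is
a toy model, not kernel code: struct page_model, model_test_and_set(),
wait_signal_first() and wait_condition_first() are all hypothetical names
invented for illustration, and -EAGAIN merely stands for "would go back to
sleep". bit_locked == false models the state right after unlock_page() has
cleared the bit and delivered the exclusive wakeup to us.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state seen by one exclusive waiter. */
struct page_model {
	bool bit_locked;	/* stands in for the page bit (e.g. PG_locked) */
	bool signal_pending;	/* stands in for signal_pending_state() */
};

/* Models test_and_set_bit(): set the bit and return its old value. */
static bool model_test_and_set(struct page_model *m)
{
	bool old = m->bit_locked;

	m->bit_locked = true;
	return old;
}

/*
 * Ordering in the quoted hunk: bail out on a pending signal before
 * looking at the bit. If we were already woken, the wakeup is consumed
 * but the bit is never taken.
 */
static int wait_signal_first(struct page_model *m)
{
	if (m->signal_pending)
		return -EINTR;
	if (!model_test_and_set(m))
		return 0;
	return -EAGAIN;		/* model only: would sleep again */
}

/*
 * Ordering used by ___wait_event(): check the condition first, and
 * report the signal only if the condition is still false.
 */
static int wait_condition_first(struct page_model *m)
{
	if (!model_test_and_set(m))
		return 0;
	if (m->signal_pending)
		return -EINTR;
	return -EAGAIN;		/* model only: would sleep again */
}
```

With the bit clear and a signal pending, wait_signal_first() returns
-EINTR and leaves the bit clear, so nobody holds the lock and no other
waiter was woken. wait_condition_first() takes the bit and returns 0 in
the same state.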

Oleg.