Re: [RFC PATCH] sched/wait: Make interruptible exclusive waitqueue wakeups reliable

From: Oleg Nesterov
Date: Tue Dec 10 2019 - 12:30:20 EST


On 12/10, Ingo Molnar wrote:
>
> --- a/kernel/sched/wait.c
> +++ b/kernel/sched/wait.c
> @@ -290,6 +290,11 @@ long prepare_to_wait_event(struct wait_queue_head *wq_head, struct wait_queue_en
> * But we need to ensure that set-condition + wakeup after that
> * can't see us, it should wake up another exclusive waiter if
> * we fail.
> + *
> + * In other words, if an exclusive waiter got here, then the
> + * waitqueue condition is and stays true and we are guaranteed
> + * to exit the waitqueue loop and will ignore the -ERESTARTSYS
> + * and return success.
> */
> list_del_init(&wq_entry->entry);
> ret = -ERESTARTSYS;

Agreed, this makes it more clear... but at the same time technically this is
not 100% correct, or perhaps I misread this comment.

We are not guaranteed to return success even if condition == T and we were
woken up as an exclusive waiter: another waiter can consume the condition
first. But this is fine. Say,

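	long LOCK;
	wait_queue_head_t WQ;

	int lock(void)
	{
		/* the condition consumes the lock: only one caller sees 0 */
		return wait_event_interruptible_exclusive(WQ, xchg(&LOCK, 1) == 0);
	}

	void unlock(void)
	{
		xchg(&LOCK, 0);
		/* wakes TASK_NORMAL sleepers, at most one exclusive waiter */
		wake_up(&WQ);
	}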

A woken exclusive waiter can still return -ERESTARTSYS if it races with
another lock() caller, or with another sleeping waiter woken up by a signal.
This is fine.
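
To spell out the interleaving, here is a rough single-threaded userspace
model of the lock()/unlock() example. It is only a sketch under stated
assumptions, not kernel code: the waitqueue and the scheduler are not
modeled, try_take(), woken_waiter(), quiet_case() and racing_case() are
made-up names, and ERESTARTSYS is defined by hand because userspace headers
do not provide it. It assumes the order used by the wait_event loop: the
condition is re-checked first, and the pending-signal result is returned
only if the condition is false.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define ERESTARTSYS 512	/* kernel-internal errno, defined here by hand */

static atomic_long LOCK;

/* The condition from lock(): true only for the caller that flips 0 -> 1. */
static bool try_take(void)
{
	return atomic_exchange(&LOCK, 1) == 0;
}

/* Model of unlock() without the wakeup itself. */
static void unlock_model(void)
{
	atomic_exchange(&LOCK, 0);
}

/*
 * One pass of a woken exclusive waiter: check the condition first and
 * report the pending signal only if the condition is false.
 * Returns 0 on success, -ERESTARTSYS on a lost race with a signal pending,
 * and 1 where the real waiter would simply go back to sleep.
 */
static int woken_waiter(bool signal_pending)
{
	if (try_take())
		return 0;
	return signal_pending ? -ERESTARTSYS : 1;
}

/* Nobody races: the condition stays true, so the signal is ignored. */
static int quiet_case(void)
{
	atomic_store(&LOCK, 1);		/* lock is held */
	unlock_model();			/* owner releases, waiter A is woken */
	return woken_waiter(true);	/* A has a signal pending */
}

/* Another lock() caller consumes the condition before A re-checks it. */
static int racing_case(void)
{
	atomic_store(&LOCK, 1);		/* lock is held */
	unlock_model();			/* owner releases, waiter A is woken */
	(void)try_take();		/* B calls lock() and wins the xchg */
	return woken_waiter(true);	/* A has a signal pending */
}
```

In this model quiet_case() yields 0 and racing_case() yields -ERESTARTSYS,
which is the distinction the comment needs to make: success is guaranteed
only while the condition stays true.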

So maybe

* In other words, if an exclusive waiter got here and the
* waitqueue condition is and stays true, then we are guaranteed
* to exit the waitqueue loop and will ignore the -ERESTARTSYS
* and return success.

is more accurate?

Oleg.