Re: [RFC PATCH] ipc/mqueue: add detector of wakeup race

From: Manfred Spraul
Date: Fri May 14 2021 - 11:38:37 EST


Hi Hillf,

On 5/11/21 9:36 AM, Hillf Danton wrote:
Apart from the pipeline wakeup, there are wakeups from timer timeout, signal,
and spurious sources. Because the race between pipeline and non pipeline
wakeups is not frequent, we handle it under lock on the waiter side.

On the waker side, wait condition is set true before adding waiter to the wake
queue to cut the chance for missing wakup, and it is updated to be READY after
wake_q_add to show that the race window is closed by the waker and it is safe
now for waiter to go home in bid to cut the risk that waiter exits while waker
is dereferencing waiter's resouces.

Based on the initial work from Varad Gautam.

Signed-off-by: Hillf Danton <hdanton@xxxxxxxx>
---

--- y/ipc/mqueue.c
+++ x/ipc/mqueue.c
@@ -55,7 +55,8 @@ struct mqueue_fs_context {
#define RECV 1
#define STATE_NONE 0
-#define STATE_READY 1
+#define STATE_TRUE 1
+#define STATE_READY 2 /* detector of wakeup race */
struct posix_msg_tree_node {
struct rb_node rb_node;
@@ -722,7 +723,7 @@ static int wq_sleep(struct mqueue_inode_
spin_lock(&info->lock);
/* we hold info->lock, so no memory barrier required */
- if (READ_ONCE(ewp->state) == STATE_READY) {
+ if (READ_ONCE(ewp->state) != STATE_NONE) {
retval = 0;
goto out_unlock;
}
@@ -1005,11 +1006,11 @@ static inline void __pipelined_op(struct
struct ext_wait_queue *this)
{
list_del(&this->list);
- get_task_struct(this->task);
+ this->state = STATE_TRUE;
+ wake_q_add(wake_q, this->task);
/* see MQ_BARRIER for purpose/pairing */
smp_store_release(&this->state, STATE_READY);
- wake_q_add_safe(wake_q, this->task);
}
/* pipelined_send() - send a message directly to the task waiting in

As I have written in the other mail:

Copying this->task into a local variable is far simpler.

(before we had wake_q, an approach with 3 states was used, see e.g.: https://elixir.bootlin.com/linux/v2.6.39.4/source/ipc/msg.c#L858 : -EAGAIN, NULL, <valid pointer>. But due to the wake_q framework, we do not need that anymore)

--

    Manfred