On Thu, 29 Nov 2018, Waiman Long wrote:
> That can be costly for x86, which will now have 2 locked instructions.
Yeah, and when wake_q is used as an actual queue we should really start
to notice. Some users only ever have a single task in the wake_q,
because the real win is avoiding the cost of calling wake_up_process()
while holding locks.
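To illustrate that single-waker pattern, here is a minimal sketch of a
hypothetical caller ('foo', its lock and members are made-up names for
illustration, not anything in tree):

#include <linux/sched/wake_q.h>
#include <linux/spinlock.h>

struct foo {
	spinlock_t lock;
	bool slot_free;
	struct task_struct *waiter;	/* single blocked task, if any */
};

static void foo_release_slot(struct foo *f)
{
	DEFINE_WAKE_Q(wake_q);

	spin_lock(&f->lock);
	f->slot_free = true;
	if (f->waiter)
		/* Only queue the wakeup; nothing heavy under the lock. */
		wake_q_add(&wake_q, f->waiter);
	spin_unlock(&f->lock);

	/* The actual wake_up_process() happens here, lock-free. */
	wake_up_q(&wake_q);
}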
How about, instead of adding the barrier before the cmpxchg(), we do it
in the failed branch, right before we return? That is the uncommon path.
Thanks,
Davidlohr
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 091e089063be..0d844a18a9dc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -408,8 +408,14 @@ void wake_q_add(struct wake_q_head *head, struct task_struct *task)
* This cmpxchg() executes a full barrier, which pairs with the full
* barrier executed by the wakeup in wake_up_q().
*/
- if (cmpxchg(&node->next, NULL, WAKE_Q_TAIL))
+ if (cmpxchg(&node->next, NULL, WAKE_Q_TAIL)) {
+ /*
+ * Ensure that, when the cmpxchg() fails, the corresponding
+ * wake_up_q() will observe our prior state.
+ */
+ smp_mb__after_atomic();
return;
+ }
get_task_struct(task);
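For reference, the wake_up_q() side that this pairs with looks roughly
like the below (paraphrased from kernel/sched/core.c of this era,
trimmed for brevity):

void wake_up_q(struct wake_q_head *head)
{
	struct wake_q_node *node = head->first;

	while (node != WAKE_Q_TAIL) {
		struct task_struct *task;

		task = container_of(node, struct task_struct, wake_q);

		/* The task can safely be re-queued once we move past it. */
		node = node->next;
		task->wake_q.next = NULL;

		/*
		 * wake_up_process() executes a full barrier, which pairs
		 * with the cmpxchg() in wake_q_add().
		 */
		wake_up_process(task);
		put_task_struct(task);
	}
}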