[PATCH] osq_lock: avoid live-lock issue for RT task

From: Prateek Sood
Date: Fri Jul 14 2017 - 09:49:43 EST


Live Lock due to task spinning while unqueue of CPU osq_node
from optimistic_spin_queue. Task T1 had decremented mutex count to
acquire the lock on CPU0. Before setting owner it got preempted. On
CPU1 task T2 acquired osq_lock and started spinning on owner of mutex
with preemption disabled. CPU1 runq has one task, so need_resched will
not be set. On CPU0 task T3 tried to acquire osq_lock to spin on the
same mutex. At this time following scenario causes soft lockup:

After preemption of task T1, RT task T3 tried to acquire the same
mutex. It will start spinning on the osq_lock until the lock is available
or need_resched is set. For RT task, need_resched will not be set. Task T3
will not be able to bail out of the infinite loop.

Signed-off-by: Prateek Sood <prsood@xxxxxxxxxxxxxx>
---
kernel/locking/osq_lock.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..99b8d99 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -1,6 +1,7 @@
#include <linux/percpu.h>
#include <linux/sched.h>
#include <linux/osq_lock.h>
+#include <linux/sched/rt.h>

/*
* An MCS like lock especially tailored for optimistic spinning for sleeping
@@ -85,6 +86,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
{
struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
struct optimistic_spin_node *prev, *next;
+ struct task_struct *task = current;
int curr = encode_cpu(smp_processor_id());
int old;

@@ -118,8 +120,13 @@ bool osq_lock(struct optimistic_spin_queue *lock)
while (!READ_ONCE(node->locked)) {
/*
* If we need to reschedule bail... so we can block.
+ * If a task spins on owner on a CPU after acquiring
+ * osq_lock while a RT task spins on another CPU to
+ * acquire osq_lock, it will starve the owner from
+ * completing if owner is to be scheduled on the same CPU.
+ * It will be a live lock.
*/
- if (need_resched())
+ if (need_resched() || rt_task(task))
goto unqueue;

cpu_relax_lowlatency();
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.,
is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.