Re: [RFC][PATCH 1/4] locking/mutex: Use try_cmpxchg()

From: Xu, Yanfei
Date: Mon Jul 05 2021 - 07:59:36 EST




On 6/30/21 11:35 PM, Peter Zijlstra wrote:
> For simpler and better code.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
>  kernel/locking/mutex.c | 27 ++++++---------------------
>  1 file changed, 6 insertions(+), 21 deletions(-)

Hi Peter,

I read the mutex code today, and it looks like there is a problem with this patch. Should we consider the race condition below?

From 4035f50c96e17cbe3febab768b64da5c000e5b76 Mon Sep 17 00:00:00 2001
From: Yanfei Xu <yanfei.xu@xxxxxxxxxxxxx>
Date: Mon, 5 Jul 2021 17:56:58 +0800
Subject: [PATCH] locking/mutex: fix the endless loop when racing against
mutex.owner

If mutex.owner changes after we have fetched its value,
atomic_long_try_cmpxchg_acquire/release() on &mutex.owner returns
false. When that happens inside a loop, the temporary variable that
holds the mutex.owner value must be reloaded, or the loop never ends.

Fixes: 9265e48a579d ("locking/mutex: Use try_cmpxchg()")

Signed-off-by: Yanfei Xu <yanfei.xu@xxxxxxxxxxxxx>
---
kernel/locking/mutex.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 5e6a811ac733..ec6b6724c118 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -95,12 +95,12 @@ static inline unsigned long __owner_flags(unsigned long owner)

 static inline struct task_struct *__mutex_trylock_common(struct mutex *lock, bool handoff)
 {
-	unsigned long owner, curr = (unsigned long)current;
+	unsigned long flags, owner, task, curr = (unsigned long)current;

-	owner = atomic_long_read(&lock->owner);
 	for (;;) { /* must loop, can race against a flag */
-		unsigned long flags = __owner_flags(owner);
-		unsigned long task = owner & ~MUTEX_FLAGS;
+		owner = atomic_long_read(&lock->owner);
+		flags = __owner_flags(owner);
+		task = owner & ~MUTEX_FLAGS;

 		if (task) {
 			if (flags & MUTEX_FLAG_PICKUP) {
@@ -231,10 +231,10 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
  */
 static void __mutex_handoff(struct mutex *lock, struct task_struct *task)
 {
-	unsigned long owner = atomic_long_read(&lock->owner);
+	unsigned long owner, new;

 	for (;;) {
-		unsigned long new;
+		owner = atomic_long_read(&lock->owner);

 		MUTEX_WARN_ON(__owner_task(owner) != current);
 		MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP);
@@ -1227,8 +1227,9 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	 * Except when HANDOFF, in that case we must not clear the owner field,
 	 * but instead set it to the top waiter.
 	 */
-	owner = atomic_long_read(&lock->owner);
 	for (;;) {
+		owner = atomic_long_read(&lock->owner);
+
 		MUTEX_WARN_ON(__owner_task(owner) != current);
 		MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP);

--
2.29.2
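
For reference, the retry pattern discussed in the patch above can be
sketched in user space with C11 atomics. This is only an analogue under
stated assumptions, not the kernel code: FLAG_MASK, set_flag_reload(),
set_flag_writeback() and observed_after_failed_cas() are hypothetical
names, and atomic_compare_exchange_strong() stands in for the kernel's
atomic_long_try_cmpxchg_*() helpers. The first variant re-reads the
word on every iteration, as the patch does. The second reads it once
and relies on the C11 rule that a failed compare-exchange writes the
observed value back into the "expected" argument. Whether the re-read
is required therefore depends on whether the primitive performs that
write-back; Documentation/atomic_t.txt describes the kernel's
try_cmpxchg() form as updating the old value on failure, which is worth
checking against the premise of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: low bits of the word carry flags, similar to MUTEX_FLAGS. */
#define FLAG_MASK 0x07UL

/* Variant A: reload the word at the top of every iteration. */
static unsigned long set_flag_reload(atomic_ulong *owner_word, unsigned long flag)
{
	assert((flag & ~FLAG_MASK) == 0);

	for (;;) {
		unsigned long owner = atomic_load_explicit(owner_word,
							   memory_order_relaxed);
		unsigned long expected = owner;

		if (atomic_compare_exchange_strong_explicit(owner_word, &expected,
							    owner | flag,
							    memory_order_release,
							    memory_order_relaxed))
			return owner | flag;
	}
}

/*
 * Variant B: read once before the loop. On failure the compare-exchange
 * stores the value it observed into 'owner', so the next iteration
 * already works with fresh data.
 */
static unsigned long set_flag_writeback(atomic_ulong *owner_word, unsigned long flag)
{
	unsigned long owner = atomic_load_explicit(owner_word, memory_order_relaxed);

	assert((flag & ~FLAG_MASK) == 0);

	for (;;) {
		if (atomic_compare_exchange_strong_explicit(owner_word, &owner,
							    owner | flag,
							    memory_order_release,
							    memory_order_relaxed))
			return owner | flag;
	}
}

/*
 * Run one compare-exchange with a possibly stale expected value and
 * return what is left in 'stale' afterwards. If the values differ, the
 * exchange fails and 'stale' is overwritten with the current value.
 */
static unsigned long observed_after_failed_cas(unsigned long current_val,
					       unsigned long stale)
{
	atomic_ulong word;

	atomic_init(&word, current_val);
	(void)atomic_compare_exchange_strong(&word, &stale, stale | 1UL);
	return stale;
}
```

Both variants end up with the same value in the word. Variant B saves
one load per retry, at the cost of depending on the write-back
behaviour of the primitive.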