Re: [PATCH -v4 00/10] FUTEX_UNLOCK_PI wobbles

From: Peter Zijlstra
Date: Wed Feb 22 2017 - 10:36:31 EST


On Wed, Feb 22, 2017 at 12:02:44PM +0100, Peter Zijlstra wrote:
> OK, so after having not thought about this, and then spent the last two
> days trying to cram all this nonsense back into my head, I think I have
> a slightly simpler option.
>
> In any case, I'll go respin the patch-set and repost.

That is, what is wrong with the patch below, against mainline?

---

kernel/futex.c | 48 +++++++++++++-----------------------------------
1 file changed, 13 insertions(+), 35 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index c591a2a..fafa25a 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1318,12 +1318,18 @@ static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
 	new_owner = rt_mutex_next_owner(&pi_state->pi_mutex);

 	/*
-	 * It is possible that the next waiter (the one that brought
-	 * this owner to the kernel) timed out and is no longer
-	 * waiting on the lock.
+	 * When we interleave with futex_lock_pi() where it does
+	 * rt_mutex_timed_futex_lock(), we might observe @this futex_q waiter,
+	 * but the rt_mutex's wait_list can be empty (either still, or again,
+	 * depending on which side we land).
+	 *
+	 * When this happens, give up our locks and try again, giving the
+	 * futex_lock_pi() instance time to complete and unqueue_me().
 	 */
-	if (!new_owner)
-		new_owner = this->task;
+	if (!new_owner) {
+		raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
+		return -EAGAIN;
+	}

 	/*
 	 * We pass it to the next owner. The WAITERS bit is always
@@ -2245,43 +2251,15 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
 	}

 	/*
-	 * Catch the rare case, where the lock was released when we were on the
-	 * way back before we locked the hash bucket.
-	 */
-	if (q->pi_state->owner == current) {
-		/*
-		 * Try to get the rt_mutex now. This might fail as some other
-		 * task acquired the rt_mutex after we removed ourself from the
-		 * rt_mutex waiters list.
-		 */
-		if (rt_mutex_trylock(&q->pi_state->pi_mutex)) {
-			locked = 1;
-			goto out;
-		}
-
-		/*
-		 * pi_state is incorrect, some other task did a lock steal and
-		 * we returned due to timeout or signal without taking the
-		 * rt_mutex. Too late.
-		 */
-		raw_spin_lock_irq(&q->pi_state->pi_mutex.wait_lock);
-		owner = rt_mutex_owner(&q->pi_state->pi_mutex);
-		if (!owner)
-			owner = rt_mutex_next_owner(&q->pi_state->pi_mutex);
-		raw_spin_unlock_irq(&q->pi_state->pi_mutex.wait_lock);
-		ret = fixup_pi_state_owner(uaddr, q, owner);
-		goto out;
-	}
-
-	/*
 	 * Paranoia check. If we did not take the lock, then we should not be
 	 * the owner of the rt_mutex.
 	 */
-	if (rt_mutex_owner(&q->pi_state->pi_mutex) == current)
+	if (rt_mutex_owner(&q->pi_state->pi_mutex) == current) {
 		printk(KERN_ERR "fixup_owner: ret = %d pi-mutex: %p "
 				"pi-state %p\n", ret,
 				q->pi_state->pi_mutex.owner,
 				q->pi_state->owner);
+	}

 out:
 	return ret ? ret : locked;
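
For anyone trying to follow the "either still, or again" wording in the new
wake_futex_pi() comment, here is a rough userspace model of the back-off.
It is only a sketch, not kernel code: every model_* name is made up for
illustration, locking is not modelled at all, and it assumes the unlock
path simply retries whenever the wake path returns -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model only: none of these names exist in kernel/futex.c. */
enum model_locker {
	MODEL_LOCKER_SETTLED,		/* racing locker has nothing left to do */
	MODEL_LOCKER_WILL_ENQUEUE,	/* "still": not yet on the wait_list */
	MODEL_LOCKER_WILL_UNQUEUE,	/* "again": left the wait_list, futex_q remains */
};

struct model_state {
	int has_futex_q;		/* unlocker can see a futex_q waiter */
	int wait_list_len;		/* waiters on the modelled rt_mutex */
	enum model_locker locker;	/* what the racing locker does next */
	int attempts;			/* times the wake path ran */
	int handed_off;			/* lock was passed to a waiter */
};

/* Models the patched check: an empty wait_list means back off, not guess. */
static int model_wake_futex_pi(struct model_state *s)
{
	assert(s != NULL);
	s->attempts++;

	if (s->wait_list_len == 0)
		return -EAGAIN;

	s->handed_off = 1;
	return 0;
}

/* Models the racing locker finishing its step; returns 1 if state changed. */
static int model_locker_progress(struct model_state *s)
{
	switch (s->locker) {
	case MODEL_LOCKER_WILL_ENQUEUE:
		s->wait_list_len++;
		break;
	case MODEL_LOCKER_WILL_UNQUEUE:
		s->has_futex_q = 0;
		break;
	default:
		return 0;
	}
	s->locker = MODEL_LOCKER_SETTLED;
	return 1;
}

/* ASSUMPTION: the unlock path retries for as long as it sees -EAGAIN. */
static int model_futex_unlock_pi(struct model_state *s)
{
	int ret;

	for (;;) {
		if (!s->has_futex_q)
			return 0;	/* no waiter visible: nothing to hand off */

		ret = model_wake_futex_pi(s);
		if (ret != -EAGAIN)
			return ret;

		/* Model-only bail-out so an inconsistent state cannot spin. */
		if (!model_locker_progress(s))
			return ret;
	}
}
```

In the "still" case the second attempt finds the waiter and hands the lock
over; in the "again" case the futex_q is gone by the time of the retry and
nothing is handed off. Either way the unlocker never has to guess a new
owner from this->task.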