Re: [PATCH -v6 07/13] futex: Rework inconsistent rt_mutex/futex_q state

From: Darren Hart
Date: Wed Apr 05 2017 - 17:59:16 EST


On Wed, Mar 22, 2017 at 11:35:54AM +0100, Peter Zijlstra wrote:
> There is a weird state in the futex_unlock_pi() path when it
> interleaves with a concurrent futex_lock_pi() at the point where it
> drops hb->lock.
>
> In this case, it can happen that the rt_mutex wait_list and the
> futex_q disagree on pending waiters, in particular rt_mutex will find
> no pending waiters where futex_q thinks there are.
>
> In this case the rt_mutex unlock code cannot assign an owner.
>
> What the current code does in this case is use the futex_q waiter that
> got us here; however when the rt_mutex_timed_futex_lock() has already
> failed; this leaves things in a weird state, resulting in much
> head-aches in fixup_owner().
>
> Simplify all this by changing wake_futex_pi() to return -EAGAIN when
> this situation occurs. This then gives the futex_lock_pi() code the
> opportunity to continue and the retried futex_unlock_pi() will now
> observe a coherent state.
>
> The only problem is that this breaks RT timeliness guarantees. That
> is, consider the following scenario:
>
> T1 and T2 are both pinned to CPU0. prio(T2) > prio(T1)
>
> CPU0
>
> T1
> lock_pi()
> queue_me() <- Waiter is visible
>
> preemption
>
> T2
> unlock_pi()
> loops with -EAGAIN forever
>
> Which is undesirable for PI primitives. Future patches will rectify
> this. For now we want to get rid of the fixup magic.

Errrrm... OK... I don't like the idea of having this broken after this commit,
but until I internalize the remaining 5 (that number has never seemed quite so
dauntingly large before... 5...) I can't comment on the alternative. I suppose
having it documented in the commit log means anyone backporting only up to this
point gets what they deserve.

A good patch *removing* code from futex.c is always nice though !

--
Darren Hart
VMware Open Source Technology Center