Re: [PATCH -v6 05/13] futex: Change locking rules
From: Peter Zijlstra
Date: Thu Apr 06 2017 - 08:28:59 EST
On Wed, Apr 05, 2017 at 02:18:43PM -0700, Darren Hart wrote:
> On Wed, Mar 22, 2017 at 11:35:52AM +0100, Peter Zijlstra wrote:
> > + *
> > + * Serialization and lifetime rules:
> > + *
> > + * hb->lock:
> > + *
> > + * hb -> futex_q, relation
> > + * futex_q -> pi_state, relation
> > + *
> > + * (cannot be raw because hb can contain arbitrary amount
> > + * of futex_q's)
> > + *
> > + * pi_mutex->wait_lock:
> > + *
> > + * {uval, pi_state}
> > + *
> > + * (and pi_mutex 'obviously')
> > + *
> > + * p->pi_lock:
>
> This documentation uses a mix of types and common variable names. I'd recommend
> some declarations just below "Serialization and lifetime rules:" to help make
> this explicit, e.g.:
>
> struct futex_pi_state *pi_state;
> struct futex_hash_bucket *hb;
> struct rt_mutex *pi_mutex;
> struct futex_q *q;
> task_struct *p;
Yeah, not convinced it helps much. If you're stuck at that level, the
rest of futex is going to make your head explode.
> > @@ -980,10 +1013,12 @@ void exit_pi_state_list(struct task_stru
> > * the pi_state against the user space value. If correct, attach to
> > * it.
> > */
> > +static int attach_to_pi_state(u32 __user *uaddr, u32 uval,
> > + struct futex_pi_state *pi_state,
> > struct futex_pi_state **ps)
> > {
> > pid_t pid = uval & FUTEX_TID_MASK;
> > + int ret, uval2;
>
> The uval should be an unsigned type:
>
> u32 uval2;
Right you are.
> >
> > /*
> > * Userspace might have messed up non-PI and PI futexes [3]
> > @@ -991,9 +1026,34 @@ static int attach_to_pi_state(u32 uval,
> > if (unlikely(!pi_state))
> > return -EINVAL;
> >
> > + /*
> > + * We get here with hb->lock held, and having found a
> > + * futex_top_waiter(). This means that futex_lock_pi() of said futex_q
> > + * has dropped the hb->lock in between queue_me() and unqueue_me_pi(),
>
> This context got here like this:
>
> futex_lock_pi
> hb lock
> futex_lock_pi_atomic
> top waiter
> attach_to_pi_state()
>
> The queue_me and unqueue_me_pi both come after this in futex_lock_pi.
> Also, the hb lock is dropped in queue_me, not between queue_me and
> unqueue_me_pi.
>
> Are you saying that in order to be here, there are at least two tasks contending
> for the lock, and one that has come before us has proceeded as far as queue_me()
> but has not yet entered unqueue_me_pi(), therefor we know there is a waiter and
> it has a pi_state? If so, I think we can make this much clearer by at least
> noting the two tasks in play.
The point is that this other task must have a reference, and since we
now hold hb->lock, it cannot go away.
>
> ...
>
> > @@ -1336,6 +1418,7 @@ static int wake_futex_pi(u32 __user *uad
> >
> > if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)) {
> > ret = -EFAULT;
> > +
>
> Stray whitespace addition? Not explicitly against coding-style, but I don't
> normally see a new line before the closing brace leading to an else...
I found it more readable that way. Sod checkpatch and co ;-)
> > } else if (curval != uval) {
> > /*
> > * If a unconditional UNLOCK_PI operation (user space did not