Re: [PATCH -tip 2/3] sched/wake_q: Relax to acquire semantics

From: Peter Zijlstra
Date: Tue Sep 15 2015 - 05:55:25 EST


On Tue, Sep 15, 2015 at 11:49:49AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 14, 2015 at 02:08:06PM -0700, Davidlohr Bueso wrote:
> > On Mon, 14 Sep 2015, Peter Zijlstra wrote:
> >
> > >On Mon, Sep 14, 2015 at 12:37:23AM -0700, Davidlohr Bueso wrote:
> > >> /*
> > >>+ * Atomically grab the task. If ->wake_q is non-nil (failed cmpxchg)
> > >>+ * then the task is already queued (by us or someone else) and will
> > >>+ * get the wakeup due to that.
> > >> *
> > >>+ * Use acquire semantics to add the next pointer, which pairs with the
> > >>+ * write barrier implied by the wakeup in wake_up_list().
> > >> */
> > >>+ if (cmpxchg_acquire(&node->next, NULL, WAKE_Q_TAIL))
> > >> return;
> > >>
> > >> get_task_struct(task);
> > >
> > >I'm not seeing a _why_ on the acquire semantics. Not saying the patch is
> > >wrong, just saying I want words on why acquire is correct.
> >
> > Well, I was just taking advantage of removing the upper barrier. Considering
> > the formal semantics, you are right that we do not need an actual acquire per se
> > (i.e. for node->next) but instead merely need to ensure a barrier in
> > wake_q_add(). This is kind of why I had hinted at going full _relaxed(). We
> > could also rephrase the comment, something like:
> >
> > * Use ACQUIRE semantics to add the next pointer, such that
> > * wake_q_add() implies a full barrier. This pairs with the
> > * write barrier implied by the wakeup in wake_up_list().
> > */
> >
> > What do you think?
>
> Still befuddled. I'm thinking that if you want to remove a barrier,
> you'd remove that second and keep the first. That is RELEASE.
>
> That way, you know the stores prior to the wake queue are done by the
> time you observe the queued entry, and therefore (transitively) know
> those stores are done by the time you do the actual wakeup.
>
> Two issues with that though; firstly RELEASE is not actually guaranteed
> to be transitive -- now the only arch that does not implement it with a
> full barrier is ARGH64, so we could just ask Will, but I'm not sure it's
> 'good' to start relying on this.

Never mind, the PPC people will implement this with lwsync and that is
very much not transitive IIRC.

That said, you could do:

smp_mb__before_atomic();
cmpxchg_relaxed();

Which would still be a full barrier and therefore transitive. However
this point still stands:

> Secondly, the wake queues are not concurrent, they're in context, so I
> don't see ordering matter at all. The only reason it's a cmpxchg() is
> because there is the (small) possibility of two contexts wanting to wake
> the same task, and we use task_struct storage for the queue.

I don't think we need _any_ barriers here, unless we have concurrent
users of the wake queues (or want to allow any, do we?).
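
To make the two points above concrete, here is a minimal userspace sketch
using C11 atomics instead of the kernel primitives. All names in it
(wake_node, wake_head, WAKE_TAIL, wake_add, wake_drain, demo) are
hypothetical stand-ins, not the kernel's wake_q code. The seq_cst fence is
only an approximation of smp_mb__before_atomic(), and the relaxed
compare-exchange approximates cmpxchg_relaxed(); the C11 and kernel memory
models are not identical. The queue head lives in one context, so the only
contended operation is the compare-exchange on the node itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task queue node. */
struct wake_node {
	_Atomic(struct wake_node *) next;
};

/* Hypothetical stand-in for the queue head, owned by a single context. */
struct wake_head {
	_Atomic(struct wake_node *) first;
	_Atomic(struct wake_node *) *lastp;
};

/* Non-NULL end-of-list marker, so a queued node never has next == NULL. */
#define WAKE_TAIL ((struct wake_node *)0x01)

static void wake_head_init(struct wake_head *head)
{
	atomic_init(&head->first, WAKE_TAIL);
	head->lastp = &head->first;
}

/* Returns 1 if the node was queued, 0 if it was already on some queue. */
static int wake_add(struct wake_head *head, struct wake_node *node)
{
	struct wake_node *expected = NULL;

	/* Full fence, then a relaxed claim of the node (assumed analogue of
	 * smp_mb__before_atomic(); cmpxchg_relaxed();). */
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_compare_exchange_strong_explicit(&node->next, &expected,
						     WAKE_TAIL,
						     memory_order_relaxed,
						     memory_order_relaxed))
		return 0;

	/* The list itself is only touched by the owning context. */
	atomic_store_explicit(head->lastp, node, memory_order_relaxed);
	head->lastp = &node->next;
	return 1;
}

/* Walks the list in the owning context; returns how many nodes it saw. */
static int wake_drain(struct wake_head *head)
{
	struct wake_node *node =
		atomic_load_explicit(&head->first, memory_order_relaxed);
	int woken = 0;

	while (node != WAKE_TAIL) {
		struct wake_node *next =
			atomic_load_explicit(&node->next, memory_order_relaxed);

		/* Clearing next lets the node be queued again later. */
		atomic_store_explicit(&node->next, NULL, memory_order_relaxed);
		woken++;	/* a real implementation would wake the task here */
		node = next;
	}

	atomic_store_explicit(&head->first, WAKE_TAIL, memory_order_relaxed);
	head->lastp = &head->first;
	return woken;
}

/* Single-context usage: queue two nodes, try a duplicate, then drain. */
static int demo(void)
{
	struct wake_head head;
	struct wake_node a, b;

	atomic_init(&a.next, NULL);
	atomic_init(&b.next, NULL);
	wake_head_init(&head);

	assert(wake_add(&head, &a) == 1);
	assert(wake_add(&head, &b) == 1);
	assert(wake_add(&head, &a) == 0);	/* already queued */

	return wake_drain(&head);
}
```

In this sketch the fence matters only if another context could observe the
queued node concurrently; with a strictly in-context queue, the
compare-exchange is there just to reject a second attempt to queue the same
node.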