Re: [PATCH] doc: Update wake_up() & co. memory-barrier guarantees
From: Peter Zijlstra
Date: Mon Jun 25 2018 - 10:18:55 EST
On Mon, Jun 25, 2018 at 03:16:43PM +0200, Andrea Parri wrote:
> > > A concrete example being the store-buffering pattern reported in [1].
> >
> > Well, that example only needs a store->load barrier. It so happens
> > smp_mb() is the only one actually doing that, but imagine we had a
> > weaker barrier that did just that, one that did not imply the full
> > transitivity smp_mb() does.
> >
> > Then the example from [1] could use that weaker thing.
>
> Absolutely (and that would be "fence w,r" on RISC-V, IIUC).

Ah cute. What is the transitivity model of those "fence" instructions? I
see their smp_mb() is "fence rw,rw" and smp_mb() must be RCsc. Otoh
their smp_wmb() is "fence w,w" which is only required to be RCpc.
So what does RISC-V do for "w,w" and "w,r" like things?

> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index a98d54cd5535..8374d01b2820 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1879,7 +1879,9 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
> > * C) LOCK of the rq(c1)->lock scheduling in task
> > *
> > * Transitivity guarantees that B happens after A and C after B.
> > - * Note: we only require RCpc transitivity.
> > + * Note: we only require RCpc transitivity for these cases,
> > + * but see smp_mb__after_spinlock() for why rq->lock is required
> > + * to be RCsc.
> > * Note: the CPU doing B need not be c0 or c1
>
> FWIW, we discussed this pattern here:
>
> http://lkml.kernel.org/r/20171018010748.GA4017@andrea

That's not the pattern from smp_mb__after_spinlock(), right? But the
other two from this comment.

> > @@ -1966,6 +1969,10 @@ static void ttwu_queue(struct task_struct *p, int cpu, int wake_flags)
> > * Atomic against schedule() which would dequeue a task, also see
> > * set_current_state().
> > *
> > + * Implies at least a RELEASE such that the waking task is guaranteed to
> > + * observe the stores to the wait-condition; see set_task_state() and the
> > + * Program-Order constraints.
>
> [s/set_task_state/set_current_state ?]

Yes, we got rid of set_task_state(), someone forgot to tell my fingers
:-)

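For illustration, the "at least a RELEASE" wording in the hunk above can be sketched in userspace as a message-passing test. This is only a rough model using C11 atomics and pthreads, not kernel code: wait_condition, woken, mp_waker(), mp_wakee() and mp_failures() are made-up names, and the release store merely stands in for the wakeup.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-ins, not kernel symbols. */
static int wait_condition;	/* plain data written before the "wakeup" */
static atomic_int woken;	/* stands in for the wakeup itself */

static void *mp_waker(void *arg)
{
	(void)arg;
	wait_condition = 1;	/* store to the wait-condition */
	/* The "wakeup": a RELEASE orders the store above before it. */
	atomic_store_explicit(&woken, 1, memory_order_release);
	return NULL;
}

static void *mp_wakee(void *arg)
{
	int *seen = arg;

	/* ACQUIRE pairs with the RELEASE in mp_waker(). */
	while (!atomic_load_explicit(&woken, memory_order_acquire))
		sched_yield();
	*seen = wait_condition;	/* must observe 1 */
	return NULL;
}

/* Counts runs in which the woken thread missed the condition store. */
static int mp_failures(int iterations)
{
	int failures = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t a, b;
		int seen = -1;
		int rc;

		wait_condition = 0;
		atomic_store(&woken, 0);
		rc = pthread_create(&b, NULL, mp_wakee, &seen);
		assert(rc == 0);
		rc = pthread_create(&a, NULL, mp_waker, NULL);
		assert(rc == 0);
		(void)rc;
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		if (seen != 1)
			failures++;
	}
	return failures;
}
```

With the release/acquire pair in place, mp_failures() should always return 0. Weakening both accesses to memory_order_relaxed (and making wait_condition atomic to keep the program race-free) would remove that guarantee.
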
> I'd stick to "Implies/Executes at least a full barrier"; this is in fact
> already documented in the function body:
>
> /*
> * If we are going to wake up a thread waiting for CONDITION we
> * need to ensure that CONDITION=1 done by the caller can not be
> * reordered with p->state check below. This pairs with mb() in
> * set_current_state() the waiting thread does.
> */
>
> (this is, again, that "store->load barrier"/SB).
>
> I'll try to integrate these changes in v2, if there is no objection.

Thanks!
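
For reference, the "store->load barrier"/SB shape quoted above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: it uses C11 atomics and pthreads, the seq_cst fences stand in for the mb() in set_current_state() and for the full barrier on the wake-up side, and all names (condition, state, sb_waiter(), sb_waker(), sb_lost_wakeups(), SKETCH_*) are made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the task states; not kernel values. */
enum { SKETCH_RUNNING = 0, SKETCH_SLEEPING = 1 };

static atomic_int condition;	/* models CONDITION */
static atomic_int state;	/* models p->state */
static int waiter_saw_condition;
static int waker_saw_state;

static void *sb_waiter(void *arg)
{
	(void)arg;
	/* Announce the intent to sleep, then check the condition. */
	atomic_store_explicit(&state, SKETCH_SLEEPING, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* store->load */
	waiter_saw_condition =
		atomic_load_explicit(&condition, memory_order_relaxed);
	return NULL;
}

static void *sb_waker(void *arg)
{
	(void)arg;
	/* Set the condition, then check whether anyone is sleeping. */
	atomic_store_explicit(&condition, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* store->load */
	waker_saw_state =
		atomic_load_explicit(&state, memory_order_relaxed);
	return NULL;
}

/*
 * A lost wakeup is the SB outcome: the waiter misses CONDITION=1 and
 * the waker still sees the task as running, so nobody wakes it.
 */
static int sb_lost_wakeups(int iterations)
{
	int lost = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t w, k;
		int rc;

		atomic_store(&condition, 0);
		atomic_store(&state, SKETCH_RUNNING);
		rc = pthread_create(&w, NULL, sb_waiter, NULL);
		assert(rc == 0);
		rc = pthread_create(&k, NULL, sb_waker, NULL);
		assert(rc == 0);
		(void)rc;
		pthread_join(w, NULL);
		pthread_join(k, NULL);
		if (waiter_saw_condition == 0 &&
		    waker_saw_state == SKETCH_RUNNING)
			lost++;
	}
	return lost;
}
```

With both fences present the C11 model forbids the lost-wakeup outcome, so sb_lost_wakeups() should return 0. Dropping either fence makes that outcome permitted, which is why a plain RELEASE is not enough for this pattern.
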