Re: Updated sys_membarrier() speedup patch, FYI
From: Paul E. McKenney
Date: Fri Jul 28 2017 - 14:14:58 EST

On Fri, Jul 28, 2017 at 10:37:25AM -0700, Andrew Hunter wrote:
> On Thu, Jul 27, 2017 at 12:06 PM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > IPIing only those CPUs running threads in the same process as the
> > thread invoking membarrier() would be very nice! There is some LKML
> > discussion on this topic, which is currently circling around making this
> > determination reliable on all CPU families. ARM and x86 are thought
> > to be OK, PowerPC is thought to require a smallish patch, MIPS is
> > a big question mark, and so on.
>
> I'm not sure what you mean by the determination or how this is arch specific?

It looks like Peter and Mathieu are well on the way to solving this,
see his latest patch.

> > But I am surprised when you say that the downgrade would not work, at
> > least if you are not running with nohz_full CPUs. The rcu_sched_qs()
> > function simply sets a per-CPU quiescent-state flag. The needed strong
> > ordering is instead supplied by the combination of the code starting
> > the grace period, reporting the setting of the quiescent-state flag
> > to core RCU, and the code completing the grace period. Each non-idle
> > CPU will execute full memory barriers either in RCU_SOFTIRQ context,
> > on entry to idle, on exit from idle, or within the grace-period kthread.
> > In particular, a CPU running the same usermode thread for the entire
> > grace period will execute the needed memory barriers in RCU_SOFTIRQ
> > context shortly after taking a scheduling-clock interrupt.
>
> Recall that I need more than just a memory barrier--also to interrupt
> RSEQ critical sections in progress on those CPUs. I know this isn't
> general purpose, I'm just saying a trivial downgrade wouldn't work for
> me. :) It would probably be sufficient to set NOTIFY_RESUME on all
> cpus running my code (which is what my IPI function does anyway...)

OK, yes, one major goal of the slowboat sys_membarrier is to -avoid-
IPIing other CPUs, and if you need the CPUs to be IPIed, then a
non-expedited grace period isn't going to do it for you.

And yes, once sys_membarrier() settles a bit, hopefully early next
week, it would be good to work out some way for RSEQ to share the
sys_membarrier() code. Maybe RSEQ adds a bit to the flags argument or
some such?
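
For reference, here is a rough userspace sketch of how the cmd and flags
arguments are passed today. It is an illustration only, not taken from any
of the patches under discussion. The helper names membarrier_call(),
membarrier_supported(), and slow_membarrier() are made up for this example,
and it assumes a libc that needs syscall(2) to reach membarrier. It passes
flags as 0 and does not attempt to show what an RSEQ bit might look like.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: raw system call, since we assume no libc wrapper. */
static int membarrier_call(int cmd, int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/* Hypothetical helper: 1 if the kernel advertises cmd, 0 otherwise. */
static int membarrier_supported(int cmd)
{
	/* MEMBARRIER_CMD_QUERY returns a bitmask of supported commands. */
	int mask = membarrier_call(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0)
		return 0;	/* e.g. ENOSYS: no sys_membarrier() at all */
	return (mask & cmd) != 0;
}

/*
 * Hypothetical helper: issue the non-expedited ("slowboat") barrier.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int slow_membarrier(void)
{
	if (!membarrier_supported(MEMBARRIER_CMD_SHARED)) {
		errno = ENOSYS;
		return -1;
	}
	/* flags is 0 here; this sketch does not define any flag bits. */
	return membarrier_call(MEMBARRIER_CMD_SHARED, 0);
}
```

A caller would check slow_membarrier() for -1 and fall back to some other
ordering mechanism if the command is not available.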

Thanx, Paul