Re: Udpated sys_membarrier() speedup patch, FYI

From: Paul E. McKenney
Date: Mon Jul 31 2017 - 14:27:46 EST


On Mon, Jul 31, 2017 at 11:00:19AM -0700, Dave Watson wrote:
> Hi Paul,
>
> Thanks for looking at this again!
>
> On 07/27/17 11:12 AM, Paul E. McKenney wrote:
> > Hello!
> >
> > But my main question is whether the throttling shown below is acceptable
> > for your use cases, namely only one expedited sys_membarrier() permitted
> > per scheduling-clock period (1 millisecond on many platforms), with any
> > excess being silently converted to non-expedited form. The reason for
> > the throttling is concerns about DoS attacks based on user code with a
> > tight loop invoking this system call.
>
> We've been using sys_membarrier for the last year or so in a handful
> of places with no issues. Recently we made it an option in our hazard
> pointers implementation, giving us something with performance between
> hazard pointers and RCU:
>
> https://github.com/facebook/folly/blob/master/folly/experimental/hazptr/hazptr-impl.h#L555
>
> Currently hazard pointers tries to free retired memory the same thread
> that did the retire(), so the latency spiked for workloads that were
> retire() heavy. For the moment we dropped back to using mprotect
> hacks.
>
> I've tested Mathieu's v4 patch, it works great. We currently don't
> have any cases where we need SHARED.

Very good!!! May I have your Tested-by? (Or the Tested-by of whoever
did the testing, as the case may be?)

> I also tested the rate-limited version, while better than the current
> non-EXPEDITED SHARED version, we still hit the slow path pretty often.
> I agree with other commenters that returning an error code instead of
> silently slowing down might be better.

If I need to fall back to the rate-limited version, I will add some sort
of error code capability. For the moment, I am hoping that Mathieu's
patch proves acceptable, but will fall back to the rate-limited version
if some fatal problem arises.

> > + case MEMBARRIER_CMD_SHARED_EXPEDITED:
> > + if (num_online_cpus() > 1) {
> > + static unsigned long lastexp;
> > + unsigned long j;
> > +
> > + j = jiffies;
> > + if (READ_ONCE(lastexp) == j) {
> > + synchronize_sched();
> > + WRITE_ONCE(lastexp, j);
>
> It looks like this update of lastexp should be in the other branch?

Good catch, fixed. It is on branch paulmck.2017.08.01a, and will
hopefully not be needed.

Thanx, Paul

> > + } else {
> > + synchronize_sched_expedited();
> > + }
> > + }
> > + return 0;
>