Re: Rough notes from sys_membarrier() lightning BoF

From: Alan Stern
Date: Mon Sep 18 2017 - 15:04:27 EST

On Sun, 17 Sep 2017, Paul E. McKenney wrote:

> Hello!
> Rough notes from our discussion last Thursday. Please reply to the
> group with any needed elaborations or corrections.
> Adding Andy and Michael on CC since this most closely affects their
> architectures. Also adding Dave Watson and Maged Michael because
> the preferred approach requires that processes wanting to use the
> lightweight sys_membarrier() do a registration step.
> Thanx, Paul
> ------------------------------------------------------------------------
> Problem:
> 1. The current sys_membarrier() introduces an smp_mb() that
> is not otherwise required on powerpc.
> 2. The envisioned JIT variant of sys_membarrier() assumes that
> the return-to-user instruction sequence handles any change
> to the usermode instruction stream, and Andy Lutomirski's
> upcoming changes invalidate this assumption. It is believed
> that powerpc has a similar issue.

> E. Require that threads register before using sys_membarrier() for
> private or JIT usage. (The historical implementation using
> synchronize_sched() would continue to -not- require registration,
> both for compatibility and because there is no need to do so.)
> For x86 and powerpc, this registration would set a TIF flag
> on all of the current process's threads. This flag would be
> inherited by any later thread creation within that process, and
> would be cleared by fork() and exec(). When this TIF flag is set,

Why a TIF flag, and why clear it during fork()? If a process registers
to use private expedited sys_membarrier, shouldn't that apply to
threads it will create in the future just as much as to threads it has
already created?
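
To make the question concrete, here is a minimal userspace sketch in C of
the alternative: registration recorded once per process rather than per
thread. It is not kernel code. All names (model_process, model_thread,
model_register, and so on) are hypothetical and chosen only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not kernel code: state shared by every thread
 * of one process. */
struct model_process {
	bool membarrier_registered;
};

/* Hypothetical model of a thread: it only points at its process. */
struct model_thread {
	struct model_process *proc;
};

/* Registration sets one flag on the shared per-process state. */
static void model_register(struct model_process *p)
{
	assert(p != NULL);
	p->membarrier_registered = true;
}

/* Creating a thread copies no flag; the new thread just refers to
 * the same process state. */
static struct model_thread model_thread_create(struct model_process *p)
{
	struct model_thread t = { .proc = p };

	assert(p != NULL);
	return t;
}

/* A thread counts as registered exactly when its process is. */
static bool model_thread_is_registered(const struct model_thread *t)
{
	assert(t != NULL && t->proc != NULL);
	return t->proc->membarrier_registered;
}
```

In this model a thread created before the registration call and one
created after it both see the flag, with no per-thread inheritance step
to get wrong.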

> the return-to-user path would execute additional code that would
> ensure that ordering and newly JITed code were handled correctly.
> We believe that checks for these TIF flags could be combined with
> existing checks to avoid adding any overhead in the common case
> where the process was not using these sys_membarrier() features.
> For all other architectures, the registration step would be
> a no-op.

Don't we want to fail private expedited sys_membarrier calls if the
process hasn't registered for them? This requires the registration
call to set a flag for the process, even on architectures where no
additional memory barriers are actually needed. It can't be a no-op.
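
A rough sketch in C of the behavior I have in mind, as a userspace model
rather than kernel code. The names (model_cmd, model_proc_state,
model_membarrier) are hypothetical, and the choice of -EPERM as the
error for an unregistered caller is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command numbers for this model only. */
enum model_cmd {
	MODEL_CMD_REGISTER_PRIVATE_EXPEDITED,
	MODEL_CMD_PRIVATE_EXPEDITED,
};

/* Hypothetical per-process state: one flag, kept on every
 * architecture, including those needing no extra barriers. */
struct model_proc_state {
	bool registered;
};

static int model_membarrier(struct model_proc_state *p, enum model_cmd cmd)
{
	assert(p != NULL);

	switch (cmd) {
	case MODEL_CMD_REGISTER_PRIVATE_EXPEDITED:
		/* Never a no-op: the flag must be recorded everywhere. */
		p->registered = true;
		return 0;
	case MODEL_CMD_PRIVATE_EXPEDITED:
		/* Assumed error code: refuse callers that never registered. */
		if (!p->registered)
			return -EPERM;
		/* A real implementation would issue the barriers here. */
		return 0;
	}
	return -EINVAL;
}
```

With this, a private expedited call made before registration fails, and
the same call succeeds once the process has registered, regardless of
architecture.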

Alan Stern