Re: [RFC PATCH v3] membarrier: provide core serialization

From: Peter Zijlstra
Date: Fri Oct 06 2017 - 17:09:18 EST

On Fri, Oct 06, 2017 at 08:57:56PM +0000, Mathieu Desnoyers wrote:
> Hi Hans,
> I'm currently making sure the
> the 4.14 kernel before the end of the release candidates. Once that
> is done, I plan to post a patch adding a new MEMBARRIER_FLAG_SYNC_CORE
> flag for the 4.15 merge window.
> I have done a bit of research on the various architecture requirements
> for core serialization. Here are my findings so far about
> instructions providing core serialization on the main architectures
> supported by Linux.
> There are two places where we need it: in the interrupt handler for
> the membarrier IPI, and between scheduler execution (which can change
> the current "mm") and return to user-space.
> Please let me know if I missed anything.
> x86: iret, cpuid, wbinvd -> iret currently provides core serialization
> when going back to userspace and at the end of the IPI. There are
> plans to implement a return path without iret in the future, in which
> case I would need to issue an explicit "cpuid" instruction
> (sync_core()) in switch_mm() if the process is registered with

I would much prefer setting a TIF flag that forces the IRET path instead
of doing additional work in switch_mm().
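
A rough way to picture the TIF-flag idea is the toy user-space model below. It is a sketch only, not kernel code: TIF_FORCE_IRET, struct task, and both function names are made up for illustration and do not correspond to existing kernel symbols. The point it tries to show is that the context-switch side only sets a bit, while the (already serializing) IRET exit is chosen later on the return-to-user path.

```c
/*
 * Toy user-space model of "set a TIF flag, take the IRET path".
 * Not kernel code; every name here is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>

#define TIF_FORCE_IRET (1u << 0)	/* hypothetical flag bit */

enum exit_path { EXIT_FAST, EXIT_IRET };

struct task {
	unsigned int flags;		/* stand-in for thread_info flags */
	bool mm_wants_sync_core;	/* process registered for core sync */
};

/* Context-switch side: no serializing instruction, just mark the task. */
void model_switch_mm(struct task *next)
{
	assert(next);
	if (next->mm_wants_sync_core)
		next->flags |= TIF_FORCE_IRET;
}

/* Return-to-user side: pick the serializing path once, then clear the bit. */
enum exit_path model_exit_to_user(struct task *t)
{
	assert(t);
	if (t->flags & TIF_FORCE_IRET) {
		t->flags &= ~TIF_FORCE_IRET;
		return EXIT_IRET;
	}
	return EXIT_FAST;
}
```

In this model an unregistered task never pays for the slow path, and a registered one pays for it once per switch rather than on every return to user-space.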

> arm32: returning to user-space provides core serialization. Same at
> the end of membarrier IPI (to be confirmed). aarch64: ERET
> instruction used when returning to user-space provides core sync. Same
> at the end of membarrier IPI (to be confirmed).

I thought Will already confirmed ERET did what we need, no?

> parisc: core serialization is ensured by issuing at least 7
> instructions. We should have at least that when going back to
> user-space (to be confirmed). Similar for IPI.
> [...] p. 5-152
> mips: eret instruction used when going back to user-space provides
> core sync on all SMP architectures. Probably same for IPI (to be
> confirmed).
> [...] p. 121
> on R3k and TX39XX, rfe is used instead, but those are uniprocessor, so
> they do not matter.

> sparc: seems to require an explicit "flush" instruction followed by at
> most 5 instructions to perform core serialization. Not sure if implied
> by return to user-space in any way.

We still have the problem with the virtually indexed archs that we need
to flush I$ on all CPUs.

Some archs have an instruction for this, others do not (or botched it).
So while some archs have a syscall to affect this, it is an integral
part of the use-case for MEMBAR_SYNC_CORE and I feel we must not gloss
over it.
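
For reference, the user-space side of that use-case looks roughly like the sketch below, assuming a JIT-style caller on Linux built with GCC or Clang. publish_code() and sys_membarrier() are illustrative names, not existing APIs. __builtin___clear_cache() is the compiler builtin for flushing the instruction cache over a range; whether its effect reaches other CPUs is architecture-dependent, which is exactly the gap described above. The sync-core request itself is left as a comment because the flag is only proposed at this point.

```c
#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/membarrier.h>

/* glibc provides no membarrier() wrapper, so go through syscall(2). */
static int sys_membarrier(int cmd, int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags);
}

/*
 * Copy new code into place and make it visible to instruction fetch.
 * Returns the MEMBARRIER_CMD_QUERY bitmask of supported commands,
 * or -1 if the membarrier syscall is unavailable.
 */
int publish_code(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);

	/* Step 1: I$ maintenance for the modified range. */
	__builtin___clear_cache((char *)dst, (char *)dst + len);

	/*
	 * Step 2 (not expressible yet): ask the kernel to core-serialize
	 * every CPU running this process. That is what the proposed
	 * sync-core flag would be for; here we only query what exists.
	 */
	return sys_membarrier(MEMBARRIER_CMD_QUERY, 0);
}
```

If step 1 does not cover remote CPUs on a given architecture, step 2 alone cannot make the sequence correct there.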