Re: [RFC PATCH v3] membarrier: provide core serialization

From: Will Deacon
Date: Fri Sep 01 2017 - 12:25:49 EST


On Fri, Sep 01, 2017 at 12:10:07PM -0400, Mathieu Desnoyers wrote:
> Add a new MEMBARRIER_FLAG_SYNC_CORE flag to the membarrier
> system call. It allows membarrier to issue core serializing barriers in
> addition to memory barriers on target threads whenever a membarrier
> command is performed.
>
> It is relevant for reclaim of JIT code, which requires issuing core
> serializing barriers on all threads running on behalf of a process
> after ensuring the old code is not visible anymore, before re-using
> memory for new code.
>
> The new MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED used with the
> MEMBARRIER_FLAG_SYNC_CORE flag registers the current process as
> requiring core serialization. It may block. It can be used to ensure
> MEMBARRIER_CMD_PRIVATE_EXPEDITED never blocks, even the first time it is
> invoked by a process with the MEMBARRIER_FLAG_SYNC_CORE flag.
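
For readers following along, the usage described above could look roughly
like the sketch below. It is only an illustration under stated assumptions:
the value given to MEMBARRIER_FLAG_SYNC_CORE is a placeholder rather than
something taken from the patch, and the two jit_* helpers are hypothetical
names. A kernel without this RFC applied should be expected to reject the
flag (most likely with EINVAL), so callers need to handle errors.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/membarrier.h>	/* MEMBARRIER_CMD_* (kernel headers >= 4.14) */
#include <sys/syscall.h>
#include <unistd.h>

/*
 * ASSUMPTION: placeholder value for the flag proposed in this RFC.
 * It is not taken from the patch and is not part of mainline UAPI headers.
 */
#ifndef MEMBARRIER_FLAG_SYNC_CORE
#define MEMBARRIER_FLAG_SYNC_CORE (1 << 0)
#endif

/* Raw syscall wrapper. Returns >= 0 on success, -errno on failure. */
long sys_membarrier(int cmd, int flags)
{
	long ret;

	assert(cmd >= 0);	/* all MEMBARRIER_CMD_* values are non-negative */
	ret = syscall(__NR_membarrier, cmd, flags);
	return ret < 0 ? -errno : ret;
}

/* Hypothetical helper: call once at JIT start-up. This call may block. */
long jit_register_sync_core(void)
{
	return sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED,
			      MEMBARRIER_FLAG_SYNC_CORE);
}

/*
 * Hypothetical helper: call after the old code is no longer reachable and
 * before the memory is reused for new code.
 */
long jit_sync_core_all_threads(void)
{
	return sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED,
			      MEMBARRIER_FLAG_SYNC_CORE);
}
```

A JIT would call jit_register_sync_core() once during initialisation and
jit_sync_core_all_threads() on each reclaim, treating a negative return as
"fall back to some other synchronisation scheme".
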
>
> * Scheduler Overhead Benchmarks
>
> Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
> Linux v4.13-rc6
>
> Inter-thread scheduling
> taskset 01 ./perf bench sched pipe -T
>
>                         Avg. usecs/op    Std.Dev. usecs/op
> Before this change:         2.55               0.10
> With this change:           2.49               0.08
> SYNC_CORE processes:        2.70               0.10
>
> Inter-process scheduling
> taskset 01 ./perf bench sched pipe
>
> Before this change:         2.93               0.13
> With this change:           2.93               0.13
> SYNC_CORE processes:        3.20               0.06
>
> Changes since v2:
> - Rename MEMBARRIER_CMD_REGISTER_SYNC_CORE to
> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED,

I'm still not convinced that this registration step is needed (at least
for arm, power and x86), but my previous comments were ignored.

> - Introduce the "MEMBARRIER_FLAG_SYNC_CORE" flag.
> - Introduce CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE, only implemented by
> x86 32/64 initially.
> - Introduce arch_membarrier_user_icache_flush, a no-op on x86 32/64,
> which can be implemented on architectures with incoherent data and
> instruction caches. It is associated with
> CONFIG_ARCH_HAS_MEMBARRIER_USER_ICACHE_FLUSH.

Given that MEMBARRIER_FLAG_SYNC_CORE is about flushing the internal CPU
pipeline (iiuc), could we rename arch_membarrier_user_icache_flush and
its config option so that they don't mention the I-cache, please? I-cache
flushing is a very different operation on most architectures I'm aware
of, and on arm64 it's even available to userspace (and broadcast in
hardware to other cores).
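
To illustrate the distinction, here is a minimal sketch of the userspace
side, assuming a GCC or Clang toolchain. jit_publish_code is a hypothetical
helper name. __builtin___clear_cache is the compiler builtin for this; on
arm64 it typically expands to cache maintenance instructions executed
directly from userspace, while on x86 it generates no instructions. Note that
it only covers cache maintenance and says nothing about serializing the
pipelines of other cores, which is the part the membarrier flag is for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy freshly generated instructions into buf and
 * perform whatever I-cache maintenance the target needs for that range.
 * Returns 0 on success, -1 if the code does not fit in the buffer.
 */
int jit_publish_code(void *buf, size_t cap, const void *code, size_t len)
{
	assert(buf != NULL && code != NULL);

	if (len > cap)
		return -1;

	memcpy(buf, code, len);

	/* GCC/Clang builtin: cache maintenance over [buf, buf + len). */
	__builtin___clear_cache((char *)buf, (char *)buf + len);
	return 0;
}
```
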

Will