Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()
From: Mathieu Desnoyers
Date: Mon Dec 28 2020 - 18:06:37 EST
----- On Dec 28, 2020, at 2:44 PM, Andy Lutomirski luto@xxxxxxxxxx wrote:
> On Mon, Dec 28, 2020 at 11:09 AM Russell King - ARM Linux admin
> <linux@xxxxxxxxxxxxxxx> wrote:
>>
>> On Mon, Dec 28, 2020 at 07:29:34PM +0100, Jann Horn wrote:
>> > After chatting with rmk about this (but without claiming that any of
>> > this is his opinion), based on the manpage, I think membarrier()
>> > currently doesn't really claim to be synchronizing caches? It just
>> > serializes cores. So arguably if userspace wants to use membarrier()
>> > to synchronize code changes, userspace should first do the code
>> > change, then flush icache as appropriate for the architecture, and
>> > then do the membarrier() to ensure that the old code is unused?
^ exactly, yes.
>> >
>> > For 32-bit arm, rmk pointed out that that would be the cacheflush()
>> > syscall. That might cause you to end up with two IPIs instead of one
>> > in total, but we probably don't care _that_ much about extra IPIs on
>> > 32-bit arm?
This was the original thinking, yes. The cacheflush IPI will flush specific
regions of code, and the membarrier IPI issues context synchronizing
instructions.
Architectures with coherent i/d caches don't need the cacheflush step.
>> >
>> > For arm64, I believe userspace can flush icache across the entire
>> > system with some instructions from userspace - "DC CVAU" followed by
>> > "DSB ISH", or something like that, I think? (See e.g.
>> > compat_arm_syscall(), the arm64 compat code that implements the 32-bit
>> > arm cacheflush() syscall.)
>>
>> Note that the ARM cacheflush syscall calls flush_icache_user_range()
>> over the range of addresses that userspace has passed - it's intention
>> since day one is to support cases where userspace wants to change
>> executable code.
>>
>> It will issue the appropriate write-backs to the data cache (DCCMVAU),
>> the invalidates to the instruction cache (ICIMVAU), invalidate the
>> branch target buffer (BPIALLIS or BPIALL as appropriate), and issue
>> the appropriate barriers (DSB ISHST, ISB).
>>
>> Note that neither flush_icache_user_range() nor flush_icache_range()
>> result in IPIs; cache operations are broadcast across all CPUs (which
>> is one of the minimums we require for SMP systems.)
But the ISB AFAIU is not a broadcast. Based on
https://developer.arm.com/documentation/ddi0406/c/Appendices/Barrier-Litmus-Tests/Cache-and-TLB-maintenance-operations-and-barriers/Instruction-cache-maintenance-operations
"Ensuring the visibility of updates to instructions for a multiprocessor"
the ISB needs to be issued on each processor in a multiprocessor system.
This is where membarrier sync-core comes handy.
>>
>> Now, that all said, I think the question that has to be asked is...
>>
>> What is the basic purpose of membarrier?
For basic membarrier (not sync-core), the purpose is to provide memory
barriers across a given set of threads.
For sync-core membarrier, the purpose is to guarantee context synchronizing
instructions on all threads belonging to the same mm.
>>
>> Is the purpose of it to provide memory barriers, or is it to provide
>> memory coherence?
The purpose has never been to provide any kind of memory coherence, as
this would duplicate what is already achieved by other architecture-specific
system calls.
>>
>> If it's the former and not the latter, then cache flushes are out of
>> scope, and expecting memory written to be visible to the instruction
>> stream is totally out of scope of the membarrier interface, whether
>> or not the writes happen on the same or a different CPU to the one
>> executing the rewritten code.
>>
>> The documentation in the kernel does not seem to describe what it's
>> supposed to be doing - the only thing I could find is this:
>> Documentation/features/sched/membarrier-sync-core/arch-support.txt
>> which describes it as "arch supports core serializing membarrier"
>> whatever that means.
It's described in include/uapi/linux/membarrier.h.
>>
>> Seems to be the standard and usual case of utterly poor to non-existent
>> documentation within the kernel tree, or even a pointer to where any
>> useful documentation can be found.
>>
>> Reading the membarrier(2) man page, I find nothing in there that talks
>> about any kind of cache coherency for self-modifying code - it only
>> seems to be about _barriers_ and nothing more, and barriers alone do
>> precisely nothing to save you from non-coherent Harvard caches.
There is no mention of cache coherency because membarrier does not
do anything wrt cache coherency. We should perhaps explicitly point
it out though.
>>
>> So, either Andy has a misunderstanding, or the man page is wrong, or
>> my rudimentary understanding of what membarrier is supposed to be
>> doing is wrong...
>
> Look at the latest man page:
>
> https://man7.org/linux/man-pages/man2/membarrier.2.html
>
> for MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE. The result may not be
> all that enlightening.
As described in the membarrier(2) man page and in include/uapi/linux/membarrier.h,
membarrier MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE guarantees core serializing
instructions in addition to the memory ordering guaranteed by MEMBARRIER_CMD_PRIVATE_EXPEDITED.
It does not guarantee anything wrt i/d cache coherency. I initially considered
adding such guarantees when we introduced membarrier sync-core, but decided
against it because it would basically replicate what architectures already
expose to user-space, e.g. flush_icache_user_range on arm32.
So between code modification and allowing other threads to jump to that code,
it should be expected that architectures without coherent i/d cache will need
to flush caches to ensure coherency *and* to issue membarrier to make sure
core serializing instructions will be issued by every thread acting on the
same mm either immediately by means of the IPI, or before they return to
user-space if they do not happen to be currently running when the membarrier
system call is invoked.
Hoping this clarifies things. I suspect we will need to clarify documentation
about what membarrier *does not* guarantee, given that you mistakenly expected
membarrier to take care of cache flushing.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com