Re: [PATCH 7/8] membarrier: Remove arm (32) support for SYNC_CORE

From: Peter Zijlstra
Date: Thu Jun 17 2021 - 11:14:07 EST


On Thu, Jun 17, 2021 at 05:01:53PM +0200, Peter Zijlstra wrote:
> On Thu, Jun 17, 2021 at 07:00:26AM -0700, Andy Lutomirski wrote:
> > On Thu, Jun 17, 2021, at 6:51 AM, Mark Rutland wrote:
>
> > > It's not clear to me what "the right thing" would mean specifically, and
> > > on architectures with userspace cache maintenance JITs can usually do
> > > the most optimal maintenance, and only need help for the context
> > > synchronization.
> > >
> >
> > This I simply don't believe -- I doubt that any sane architecture
> > really works like this. I wrote an email about it to Intel that
> > apparently generated internal discussion but no results. Consider:
> >
> > mmap(some shared library, some previously unmapped address);
> >
> > this does no heavyweight synchronization, at least on x86. There is
> > no "serializing" instruction in the fast path, and it *works* despite
> > anything the SDM may or may not say.
>
> I'm confused; why do you think that is relevant?
>
> The only way to get into a memory address space is CR3 write, which is
> serializing and will flush everything. Since there wasn't anything
> mapped, nothing could be 'cached' from that location.
>
> So that has to work...

Ooh, you mean mmap where there was something mmap'ed before. Not virgin
space so to say.

But in that case, the munmap() would've caused a TLB invalidate, which on
x86 is done with IPIs, and those return through IRET (which is
serializing).

Other architectures include I/D cache flushes in their TLB
invalidations -- but as noted elsewhere in the thread, that might not be
sufficient on its own.

But yes, I think TLBI has to imply flushing micro-arch instruction
related buffers for any of that to work.
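
For illustration only, here is a minimal userspace sketch of the "not
virgin space" case: map a page, touch it, munmap() it, then mmap() the
same address again. The function name remap_same_address() is made up
for this example, and it only exercises the data side (it reads bytes
rather than executing them), so it shows the address reuse pattern and
not the instruction-fetch behaviour being argued about. It assumes
Linux and a single-threaded caller, so using MAP_FIXED on the range
that was just unmapped is safe.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: reuse a virtual address across munmap()/mmap().
 * Returns 0 if the second mapping reads back as fresh zero pages,
 * 1 if old contents are visible, -1 on setup failure.
 */
int remap_same_address(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	unsigned char *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (a == MAP_FAILED)
		return -1;

	/* Touch the page so a translation for it gets used. */
	memset(a, 0xAA, (size_t)page);

	/* The kernel has to invalidate the old translation here. */
	if (munmap(a, (size_t)page) != 0)
		return -1;

	/* Map something new at the very same address. */
	unsigned char *b = mmap(a, (size_t)page, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (b == MAP_FAILED || b != a)
		return -1;

	/* A fresh anonymous mapping must read as zero, not 0xAA. */
	int stale = (b[0] != 0) || (b[page - 1] != 0);

	munmap(b, (size_t)page);
	return stale ? 1 : 0;
}
```

Nothing in userspace does any explicit synchronization between the two
mappings; whatever invalidation is needed has to come from the
munmap()/mmap() path in the kernel.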