Re: linux-next: manual merge of the akpm-current tree with the tip tree

From: Nadav Amit
Date: Sun Aug 13 2017 - 02:07:18 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

>
> Ok, so I have the below to still go on-top.
>
> Ideally someone would clarify the situation around
> mm_tlb_flush_nested(), because ideally we'd remove the
> smp_mb__after_atomic() and go back to relying on PTL alone.
>
> This also removes the pointless smp_mb__before_atomic()
>
> ---
> Subject: mm: Fix barriers for the tlb_flush_pending thing
> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Fri Aug 11 12:43:33 CEST 2017
>
> I'm not 100% sure we always care about the same PTL and when we have
> SPLIT_PTE_PTLOCKS and have RCpc locks (PPC) the UNLOCK of one does not
> in fact order against the LOCK of another lock. Therefore the
> documented scheme does not work if we care about multiple PTLs
>
> mm_tlb_flush_pending() appears to only care about a single PTL:
>
> - arch pte_accessible() (x86, arm64) only cares about that one PTE.
> - do_huge_pmd_numa_page() also only cares about a single (huge) page.
> - ksm write_protect_page() also only cares about a single page.
>
> however mm_tlb_flush_nested() is a mystery, it appears to care about
> anything inside the range. For now rely on it doing at least _a_ PTL
> lock instead of taking _the_ PTL lock.

It does not care about âanythingâ inside the range, but only on situations
in which there is at least one (same) PT that was modified by one core and
then read by the other. So, yes, it will always be _the_ same PTL, and not
_a_ PTL - in the cases that flush is really needed.

The issue that might require additional barriers is that
inc_tlb_flush_pending() and mm_tlb_flush_nested() are called when the PTL is
not held. IIUC, since the release-acquire might not behave as a full memory
barrier, this requires an explicit memory barrier.

> Therefore add an explicit smp_mb__after_atomic() to cure things.
>
> Also remove the smp_mb__before_atomic() on the dec side, as its
> completely pointless. We must rely on flush_tlb_range() to DTRT.

Good. It seemed fishy to me, but I was focused on the TLB consistency and
less on the barriers (thatâs my excuse).

Nadav