Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

From: Peter Zijlstra
Date: Thu Oct 27 2022 - 15:35:43 EST



Just two quick remarks; it's far to late to really think :-)

On Thu, Oct 27, 2022 at 11:13:55AM -0700, Linus Torvalds wrote:

> But "fullmm" is probably even stronger than "mmap write-lock" in that
> it should also mean "no other CPU can be actively using this" - either
> for hardware page table walking, or for GUP.

IIRC fullmm is really: this is the last user and we're taking the whole
mm down -- IOW exit().

> For example, MADV_DONTNEED does this all with just the mmap lock held
> for reading, so we *unless* we have that 'force_flush', we can
>
> (a) have another CPU continue to use the old stale TLB entry for quite a while
>
> (b) yet another CPU (that didn't have a TLB entry, or wanted to write
> to a read-only one ) could take a page fault, and install a *new* PTE
> entry in the same slot, all at the same time.
>
> Now, that's clearly *very* confusing. But being confusing may not mean
> "wrong" - we're still delaying the free of the old entry, so there's
> no use-after-free.

Do we worry about CPU errata where things go side-ways if multiple CPUs
have inconsistent TLB state?