Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

From: Nadav Amit
Date: Sat Oct 29 2022 - 17:03:56 EST


On Oct 29, 2022, at 1:56 PM, Nadav Amit <nadav.amit@xxxxxxxxx> wrote:

> On Oct 29, 2022, at 1:15 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>> (b) we could move the "page_remove_rmap()" into the "flush-and-free" path too
>>
>> And (b) would be fairly easy - same model as that dirty bit patch,
>> just a 'do page_remove_rmap too' - except page_remove_rmap() wants the
>> vma as well (and we delay the TLB flush over multiple vma's, so it's
>> not just a "save vma in mmu_gather").
>
> (b) sounds reasonable and may potentially allow future performance
> improvements (batching, doing stuff without locks).
>
> It does appear to break a potential hidden assumption that rmap is removed
> while the ptl is acquired (at least in the several instances I sampled).
> Yet, anyhow page_vma_mapped_walk() checks the PTE before calling the
> function, so it should be fine.
>
> I’ll give it a try.

I have just seen John’s and your emails. It seems (b) has been dropped. (a)
is outside my area of expertise, and in any case, assuming it is not solved
soon, deferring page_remove_rmap() might cause regressions.

(c) might be more intrusive and potentially induce overheads. If we need a
small backportable solution, I think the approach that I proposed (marking
the page dirty after the invalidation, before the PTL is released) is the
simplest one.
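
To make the intended ordering concrete, here is a rough user-space toy model
of it. This is not kernel code: every toy_* name is hypothetical, the lock is
a plain flag, and the "flush" only records when it happened. It shows just
the sequence I have in mind: clear the PTE, invalidate, mark the page dirty,
and only then drop the PTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these are real kernel structures. */
struct toy_page { bool dirty; };
struct toy_pte  { bool present; bool dirty; struct toy_page *page; };
struct toy_ptl  { bool held; };

/* Event counter used only to record the relative order of the steps. */
static int step;
static int flushed_at, dirtied_at, unlocked_at;

static void toy_lock(struct toy_ptl *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_unlock(struct toy_ptl *l)
{
	assert(l->held);
	l->held = false;
	unlocked_at = ++step;
}

/* Stand-in for the TLB invalidation; it only records when it ran. */
static void toy_flush_tlb(void)
{
	flushed_at = ++step;
}

/* Zap one PTE: invalidate first, then dirty the page, then unlock. */
static void toy_zap_one(struct toy_ptl *ptl, struct toy_pte *pte)
{
	struct toy_page *page;
	bool was_dirty;

	toy_lock(ptl);

	/* Snapshot and clear the entry while the lock is held. */
	was_dirty = pte->dirty;
	page = pte->page;
	pte->present = false;
	pte->dirty = false;
	pte->page = NULL;

	/* Invalidate before the dirty state is transferred to the page. */
	toy_flush_tlb();

	if (was_dirty) {
		assert(ptl->held);	/* still under the PTL at this point */
		page->dirty = true;
		dirtied_at = ++step;
	}

	toy_unlock(ptl);
}
```

Calling toy_zap_one() on a present, dirty entry leaves the page marked dirty
with flushed_at < dirtied_at < unlocked_at, which is the only property the
model is meant to demonstrate.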

Please advise how to proceed.