Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment

From: Peter Zijlstra
Date: Thu Oct 27 2022 - 03:09:00 EST


On Wed, Oct 26, 2022 at 06:45:16PM +0200, Jann Horn wrote:

> > #endif /* _LINUX_MM_H */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index f88c351aecd4..9bb63b3fbee1 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1440,6 +1440,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
> > tlb_remove_tlb_entry(tlb, pte, addr);
> > zap_install_uffd_wp_if_needed(vma, addr, pte, details,
> > ptent);
> > +
> > + if (!force_flush && !tlb->fullmm && details &&
> > + details->zap_flags & ZAP_FLAG_FORCE_FLUSH)
> > + force_flush = 1;
> > +
>
> Hmm... I guess that might work, assuming that there is no other
> codepath we might race with that first turns the present PTE into a
> non-present PTE but keeps the flush queued for later. At least
> codepaths that use the tlb_batched infrastructure are unproblematic...

So I thought the general rule was that if you modify a PTE and have not
unmapped things -- IOW, there's actual concurrency possible on the
thing, then the TLB invalidate needs to happen under pte_lock, since
that is what controls concurrency at the pte level.
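
To make that rule concrete, here is a toy user-space model of it. This
is not kernel code: struct toy_pte, the toy_* helpers and the pthread
mutex standing in for the pte lock are all made up for illustration,
and the "TLB" is just a cached copy of the entry. It only sketches why
a flush deferred past the unlock leaves a window that the next lock
holder can observe.

```c
/* Toy user-space model; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct toy_pte {
	pthread_mutex_t lock;	/* stands in for the pte lock */
	unsigned long val;	/* the page table entry itself */
	unsigned long tlb;	/* cached translation, stands in for a TLB entry */
	bool tlb_valid;		/* true while the cached copy may still be used */
};

/* Load the current entry into the cached copy, as a hardware walk would. */
static void toy_tlb_fill(struct toy_pte *p)
{
	pthread_mutex_lock(&p->lock);
	p->tlb = p->val;
	p->tlb_valid = true;
	pthread_mutex_unlock(&p->lock);
}

/* Follows the rule: clear the entry and invalidate before dropping the lock. */
static void toy_zap_flush_locked(struct toy_pte *p)
{
	pthread_mutex_lock(&p->lock);
	p->val = 0;
	p->tlb_valid = false;
	pthread_mutex_unlock(&p->lock);
}

/* Breaks the rule: clear the entry, drop the lock, leave the flush for later. */
static void toy_zap_clear_only(struct toy_pte *p)
{
	pthread_mutex_lock(&p->lock);
	p->val = 0;
	pthread_mutex_unlock(&p->lock);
}

/* The deferred invalidate, issued some time after the lock was dropped. */
static void toy_flush_later(struct toy_pte *p)
{
	p->tlb_valid = false;
}

/* What the next lock holder sees: is a stale cached translation still live? */
static bool toy_stale_under_lock(struct toy_pte *p)
{
	bool stale;

	pthread_mutex_lock(&p->lock);
	stale = p->tlb_valid && p->tlb != p->val;
	pthread_mutex_unlock(&p->lock);
	return stale;
}
```

With toy_zap_flush_locked() nobody who takes the lock afterwards can
see a stale cached entry. With toy_zap_clear_only() the entry reads as
cleared while the cached copy stays usable until toy_flush_later()
runs.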

As it stands MADV_DONTNEED seems to blatantly violate that general rule.

Then again; I could've missed something and the rules changed?