Re: [PATCH v10 00/12] LUF(Lazy Unmap Flush) reducing tlb numbers over 90%

From: Byungchul Park
Date: Wed May 29 2024 - 01:01:06 EST


On Tue, May 28, 2024 at 08:14:43AM -0700, Dave Hansen wrote:
> On 5/26/24 20:10, Huang, Ying wrote:
> >> Thank you for the pointing out. I will fix it too by introducing a new
> >> flag in inode or something to make LUF aware if updating the file has
> >> been tried so that LUF can give up and flush right away in the case.
> >>
> >> Plus, I will add another give-up at code changing the permission of vma
> >> to writable.
> > I guess that you need a framework similar as
> > "flush_tlb_batched_pending()" to deal with interaction with other TLB
> > related operations.
>
> Where "other TLB related operations" includes both things that
> traditionally invalidate TLBs (like going Present 1=>0) and things like
> fault-in that go Present 0=>1 that can result in TLB population.
>
> It's actually a really crummy problem to solve. We don't have _any_
> machinery to say, "Hey, you know that PTE you wanted to install? There
> was something there before you and we haven't flushed it yet. Can you
> be a doll and do a flush before _populating_ that PTE?"

All the code that updates PTEs already performs the needed TLB flush in
a safe way when a flush is inevitable, e.g. munmap. LUF, which controls
when to flush at a higher level than arch code, just leaves stale
read-only TLB entries that are currently supposed to be in use. Could
you give a scenario that you are concerned about?

Byungchul

> To solve it generically, I suspect you'll need some kind of special
> non-present PTE to say:
>
> There _was_ a PTE here that wasn't flushed.
>
> Sure, you can add gunk to the VMA to track when this happens. But
> that'll penalize anyone populating a PTE anywhere in the VMA at least
> once. If there were other threads faulting in pages to the same VMA,
> they'll just end up doing the flush that LUF tried to avoid in the first
> place.
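
To make the "special non-present PTE" idea quoted above concrete, here is a
rough toy model in user-space C. It is only a sketch under assumptions, not
kernel code: every name in it (toy_pte_t, TOY_PTE_FLUSH_PENDING,
toy_unmap_lazy(), toy_populate(), toy_flush_tlb()) is hypothetical and does
not come from the patch set or from any existing kernel API. It assumes one
spare bit in a non-present entry can record "there was a PTE here that
wasn't flushed", and that populate checks that bit before installing.

```c
#include <assert.h>
#include <stdint.h>

/* Toy user-space model only; all names here are hypothetical. */
typedef uint64_t toy_pte_t;

#define TOY_PTE_PRESENT       (1ULL << 0)
/* Assumed marker bit, meaningful only while the entry is not present. */
#define TOY_PTE_FLUSH_PENDING (1ULL << 1)
#define TOY_PFN_SHIFT         12

/* Counts flushes so the effect of the marker can be observed. */
static unsigned int toy_flush_count;

/* Stand-in for a real TLB flush. */
static void toy_flush_tlb(void)
{
	toy_flush_count++;
}

/* Present 1=>0 without flushing: leave a marker behind instead. */
static void toy_unmap_lazy(toy_pte_t *pte)
{
	if (*pte & TOY_PTE_PRESENT)
		*pte = TOY_PTE_FLUSH_PENDING;
}

/* Present 0=>1: flush first if an unflushed mapping used to live here. */
static void toy_populate(toy_pte_t *pte, uint64_t pfn)
{
	assert(!(*pte & TOY_PTE_PRESENT));

	if (*pte & TOY_PTE_FLUSH_PENDING)
		toy_flush_tlb();

	*pte = (pfn << TOY_PFN_SHIFT) | TOY_PTE_PRESENT;
}
```

In this model only a populate that lands on a marked slot pays for a flush,
while populating a never-mapped slot does not. That per-PTE granularity is
the contrast with the per-VMA tracking described in the quoted text.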