Re: [PATCH v4 2/3] mm/khugepaged: Fix GUP-fast interaction by sending IPI
From: Jann Horn
Date: Mon Nov 28 2022 - 14:57:44 EST
On Mon, Nov 28, 2022 at 8:54 PM Yang Shi <shy828301@xxxxxxxxx> wrote:
>
> On Mon, Nov 28, 2022 at 10:03 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP
> > collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
> > ensure that the page table was not removed by khugepaged in between.
> >
> > However, lockless_pages_from_mm() still requires that the page table is not
> > concurrently freed or reused to store non-PTE data. Otherwise, problems
> > can occur because:
> >
> > - deposited page tables can be freed when a THP page somewhere in the
> >   mm is removed
> > - some architectures store non-PTE information inside deposited page
> >   tables (see radix__pgtable_trans_huge_deposit())
> >
> > Additionally, lockless_pages_from_mm() is also somewhat brittle with
> > regards to page tables being repeatedly moved back and forth, but
> > that shouldn't be an issue in practice.
> >
> > Fix it by sending IPIs (if the architecture uses
> > semi-RCU-style page table freeing) before freeing/reusing page tables.
> >
> > As noted in mm/gup.c, on configs that define CONFIG_HAVE_FAST_GUP,
> > there are two possible cases:
> >
> > 1. CONFIG_MMU_GATHER_RCU_TABLE_FREE is set, causing
> >    tlb_remove_table_sync_one() to send an IPI to synchronize with
> >    lockless_pages_from_mm().
> > 2. CONFIG_MMU_GATHER_RCU_TABLE_FREE is unset, indicating that all
> >    TLB flushes are already guaranteed to send IPIs.
> >    tlb_remove_table_sync_one() will do nothing, but we've already
> >    run pmdp_collapse_flush(), which did a TLB flush, which must have
> >    involved IPIs.
>
> I'm trying to catch up with the discussion after the holiday break. I
> understand you switched from always allocating a new page table page
> (we decided before) to sending IPIs to serialize against fast-GUP,
> this is fine to me.
>
> So the code now looks like:
>     pmdp_collapse_flush()
>     sending IPI
>
> But the missing part is how we reached "TLB flushes are already
> guaranteed to send IPIs" when CONFIG_MMU_GATHER_RCU_TABLE_FREE is
> unset? ARM64 doesn't do it IIRC. Or did I miss something?

From arch/arm64/Kconfig:

        select MMU_GATHER_RCU_TABLE_FREE

CONFIG_MMU_GATHER_RCU_TABLE_FREE is not a config option that the user
can freely toggle; it is an option selected by the architecture.
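
To make the two cases concrete, here is a rough userspace model (not the
kernel source). MODEL_RCU_TABLE_FREE is a hypothetical stand-in for
CONFIG_MMU_GATHER_RCU_TABLE_FREE, and every model_* name is invented for
illustration. The model assumes that an architecture selecting the option
flushes TLBs without IPIs (arm64 uses broadcast TLB invalidation), and that an
architecture not selecting it always flushes with IPIs:

```c
/* Hypothetical stand-in for CONFIG_MMU_GATHER_RCU_TABLE_FREE. arm64 selects
 * the real option in its Kconfig, so the model enables it by default. */
#ifndef MODEL_RCU_TABLE_FREE
#define MODEL_RCU_TABLE_FREE 1
#endif

/* Counts simulated IPI rounds; purely illustrative. */
static int model_ipis_sent;

/* Models the TLB flush done by pmdp_collapse_flush(). */
static void model_pmdp_collapse_flush(void)
{
#if !MODEL_RCU_TABLE_FREE
	/* Case 2: option unset, so the flush itself is assumed to use IPIs. */
	model_ipis_sent++;
#endif
	/* Case 1: modeled as a broadcast flush that sends no IPI. */
}

/* Models tlb_remove_table_sync_one(). */
static void model_tlb_remove_table_sync_one(void)
{
#if MODEL_RCU_TABLE_FREE
	/* Case 1: an explicit IPI synchronizes with lockless walkers. */
	model_ipis_sent++;
#endif
	/* Case 2: nothing to do, the flush already sent IPIs. */
}

/* Runs the modeled collapse sequence and returns the number of IPI rounds. */
static int model_collapse(void)
{
	model_ipis_sent = 0;
	model_pmdp_collapse_flush();
	model_tlb_remove_table_sync_one();
	return model_ipis_sent;
}
```

In this model, model_collapse() returns 1 whether it is built with
-DMODEL_RCU_TABLE_FREE=1 or -DMODEL_RCU_TABLE_FREE=0: exactly one of the two
steps is responsible for the IPI in each configuration.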