Re: [PATCH] mm,do_huge_pmd_numa_page: remove unnecessary TLB flushing code

From: Yang Shi
Date: Tue Jul 20 2021 - 17:00:00 EST


On Tue, Jul 20, 2021 at 7:25 AM Christian Borntraeger
<borntraeger@xxxxxxxxxx> wrote:
>
>
>
> On 20.07.21 08:55, Huang Ying wrote:
> > Before commit c5b5a3dd2c1f ("mm: thp: refactor NUMA fault
> > handling"), the TLB flushing was done in do_huge_pmd_numa_page() itself
> > via flush_tlb_range().
> >
> > Since commit c5b5a3dd2c1f ("mm: thp: refactor NUMA fault
> > handling"), the TLB flushing is done in migrate_pages() anyway, via the
> > following code path:
> >
> > do_huge_pmd_numa_page
> > migrate_misplaced_page
> > migrate_pages
> >
> > So the TLB flushing code in do_huge_pmd_numa_page() is now
> > unnecessary, and this patch deletes it to simplify the code. This is
> > only a code cleanup; there is no visible performance difference.
> >
> > Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> > Cc: Yang Shi <shy828301@xxxxxxxxx>
> > Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxx>
> > Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> > Cc: Gerald Schaefer <gerald.schaefer@xxxxxxxxxxxxx>
> > Cc: Heiko Carstens <hca@xxxxxxxxxxxxx>
> > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> > Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: Vasily Gorbik <gor@xxxxxxxxxxxxx>
> > Cc: Zi Yan <ziy@xxxxxxxxxx>
> > ---
> > mm/huge_memory.c | 26 --------------------------
> > 1 file changed, 26 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index afff3ac87067..9f21e44c9030 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1440,32 +1440,6 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
> > goto out;
> > }
> >
> > - /*
> > - * Since we took the NUMA fault, we must have observed the !accessible
> > - * bit. Make sure all other CPUs agree with that, to avoid them
> > - * modifying the page we're about to migrate.
> > - *
> > - * Must be done under PTL such that we'll observe the relevant
> > - * inc_tlb_flush_pending().
> > - *
> > - * We are not sure a pending tlb flush here is for a huge page
> > - * mapping or not. Hence use the tlb range variant
> > - */
> > - if (mm_tlb_flush_pending(vma->vm_mm)) {
> > - flush_tlb_range(vma, haddr, haddr + HPAGE_PMD_SIZE);
> > - /*
> > - * change_huge_pmd() released the pmd lock before
> > - * invalidating the secondary MMUs sharing the primary
> > - * MMU pagetables (with ->invalidate_range()). The
> > - * mmu_notifier_invalidate_range_end() (which
> > - * internally calls ->invalidate_range()) in
> > - * change_pmd_range() will run after us, so we can't
> > - * rely on it here and we need an explicit invalidate.
> > - */
> > - mmu_notifier_invalidate_range(vma->vm_mm, haddr,
> > - haddr + HPAGE_PMD_SIZE);
> > - }
> CC Paolo/KVM list. So we also remove the mmu notifier here. Do we need those
> now in migrate_pages? I am not an expert in that code, but I can't find
> an equivalent mmu_notifier in migrate_misplaced_page.
> I might be totally wrong, just something that I noticed.

Do you mean the missing mmu notifier invalidation for the THP migration
case? Yes, I noticed that too, but I'm not sure whether it is intentional
or an oversight.

Zi Yan is the author of the THP migration code; he may have some insight.

>
> > pmd = pmd_modify(oldpmd, vma->vm_page_prot);
> > page = vm_normal_page_pmd(vma, haddr, pmd);
> > if (!page)
> >