Sean Christopherson <seanjc@xxxxxxxxxx> writes:
Thanks, I think you are correct. Looking at commit 7066f0f933a1
("mm: thp: fix mmu_notifier in migrate_misplaced_transhuge_page()"),
the TLB flush and MMU notifier invalidate were needed because the old
NUMA fault implementation didn't change the PTE to a migration entry, so
writes from a GPU secondary MMU could cause data corruption.
The refactored code uses the generic migration code, which converts the
PTE to a migration entry before copying data to the new page.
That's my understanding as well, based on this blurb from commit 7066f0f933a1:
The standard PAGE_SIZEd migrate_misplaced_page is less accelerated and
uses the generic migrate_pages which transitions the pte from
numa/protnone to a migration entry in try_to_unmap_one() and flushes TLBs
and all mmu notifiers there before copying the page.
That analysis/justification for removing the invalidate_range() call should be
captured in the changelog. Confirmation from Andrea would be a nice bonus.
When we flush the CPU TLB for a page that may be shared with a device/VM
TLB, we call the MMU notifiers for the page to flush the device/VM TLB.
Right? So when we replaced the CPU TLB flushing in do_huge_pmd_numa_page()
with that in try_to_migrate_one(), we replaced the MMU notifier calls
too. Do you agree?
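
To make the ordering argument concrete, here is a toy userspace model of it.
This is only a sketch under stated assumptions, not kernel code: every name
in it (toy_mapping, toy_unmap_for_migration(), toy_migrate(), and so on) is
hypothetical, and a single boolean stands in for the secondary MMU's cached
translation. It shows why invalidating the device/VM TLB together with the
CPU TLB, before the copy, keeps a racing device write from being lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only. None of these names exist in the kernel. */

enum toy_pte_state { TOY_PTE_PROTNONE, TOY_PTE_MIGRATION };

struct toy_mapping {
	enum toy_pte_state pte;	/* state of the primary (CPU) PTE */
	bool cpu_tlb_valid;	/* CPU TLB may still cache the old page */
	bool dev_tlb_valid;	/* secondary MMU may still cache the old page */
	char old_page[8];
	char new_page[8];
};

/*
 * A write issued through the secondary MMU. It reaches the old page only
 * while the device still holds a valid translation; otherwise it would
 * fault and wait for the migration to complete (modelled as "no write").
 */
static bool toy_device_write(struct toy_mapping *m, char c)
{
	if (!m->dev_tlb_valid)
		return false;
	m->old_page[0] = c;
	return true;
}

/*
 * Stand-in for the unmap step of generic migration: install a migration
 * entry, flush the CPU TLB, and (assumption of this model) invalidate the
 * secondary MMU in the same step.
 */
static void toy_unmap_for_migration(struct toy_mapping *m)
{
	m->pte = TOY_PTE_MIGRATION;
	m->cpu_tlb_valid = false;
	m->dev_tlb_valid = false;
}

static void toy_copy_page(struct toy_mapping *m)
{
	memcpy(m->new_page, m->old_page, sizeof(m->new_page));
}

/*
 * Run one migration with a device write racing in right after the copy.
 * Returns true if the new page still matches the old page, i.e. no write
 * was lost.
 */
static bool toy_migrate(bool unmap_first)
{
	struct toy_mapping m = {
		.pte = TOY_PTE_PROTNONE,
		.cpu_tlb_valid = true,
		.dev_tlb_valid = true,
		.old_page = "AAAAAAA",
	};

	if (unmap_first)
		toy_unmap_for_migration(&m);
	toy_copy_page(&m);
	toy_device_write(&m, 'B');	/* racing write from the device */

	return memcmp(m.old_page, m.new_page, sizeof(m.old_page)) == 0;
}
```

In this model toy_migrate(true) keeps the two pages identical because the
device write is blocked before the copy, while toy_migrate(false) lets the
write land in the old page after the copy, so the new page misses it.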