Re: [PATCH v9 0/5] Migrate on fault for device pages

From: Alistair Popple

Date: Tue May 05 2026 - 03:20:21 EST


Thanks for doing this work, Mika. I've been meaning to take a look at this series
for a while. I'm currently at LSFMM but will try to take a look this week or
next, as it sounds quite useful.

- Alistair

On 2026-05-05 at 15:16 +1000, mpenttil@xxxxxxxxxx wrote...
> From: Mika Penttilä <mpenttil@xxxxxxxxxx>
>
> Currently, device page faulting and migration do not work
> optimally if you want to do both fault handling and
> migration at once.
>
> Being able to migrate non-present pages (or pages mapped with incorrect
> permissions, e.g. COW) to the GPU requires doing either of the
> following sequences:
>
> 1. hmm_range_fault() - fault in non-present pages with correct permissions, etc.
> 2. migrate_vma_*() - migrate the pages
>
> Or:
>
> 1. migrate_vma_*() - migrate present pages
> 2. If non-present pages detected by migrate_vma_*():
>    a) call hmm_range_fault() to fault pages in
>    b) call migrate_vma_*() again to migrate now present pages
>
> The problem with the first sequence is that you always have to do two
> page walks, even though most of the time the pages are present or zero page
> mappings, so the common case takes a performance hit.
>
> The second sequence is better for the common case, but far worse if
> pages aren't present because now you have to walk the page tables three
> times (once to find the page is not present, once so hmm_range_fault()
> can find a non-present page to fault in and once again to set up the
> migration). It is also tricky to code correctly. One page table walk
> can cost over 1000 CPU cycles on x86-64, which is a significant hit.
>
> We should be able to walk the page table once, faulting
> pages in as required and replacing them with migration entries if
> requested.
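
To make the walk counts above concrete, here is a rough userspace model of
the three strategies. It is only a sketch: the enum names, walks_needed() and
walk_cost_cycles() are made up for illustration and are not part of the
series or of any kernel API, and the per-walk cost is whatever the caller
passes in (e.g. the ~1000 cycle estimate quoted above).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three approaches discussed above. */
enum strategy {
	FAULT_THEN_MIGRATE,	/* hmm_range_fault(), then migrate_vma_*() */
	MIGRATE_THEN_FAULT,	/* migrate_vma_*(), fault + retry if needed */
	SINGLE_WALK,		/* fault and set up migration in one walk */
};

/* Number of page table walks one request needs under each strategy. */
static unsigned int walks_needed(enum strategy s, bool all_present)
{
	switch (s) {
	case FAULT_THEN_MIGRATE:
		/* Always two walks, even if nothing needed faulting. */
		return 2;
	case MIGRATE_THEN_FAULT:
		/* One walk in the common case, three if a page was missing. */
		return all_present ? 1 : 3;
	case SINGLE_WALK:
		/* Faulting and migration setup share a single walk. */
		return 1;
	}
	return 0;
}

/* Rough cycle cost, given an assumed cost per page table walk. */
static unsigned long walk_cost_cycles(enum strategy s, bool all_present,
				      unsigned long cycles_per_walk)
{
	assert(cycles_per_walk > 0);
	return (unsigned long)walks_needed(s, all_present) * cycles_per_walk;
}
```

With an assumed 1000 cycles per walk, the second sequence costs about 3000
cycles in walks alone whenever a page is not present, while a single-walk
approach stays at about 1000 in both cases. The model ignores the cost of
the fault handling itself, which is the same work in every variant.
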
>
> Add a new flag to the HMM APIs, HMM_PFN_REQ_MIGRATE,
> which tells HMM to also prepare for migration during fault handling.
> Also, for the migrate_vma_setup() call paths, a new flag, MIGRATE_VMA_FAULT,
> is added to tell the migrate code to also do fault handling.
>
> One extra benefit of migrating via the hmm_range_fault() path
> is that migrate_vma.vma gets populated, so there is no need to
> retrieve it separately.
>
> Tested in an x86-64 VM with the HMM test device, passing the selftests.
> For performance, the migrate throughput tests from the selftests
> show similar numbers (within error margin) to an unmodified kernel.
> Also tested rebased on the
> "Remove device private pages from physical address space" series:
> https://lore.kernel.org/linux-mm/20260130111050.53670-1-jniethe@xxxxxxxxxx/
> plus a small adjustment patch, with no problems.
>
> Changes v8-v9:
> - rebase on drm-tip
> - fixed use-after-free around migrate_vma_split_folio() usage
> - added missing pmd unlock
>
> Changes v7-v8:
> - rebase on 7.0
> - fixed subject in two patches
> - enhanced commit messages
> - squashed patch 6 into patch 4 to fix kernel test robot warning
> - re-added dropped Cc block from cover letter
> - fixed whitespace
>
> Changes v6-v7:
> - rebase on 7.0.0-rc6
> - added documentation and comments
> - denote a to-be-migrated zero page as HMM_PFN_MIGRATE alone
> - got rid of HMM_PFN_INOUT_FLAGS movement in patch 2
> - picked up Acked-by from David for patch 1
>
> Changes v5-v6:
> - rebase on 7.0.0-rc4
> - use range based TLB flushing while unmapping ptes
> - gate migration behind HMM_PFN_REQ_MIGRATE for fault and
>   migrate paths
> - always infer migration flags from migrate->flags only
>
> Changes v4-v5:
> - rebase on 6.19
> - fixed David's email address
> - fixed link issue without CONFIG_TRANSPARENT_HUGEPAGE
> - refactored into smaller commits
> - added more comments to code
>
> Changes v3-v4:
> - rebase on 6.19-rc8
> - fixed issues found by kernel test robot with random configs
> - fixed typos
>
> Changes v2-v3:
> - rebase on 6.19-rc7
> - fixed issues found by kernel test robot
> - fixed smatch issues reported by Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> - fixes to lock handling (pmd/pte) on errors
> - added assertions for pmd/pte lock states
> - other issues discovered by Matthew, thanks!
>
> Changes v1-v2:
> - rebase on 6.19-rc6
> - fixed issues found by kernel test robot
> - fixed locking (pmd/ptl) to cover handle_ and prepare_ regions
>   parts if migrating
> - other issues discovered by Matthew, thanks!
>
> Changes RFC-v1:
> - rebase on 6.19-rc5
> - adjust for the device THP
> - changes from feedback
>
> Revisions:
> - RFC: https://lore.kernel.org/linux-mm/20250814072045.3637192-1-mpenttil@xxxxxxxxxx/
> - v1: https://lore.kernel.org/all/20260114091923.3950465-1-mpenttil@xxxxxxxxxx/
> - v2: https://lore.kernel.org/all/20260119112502.645059-1-mpenttil@xxxxxxxxxx/
> - v3: https://lore.kernel.org/all/20260126111939.1332983-2-mpenttil@xxxxxxxxxx/
> - v4: https://lore.kernel.org/all/20260202112622.2104213-1-mpenttil@xxxxxxxxxx/
> - v5: https://lore.kernel.org/linux-mm/20260211081301.2940672-1-mpenttil@xxxxxxxxxx/
> - v6: https://lore.kernel.org/linux-mm/20260316062407.3354636-1-mpenttil@xxxxxxxxxx/
> - v7: https://lore.kernel.org/linux-mm/20260330115611.347988-1-mpenttil@xxxxxxxxxx/
> - v8: https://lore.kernel.org/linux-mm/20260414041226.1539439-1-mpenttil@xxxxxxxxxx/
>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Cc: Leon Romanovsky <leonro@xxxxxxxxxx>
> Cc: Alistair Popple <apopple@xxxxxxxxxx>
> Cc: Balbir Singh <balbirs@xxxxxxxxxx>
> Cc: Zi Yan <ziy@xxxxxxxxxx>
> Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> Cc: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
> Cc: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Mike Rapoport <rppt@xxxxxxxxxx>
> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Cc: Michal Hocko <mhocko@xxxxxxxx>
>
> Mika Penttilä (5):
>   mm/Kconfig: changes for migrate on fault for device pages
>   mm: Add helper to convert HMM pfn to migrate pfn
>   mm/hmm: do the plumbing for HMM to participate in migration
>   mm: setup device page migration in HMM pagewalk
>   lib/test_hmm: add a new testcase for the migrate on fault
>
>  include/linux/hmm.h                    |  19 +-
>  include/linux/migrate.h                |  26 +-
>  lib/test_hmm.c                         | 101 ++-
>  lib/test_hmm_uapi.h                    |  19 +-
>  mm/Kconfig                             |   2 +
>  mm/hmm.c                               | 835 +++++++++++++++++++++++--
>  mm/migrate_device.c                    | 583 +++--------------
>  tools/testing/selftests/mm/hmm-tests.c |  54 ++
>  8 files changed, 1066 insertions(+), 573 deletions(-)
>
> drm-tip
> base-commit: 94d56a898a2db27f841b17f6966a81ba502fe63c
> --
> 2.50.0
>