Re: [PATCH v2 2/9] mm/rmap: refactor hugetlb pte clearing in try_to_unmap_one

From: Dev Jain

Date: Sat Apr 11 2026 - 12:06:10 EST




On 11/04/26 2:25 pm, Barry Song wrote:
> On Fri, Apr 10, 2026 at 6:32 PM Dev Jain <dev.jain@xxxxxxx> wrote:
>>
>> Simplify the code by refactoring the folio_test_hugetlb() branch into
>> a new function.
>>
>> No functional change is intended.
>>
>> Signed-off-by: Dev Jain <dev.jain@xxxxxxx>
>> ---
>> mm/rmap.c | 116 +++++++++++++++++++++++++++++++-----------------------
>> 1 file changed, 67 insertions(+), 49 deletions(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 62a8c912fd788..a9c43e2f6e695 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1978,6 +1978,67 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
>> FPB_RESPECT_WRITE | FPB_RESPECT_SOFT_DIRTY);
>> }
>>
>> +static inline bool unmap_hugetlb_folio(struct vm_area_struct *vma,
>> + struct folio *folio, struct page_vma_mapped_walk *pvmw,
>> + struct page *page, enum ttu_flags flags, pte_t *pteval,
>> + struct mmu_notifier_range *range, bool *walk_done)
>> +{
>
> Can we add a comment before the function explaining what
> the return value means?

Yes, I can add that.
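Something along these lines, as a rough sketch (wording is mine and
depends on the final semantics of the return value and *walk_done):

```c
/*
 * unmap_hugetlb_folio - unmap a poisoned hugetlb folio at @pvmw
 *
 * Returns true if the caller should continue the rmap walk, or
 * false if unmapping this folio failed; in the latter case
 * *walk_done tells the caller whether to terminate the walk via
 * page_vma_mapped_walk_done().
 */
```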


>
>> + /*
>> + * The try_to_unmap() is only passed a hugetlb page
>> + * in the case where the hugetlb page is poisoned.
>> + */
>> + VM_WARN_ON_PAGE(!PageHWPoison(page), page);
>> + /*
>> + * huge_pmd_unshare may unmap an entire PMD page.
>> + * There is no way of knowing exactly which PMDs may
>> + * be cached for this mm, so we must flush them all.
>> + * start/end were already adjusted above to cover this
>> + * range.
>> + */
>> + flush_cache_range(vma, range->start, range->end);
>> +
>> + /*
>> + * To call huge_pmd_unshare, i_mmap_rwsem must be
>> + * held in write mode. Caller needs to explicitly
>> + * do this outside rmap routines.
>> + *
>> + * We also must hold hugetlb vma_lock in write mode.
>> + * Lock order dictates acquiring vma_lock BEFORE
>> + * i_mmap_rwsem. We can only try lock here and fail
>> + * if unsuccessful.
>> + */
>> + if (!folio_test_anon(folio)) {
>> + struct mmu_gather tlb;
>> +
>> + VM_WARN_ON(!(flags & TTU_RMAP_LOCKED));
>> + if (!hugetlb_vma_trylock_write(vma)) {
>> + *walk_done = true;
>> + return false;
>> + }
>> +
>
> Sometimes I feel walk_done is misleading, since walk_done with
> ret = false actually means walk_abort.

I'll rename this to exit_walk, so it doesn't collide with
the label names.

>
> So another option is to make this function return a tristate:
> WALK_DONE, WALK_ABORT, WALK_CONT. Then we could drop the
> walk_done argument entirely.

I thought a lot about how to refactor try_to_unmap_one() as
a whole, and couldn't come up with a good solution.

There are these conditions:

1. ret = false => page_vma_mapped_walk_done(), break
2. ret not decided, "continue"
3. ret = true
   a) exit the while loop naturally
   b) exit prematurely -> page_vma_mapped_walk_done(), break

I had considered the approach of refactoring with an enum covering
all of these conditions: we can refactor bits of code and return an
enum, but we would still retain ugliness like

	if (ret == WALK_DONE)
		goto walk_done;
	if (ret == WALK_ABORT)
		goto walk_abort;
	if (ret == WALK_CONTINUE)
		continue;

This seemed like forced refactoring to me, and IMHO it doesn't
reduce the complexity of the function at all.

I don't have a clever solution to get rid of all the label
jumping, so I refactored what I could.
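To make the above concrete, here is a userspace toy of the tristate
dispatch (the enum values and helper names are made up for
illustration, not actual kernel code):

```c
#include <stdbool.h>

/*
 * Hypothetical tristate return for a refactored walk step; the
 * names mirror the enum discussed above but are not kernel code.
 */
enum walk_status { WALK_CONT, WALK_DONE, WALK_ABORT };

/*
 * Toy stand-in for a refactored helper: a negative "pte" means a
 * lock failure (abort), zero means the walk is finished.
 */
static enum walk_status unmap_one_step(int pte)
{
	if (pte < 0)
		return WALK_ABORT;
	if (pte == 0)
		return WALK_DONE;
	return WALK_CONT;
}

/*
 * Caller-side dispatch: even with the enum, the loop body still
 * needs a per-value jump, which is the "ugliness" noted above.
 */
static bool walk_folio(const int *ptes, int n)
{
	bool ret = true;
	int i;

	for (i = 0; i < n; i++) {
		switch (unmap_one_step(ptes[i])) {
		case WALK_CONT:
			continue;
		case WALK_DONE:
			goto walk_done;
		case WALK_ABORT:
			ret = false;
			goto walk_done;
		}
	}
walk_done:
	return ret;
}
```

The enum centralizes the decision, but the caller still translates
each value back into a continue/goto, so the control flow is not
actually simpler than the labels we have today.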

>
> Thanks
> Barry