Re: [PATCH v3 4/4] mm: swap: Swap-out small-sized THP without splitting

From: Ryan Roberts
Date: Tue Feb 27 2024 - 07:29:49 EST


On 05/02/2024 09:51, Barry Song wrote:
> +Chris, Suren and Chuanhua
>
> Hi Ryan,
>
>> + /*
>> + * __scan_swap_map_try_ssd_cluster() may drop si->lock during discard,
>> + * so indicate that we are scanning to synchronise with swapoff.
>> + */
>> + si->flags += SWP_SCANNING;
>> + ret = __scan_swap_map_try_ssd_cluster(si, &offset, &scan_base, order);
>> + si->flags -= SWP_SCANNING;
>
> Nobody uses scan_base afterwards, so it seems a bit odd to pass a
> pointer to it.
>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1212,11 +1212,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> if (!can_split_folio(folio, NULL))
>> goto activate_locked;
>> /*
>> - * Split folios without a PMD map right
>> - * away. Chances are some or all of the
>> - * tail pages can be freed without IO.
>> + * Split PMD-mappable folios without a
>> + * PMD map right away. Chances are some
>> + * or all of the tail pages can be freed
>> + * without IO.
>> */
>> - if (!folio_entire_mapcount(folio) &&
>> + if (folio_test_pmd_mappable(folio) &&
>> + !folio_entire_mapcount(folio) &&
>> split_folio_to_list(folio,
>> folio_list))
>> goto activate_locked;
>> --
>
> Chuanhua and I ran this patchset for a couple of days and found a race
> between reclamation and split_folio. This might cause applications to
> read wrong data (zeros) while swapping in.

I can't claim to fully understand the problem yet (thanks for all the details -
I'll keep reading them and looking at the code until I do), but I would expect
this problem to exist today for PMD-mappable folios too, since we already skip
splitting those folios if they are PMD-mapped. Or does the problem only apply
to PTE-mapped folios?
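
To make the question concrete, here is a minimal user-space sketch (not kernel
code) of the split decision in shrink_folio_list() before and after the vmscan
hunk quoted above. struct model_folio, the helper names and the value 9 for the
PMD order (4K base pages) are assumptions for illustration only, and the model
assumes the folio is already known to be large and splittable:

```c
#include <stdbool.h>

/* Assumption: 4K base pages with 2M PMDs, so a PMD-sized folio is order 9. */
#define MODEL_HPAGE_PMD_ORDER 9

/* Hypothetical stand-in for struct folio; only the fields the check needs. */
struct model_folio {
	unsigned int order;           /* folio covers 2^order base pages */
	unsigned int entire_mapcount; /* mappings of the whole folio via a PMD */
};

/* Models folio_test_pmd_mappable(): true for order >= PMD order. */
static bool model_pmd_mappable(struct model_folio f)
{
	return f.order >= MODEL_HPAGE_PMD_ORDER;
}

/* Old check: split any large folio that has no PMD mapping. */
static bool split_before_patch(struct model_folio f)
{
	return f.entire_mapcount == 0;
}

/* New check: only PMD-mappable folios without a PMD mapping are split. */
static bool split_after_patch(struct model_folio f)
{
	return model_pmd_mappable(f) && f.entire_mapcount == 0;
}
```

In this model an order-4 PTE-mapped folio is split before the patch but goes to
swap-out unsplit after it, an order-9 folio with no PMD mapping is split in
both cases, and a PMD-mapped order-9 folio is never split at this point, which
is the existing behaviour I am asking about.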