Re: [PATCH v3 4/4] mm: swap: Swap-out small-sized THP without splitting

From: Ryan Roberts
Date: Tue Feb 27 2024 - 08:38:01 EST


On 05/02/2024 09:51, Barry Song wrote:
> +Chris, Suren and Chuanhua
>
> Hi Ryan,
>
>> + /*
>> + * __scan_swap_map_try_ssd_cluster() may drop si->lock during discard,
>> + * so indicate that we are scanning to synchronise with swapoff.
>> + */
>> + si->flags += SWP_SCANNING;
>> + ret = __scan_swap_map_try_ssd_cluster(si, &offset, &scan_base, order);
>> + si->flags -= SWP_SCANNING;
>
> Nobody uses this scan_base afterwards, so it seems a bit odd to
> pass a pointer.
>
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1212,11 +1212,13 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>> if (!can_split_folio(folio, NULL))
>> goto activate_locked;
>> /*
>> - * Split folios without a PMD map right
>> - * away. Chances are some or all of the
>> - * tail pages can be freed without IO.
>> + * Split PMD-mappable folios without a
>> + * PMD map right away. Chances are some
>> + * or all of the tail pages can be freed
>> + * without IO.
>> */
>> - if (!folio_entire_mapcount(folio) &&
>> + if (folio_test_pmd_mappable(folio) &&
>> + !folio_entire_mapcount(folio) &&
>> split_folio_to_list(folio,
>> folio_list))
>> goto activate_locked;
>> --
>
> Chuanhua and I ran this patchset for a couple of days and found a race
> between reclamation and split_folio. This might cause applications to get
> wrong data (0) while swapping in.
>
> Suppose one thread (T1) is reclaiming a large folio by some means while
> another thread (T2) is calling madvise MADV_PGOUT, and at the same time
> two threads, T3 and T4, are swapping in in parallel. T1 doesn't split
> and T2 does split, as below,

Hi Barry,

Do you have a test case you can share that provokes this problem? And is this a
separate problem to the race you solved with TTU_SYNC or is this solving the
same problem?

Thanks,
Ryan