Re: [PATCH] mm: swapfile: avoid split_swap_cluster() NULL pointer dereference

From: Huang, Ying
Date: Sun Sep 27 2020 - 01:33:30 EST


Rafael Aquini <aquini@xxxxxxxxxx> writes:

> On Fri, Sep 25, 2020 at 11:21:58AM +0800, Huang, Ying wrote:
>> Rafael Aquini <aquini@xxxxxxxxxx> writes:
>> >> Or can you help run the test with a debug kernel based on the upstream
>> >> kernel? I can provide a debug patch.
>> >>
>> >
>> > Sure, I can set your patches to run with the test cases we have that tend to
>> > reproduce the issue with some degree of success.
>>
>> Thanks!
>>
>> I found a race condition. During THP splitting, "head" may be unlocked
>> before calling split_swap_cluster(), because head != page during
>> deferred splitting. So we should call split_swap_cluster() before
>> unlocking. The debug patch to do that is as below. Can you help to
>> test it?
>>
>
>
> I finally could grab a good crashdump and confirm that head is really
> not locked.

Thanks! That's really helpful for us to root cause the bug.

> I still need to dig into it to figure out more about the
> crash. I guess your patch will guarantee the lock on head, but
> it still doesn't explain how we got the THP marked as
> PG_swapcache, given that it should fail add_to_swap()->get_swap_page(),
> right?

That happens because __split_huge_page() calls ClearPageCompound(head)
and then unlocks all subpages except "page". So previously, by the time
split_swap_cluster() was called in split_huge_page_to_list(), the THP had
already been split and "head" might already be unlocked. The now-normal
page "head" could then be added to the swap cache.

CPU1                                            CPU2
----                                            ----
deferred_split_scan()
  split_huge_page(page) /* page isn't compound head */
    split_huge_page_to_list(page, NULL)
      __split_huge_page(page, )
        ClearPageCompound(head)
        /* unlock all subpages except page (not head) */
                                                add_to_swap(head)  /* not THP */
                                                  get_swap_page(head)
                                                  add_to_swap_cache(head, )
                                                    SetPageSwapCache(head)
      if PageSwapCache(head)
        split_swap_cluster(/* swap entry of head */)
          /* Deref sis->cluster_info: NULL accessing! */
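
To make the ordering concrete, here is a rough userspace model of the
interleaving above. It is only a sketch: every toy_* name is hypothetical,
nothing here is kernel code, and the model simply assumes that
sis->cluster_info is NULL and that CPU2 runs in the window right after
"head" is unlocked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; these are NOT the real kernel types. */
struct toy_page {
	bool locked;
	bool compound;
	bool swapcache;
};

struct toy_swap_info {
	int *cluster_info;	/* assumed NULL in this model */
};

/* CPU2 side: models add_to_swap() -> get_swap_page() -> SetPageSwapCache(). */
bool toy_add_to_swap(struct toy_page *p, const struct toy_swap_info *sis)
{
	if (p->locked)
		return false;	/* the page lock is held by someone else */
	if (p->compound && !sis->cluster_info)
		return false;	/* a THP fails to get a swap entry here */
	p->swapcache = true;
	return true;
}

/* Returns true where the real code would dereference a NULL cluster_info. */
static bool toy_split_swap_cluster(const struct toy_swap_info *sis)
{
	return sis->cluster_info == NULL;
}

/*
 * CPU1 side: models the split of a THP whose "head" != "page".
 * split_before_unlock == false models the old ordering,
 * split_before_unlock == true models the debug patch ordering.
 */
bool toy_split_hits_null(bool split_before_unlock)
{
	struct toy_swap_info sis = { .cluster_info = NULL };
	struct toy_page head = { .locked = true, .compound = true };
	bool null_deref = false;

	head.compound = false;		/* ClearPageCompound(head) */

	if (split_before_unlock && head.swapcache)
		null_deref = toy_split_swap_cluster(&sis);

	head.locked = false;		/* head != page, so head is unlocked */

	toy_add_to_swap(&head, &sis);	/* CPU2 wins the race window */

	if (!split_before_unlock && head.swapcache)
		null_deref = toy_split_swap_cluster(&sis);

	return null_deref;
}
```

In this model toy_split_hits_null(false) reports the NULL access, while
toy_split_hits_null(true) does not, because PageSwapCache(head) is checked
while "head" is still locked.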

> I'll give your patch a run over the weekend, hopefully we'll have more
> info on this next week.

Thanks!

Best Regards,
Huang, Ying

>> Best Regards,
>> Huang, Ying
>>
>> ------------------------8<----------------------------
>> From 24ce0736a9f587d2dba12f12491c88d3e296a491 Mon Sep 17 00:00:00 2001
>> From: Huang Ying <ying.huang@xxxxxxxxx>
>> Date: Fri, 25 Sep 2020 11:10:56 +0800
>> Subject: [PATCH] dbg: Call split_swap_cluster() before unlock page during
>> split THP
>>
>> ---
>> mm/huge_memory.c | 13 +++++++------
>> 1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index faadc449cca5..8d79e5e6b46e 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2444,6 +2444,12 @@ static void __split_huge_page(struct page *page, struct list_head *list,
>>
>>  	remap_page(head);
>>
>> +	if (PageSwapCache(head)) {
>> +		swp_entry_t entry = { .val = page_private(head) };
>> +
>> +		split_swap_cluster(entry);
>> +	}
>> +
>>  	for (i = 0; i < HPAGE_PMD_NR; i++) {
>>  		struct page *subpage = head + i;
>>  		if (subpage == page)
>> @@ -2678,12 +2684,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>>  		}
>>
>>  		__split_huge_page(page, list, end, flags);
>> -		if (PageSwapCache(head)) {
>> -			swp_entry_t entry = { .val = page_private(head) };
>> -
>> -			ret = split_swap_cluster(entry);
>> -		} else
>> -			ret = 0;
>> +		ret = 0;
>>  	} else {
>>  		if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
>>  			pr_alert("total_mapcount: %u, page_count(): %u\n",
>> --
>> 2.28.0
>>