zswap z3fold + memory offline = infinite loop

From: Qian Cai
Date: Tue May 12 2020 - 20:36:57 EST


Putting zswap z3fold pages into memory and then offlining that memory triggers an infinite loop in __offline_pages() --> do_migrate_range() because there is no error handling,

	if (pfn) {
		/*
		 * TODO: fatal migration failures should bail
		 * out
		 */
		do_migrate_range(pfn, end_pfn);
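
To make the failure mode concrete, here is a minimal user-space sketch of a retry loop with no error handling. It is not kernel code: struct fake_page, fake_isolate() and fake_offline() are hypothetical names, and the max_passes cap is an assumption added only so the sketch terminates. The loop described above has no such cap.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page; not a kernel structure. */
struct fake_page {
	bool isolatable;
};

/* Models an isolation attempt: 0 on success, -EBUSY on failure. */
static int fake_isolate(const struct fake_page *page)
{
	return page->isolatable ? 0 : -EBUSY;
}

/*
 * Models the offline retry loop. The failure from fake_isolate() is
 * ignored and the page is simply tried again on the next pass.
 * max_passes is an assumption of this sketch; without it, a page that
 * always fails isolation would keep the loop spinning forever.
 *
 * Returns the pass number on which isolation succeeded, or -EBUSY
 * once max_passes is exhausted.
 */
static int fake_offline(const struct fake_page *page, int max_passes)
{
	assert(max_passes > 0);

	for (int pass = 1; pass <= max_passes; pass++) {
		if (fake_isolate(page) == 0)
			return pass;
		/* No error handling: fall through and retry. */
	}
	return -EBUSY;
}
```

A page that always fails isolation uses up every pass, which is the bounded analogue of the hang reported here. Bailing out on a fatal failure, as the TODO suggests, would amount to returning the error from inside the loop instead of retrying.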

In do_migrate_range(), isolate_movable_page() will always return -EBUSY because,

	if (!mapping->a_ops->isolate_page(page, mode))
		goto out_no_isolated;

i.e., z3fold_page_isolate() will always return false because,

zhdr->mapped_count == 2
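
The check can be modelled in user space as follows. This is a sketch under a stated assumption, namely that the isolate callback refuses any page whose mapped_count is non-zero. struct fake_z3fold_header and fake_z3fold_page_isolate() are hypothetical, reduced stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the z3fold page header. */
struct fake_z3fold_header {
	int mapped_count;	/* objects in this page that are currently mapped */
};

/*
 * Models the isolate callback. Assumption: isolation is refused while
 * any object in the page is still mapped, i.e. mapped_count != 0.
 */
static bool fake_z3fold_page_isolate(const struct fake_z3fold_header *zhdr)
{
	assert(zhdr != NULL);
	return zhdr->mapped_count == 0;
}
```

With mapped_count stuck at 2, as in this report, the model returns false on every call, so the caller sees the same failure on every retry.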

It should be easy to reproduce. If not, use this reproducer,

https://github.com/cailca/linux-mm/blob/master/random.c

and then watch the console burn with messages like these,

[12661.793667][T566417] failed to isolate pfn 1045b2
[12661.793745][T566417] page:c00c000004116c80 refcount:2 mapcount:0 mapping:00000000999f9672 index:0x0
[12661.793865][T566417] mapping->a_ops:z3fold_aops
[12661.793919][T566417] flags: 0x3fffc000000000()
[12661.793969][T566417] raw: 003fffc000000000 c00c000003cef388 c00c000006b0da08 c000001275b87f6a
[12661.794071][T566417] raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
[12661.794158][T566417] page dumped because: isolation failed
[12661.794226][T566417] page_owner tracks the page as allocated
[12661.794292][T566417] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12800(GFP_NOWAIT|__GFP_NOWARN|__GFP_NORETRY)
[12661.794463][T566417] prep_new_page+0x3d0/0x450
[12661.794508][T566417] get_page_from_freelist+0x1bb8/0x27c0
[12661.794575][T566417] __alloc_pages_slowpath.constprop.60+0x240/0x15a0
[12661.794654][T566417] __alloc_pages_nodemask+0x520/0x650
[12661.794715][T566417] alloc_pages_current+0xbc/0x140
[12661.794772][T566417] z3fold_zpool_malloc+0x6cc/0xe20
[12661.794826][T566417] zpool_malloc+0x34/0x50
[12661.794888][T566417] zswap_frontswap_store+0x60c/0xe20
[12661.794942][T566417] __frontswap_store+0x128/0x330
[12661.794995][T566417] swap_writepage+0x58/0x110
[12661.795048][T566417] pageout+0x16c/0xa40
[12661.795092][T566417] shrink_page_list+0x1ab4/0x2490
[12661.795155][T566417] shrink_inactive_list+0x25c/0x710
[12661.795206][T566417] shrink_lruvec+0x444/0x1260
[12661.795274][T566417] shrink_node+0x288/0x9a0
[12661.795330][T566417] do_try_to_free_pages+0x158/0x640
[12661.795383][T566417] page last free stack trace:
[12661.795437][T566417] free_pcp_prepare+0x52c/0x590
[12661.795493][T566417] free_unref_page+0x38/0xf0
[12662.156109][T566417] free_z3fold_page+0x58/0x120
[12662.156131][T566417] free_pages_work+0x148/0x1c0
[12662.156195][T566417] process_one_work+0x310/0x900
[12662.156257][T566417] worker_thread+0x78/0x530
[12662.156306][T566417] kthread+0x1c4/0x1d0
[12662.156354][T566417] ret_from_kernel_thread+0x5c/0x74