[PATCH] zram: Using GFP_ATOMIC instead of GFP_KERNEL to allocate bitmap memory in backing_dev_store

From: Dongyun Liu
Date: Thu Nov 30 2023 - 10:22:11 EST


We encountered a deadlock issue when using zram on kernel version 5.10.
The reason is that the backing_dev_store will first hold the
zram->init_lock, and then it will allocate memory with kvmalloc_node. At
this moment, the system may be in a state of memory shortage, so it will
enter the memory swapping process. In zram_bvec_write, we will trigger
our own thread to move data from zram to the backing device to free up
more available memory, which is the work done in the
try_wakeup_zram_wbd. In this function, zram->init_lock will be acquired
again to read zram->bd_wb_limit, which causes a deadlock as follow call
trace.

The latest version of Linux does not have the bug, but a potential risk
in future.If we try to acquire zram->init_lock again in the zram process
(for example, when we need to read zram->bd_wb_limit), there is a risk
of deadlock. The bug is quite elusive, and it only appears with low
probability after conducting a week-long stress test using 50 devices.
Therefore, I suggest passing the GFP_ATOMIC flag to allocate memory in
the backing_dev_store, to avoid similar issues we encountered. Please
consider it, Thank you.

INFO: task init:331 blocked for more than 120 seconds. "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:init state:D stack: 0 pid: 1 ppid: 0 flags:0x04000000
Call trace:
__switch_to+0x244/0x4e4
__schedule+0x5bc/0xc48
schedule+0x80/0x164
rwsem_down_read_slowpath+0x4fc/0xf9c
__down_read+0x140/0x188
down_read+0x14/0x24
try_wakeup_wbd_thread+0x78/0x1ec [zram]
__zram_bvec_write+0x720/0x878 [zram]
zram_bvec_rw+0xa8/0x234 [zram]
zram_submit_bio+0x16c/0x268 [zram]
submit_bio_noacct+0x128/0x3c8
submit_bio+0x1cc/0x3d0
__swap_writepage+0x5c4/0xd4c
swap_writepage+0x130/0x158
pageout+0x1f4/0x478
shrink_page_list+0x9b4/0x1eb8
shrink_inactive_list+0x2f4/0xaa8
shrink_lruvec+0x184/0x340
shrink_node_memcgs+0x84/0x3a0
shrink_node+0x2c4/0x6c4
shrink_zones+0x16c/0x29c
do_try_to_free_pages+0xe4/0x2b4
try_to_free_pages+0x388/0x7b4
__alloc_pages_direct_reclaim+0x88/0x278
__alloc_pages_slowpath+0x4ec/0xf6c
__alloc_pages_nodemask+0x1f4/0x3dc
kmalloc_order+0x54/0x338
kmalloc_order_trace+0x34/0x1bc
__kmalloc+0x5e8/0x9c0
kvmalloc_node+0xa8/0x264
backing_dev_store+0x1a4/0x818 [zram]
dev_attr_store+0x38/0x8c
sysfs_kf_write+0x64/0xc4
kernfs_fop_write_iter+0x168/0x2ac
vfs_write+0x300/0x37c
ksys_write+0x7c/0xf0
__arm64_sys_write+0x20/0x30
el0_svc_common+0xd4/0x270
el0_svc+0x28/0x98
el0_sync_handler+0x8c/0xf0
el0_sync+0x1b4/0x1c0

Signed-off-by: Dongyun Liu <dongyun.liu@xxxxxxxxxxxxx>
---
drivers/block/zram/zram_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d77d3664ca08..ee6c22c50e09 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -514,7 +514,7 @@ static ssize_t backing_dev_store(struct device *dev,

nr_pages = i_size_read(inode) >> PAGE_SHIFT;
bitmap_sz = BITS_TO_LONGS(nr_pages) * sizeof(long);
- bitmap = kvzalloc(bitmap_sz, GFP_KERNEL);
+ bitmap = kmalloc(bitmap_sz, GFP_ATOMIC);
if (!bitmap) {
err = -ENOMEM;
goto out;
--
2.25.1