RE: [BUG] Kernel panic when using Hibernation on kernel 6.1.25

From: Jia-hao Bai (白家豪)
Date: Wed Jul 10 2024 - 01:32:25 EST


Hi Pavel,

We have CONFIG_KFENCE enabled, and KFENCE reports use-after-free, invalid, and out-of-bounds reads in safe_copy_page() (kernel/power/snapshot.c).

Do you know why hibernation encounters these problems when entering suspend?

My environment:
1. Kernel 6.1.25, 32-bit ARM
2. DRAM: 1 GB
3. Memory-related configuration:
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_MEMCG=y
CONFIG_MEMCG_KMEM=y
CONFIG_SHMEM=y
CONFIG_MEMBARRIER=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_ARM_DMA_MEM_BUFFERABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_HIGHMEM=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
CONFIG_FLATMEM=y
CONFIG_ARCH_KEEP_MEMBLOCK=y
CONFIG_MEMORY_ISOLATION=y
CONFIG_OF_RESERVED_MEM=y
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_V4L2_MEM2MEM_DEV=y
CONFIG_V4L_MEM2MEM_DRIVERS=y
CONFIG_VIDEOBUF2_MEMOPS=y
CONFIG_DRM_GEM_SHMEM_HELPER=m
CONFIG_ASHMEM=y
CONFIG_MEMORY=y
CONFIG_NVMEM=y
CONFIG_NVMEM_SYSFS=y
CONFIG_MEMFD_CREATE=y
CONFIG_HAS_IOMEM=y
CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED=y
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_HIGHMEM=y
CONFIG_ARCH_USE_MEMTEST=y


[ 33.001597] BUG: KFENCE: use-after-free read in safe_copy_page+0x74/0x88
[ 33.001597]
[ 33.001597] Use-after-free read at 0xefcd9000 pfn:6fcd9 (in kfence-#5):
[ 33.001597] safe_copy_page+0x74/0x88
[ 33.001597] swsusp_save+0x400/0x460
[ 33.001597] arch_save_image+0x8/0x4c
[ 33.001597] cpu_suspend_abort+0x0/0x18
[ 33.001597]
[ 33.001597] kfence-#5: 0xefcd9000-0xefc_fork+0x14/0x2c
[ 33.001597]
[ 33.001597] freed by task 1 on cpu 0 at 0.527036s:
[ 33.001597] tp_initcall_finish_cb+0x17c/0x1a8
[ 33.001597] do_one_initcall+0x13c/0x250
[ 33.001597] kernel_init_freeable+0x23c/0x2a0
[ 33.001597] kernel_init+0x20/0x138
[ 33.001597] ret_from_fork+0x14/0x2c
[ 33.001597]
[ 33.001597] CPU: 0 PID: 776 Comm: sh Tainted: G B W 6.1.25-mainline #1
[ 33.001597] Hardware name: Generic DT based system
[ 33.001597] PC is at safe_copy_page+0x74/0x88
[ 33.001597] LR is at safe_copy_page+0x70/0x88
[ 33.001597] pc : [<c0c9cb3c>] lr : [<c0c9cb38>] psr: 800700d3
[ 33.001597] sp : c1aecf90 ip : c0007081 fp : c139f880
[ 33.001597] r10: ed9394b4 r9 : 002394b4 r8 : c13ae600
[ 33.001597] r7 : eddb8e84 r6 : 0000b6d9 r5 : eddb8e84 r4 : cfd04ffc
[ 33.001597] r3 : efcda000 r2 : 001ae3a1 r1 : 38e38e39 r0 : efcd9000
[ 33.001597] Flags: Nzcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment none
[ 33.001597] ==================================================================
[ 33.001597] BUG: KFENCE: invalid read in safe_copy_page+0x74/0x88
[ 33.001597]
[ 33.001597] Invalid read at 0xefce4000 pfn:6fce4:
[ 33.001597] safe_copy_page+0x74/0x88
[ 33.001597] swsusp_save+0x400/0x460
[ 33.001597] arch_save_image+0x8/0x4c
[ 33.001597] cpu_suspend_abort+0x0/0x18
[ 33.001597]
[ 33.001597] CPU: 0 PID: 776 Comm: sh Tainted: G B W 6.1.25-mainline #1
[ 33.001597] Hardware name: Generic DT based system
[ 33.001597] PC is at safe_copy_page+0x74/0x88
[ 33.001597] LR is at safe_copy_page+0x70/0x88
[ 33.00159 FIQs off Mode SVC_32 ISA ARM Segment none
[ 33.001597] Control: 10c5383d Table: 46c1c06a DAC: 00000051
[ 33.001597] safe_copy_page from swsusp_save+0x400/0x460
[ 33.001597] swsusp_save from arch_save_image+0x8/0x4c
[ 33.001597] arch_save_image from cpu_suspend_abort+0x0/0x18
[ 33.001597] ==================================================================
[ 33.001597] ==================================================================
[ 33.001597] BUG: KFENCE: out-of-bounds read in safe_copy_page+0x74/0x88
[ 33.001597]
[ 33.001597] Out-of-bounds read at 0xefcd8000 pfn:6fcd8 (4096B right of kfence-#4):
[ 33.001597] safe_copy_page+0x74/0x88
[ 33.001597] swsusp_reate_link+0x48/0xb0
[ 33.001597] sysfs_do_create_link_sd+0x6c/0xe8
[ 33.001597] devlink_add_symlinks+0x124/0x25c
[ 33.001597] device_add+0x444/0x784
[ 33.001597] device_link_add+0x1cc/0x5b4
[ 33.001597] fw_devlink_create_devlink+0x94/0x240
[ 33.001597] __fw_devlink_link_to_suppliers+0x58/0xe0
[ 33.001597] device_add+0x6a8/0x784
[ 33.001597] of_platform_device_create_pdata+0x9c/0xcc
[ 33.001597] of_platform_bus_create+0x1a8/0x230
[ 33.001597] of_platform_bus_create+0x1f4/0x230
[ 33.001597] of_platform_populate+0x68/0xc0
[ 33.001597] of_platform_default_populate_init+0xd4/0xec
[ 33.001597] do_one_initcall+0x48/0x250
[ 33.001597] kernel_init_freeable+0x23c/0x2a0
[ 33.001597] kernel_init+0x20/0x138
[ 33.001597] ret_from_fork+0x14/0x2c
[ 33.001597]
[ 33.001597] CPU: 0 PID: 776 Comm: sh Tainted: G B W 6.1.25-mainline #1
[ 33.001597] Hardware name: Generic DT based system
[ 33.001597] PC is at safe_copy_page+0x74/0xr2 : 001ae398 r1 : 38e38e39 r0 : efcd8000
[ 33.001597] Flags: Nzcv IRQs off FIQs off Mode SVC_32 ISA ARM Segment none
[ 33.001597] Control: 10c5383d Table: 46c1c06a DAC: 00000051
[ 33.001597] safe_copy_page from swsusp_save+0x400/0x460
[ 33.001597] swsusp_save from arch_save_image+0x8/0x4c
[ 33.001597] arch_save_image from cpu_suspend_abort+0x0/0x18
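
For reference, the pfn values in the three reports can be cross-checked against the faulting virtual addresses. The sketch below is a userspace model only, not kernel code. It assumes 4 KiB pages, a kernel linear map starting at 0xc0000000, and a RAM base of 0x40000000 (the first System RAM entry in the /proc/iomem listing quoted further down). The MODEL_* constants and the helper name are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT  12          /* assumption: 4 KiB pages */
#define MODEL_PAGE_OFFSET 0xc0000000u /* assumption: start of the kernel linear map */
#define MODEL_PHYS_OFFSET 0x40000000u /* assumption: physical base of RAM */

/*
 * Translate a linear-map virtual address to a page frame number:
 * subtract the virtual base, add the physical base, drop the page offset.
 */
static uint32_t lowmem_virt_to_pfn(uint32_t vaddr)
{
	assert(vaddr >= MODEL_PAGE_OFFSET); /* only linear-map addresses make sense here */
	return (vaddr - MODEL_PAGE_OFFSET + MODEL_PHYS_OFFSET) >> MODEL_PAGE_SHIFT;
}
```

Under these assumptions, lowmem_virt_to_pfn(0xefcd9000) gives 0x6fcd9, which matches the "pfn:6fcd9" printed in the use-after-free report.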


Thanks

-----Original Message-----
From: Jia-hao Bai (白家豪)
Sent: Friday, June 14, 2024 10:32 AM
To: Pavel Machek <pavel@xxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx; rafael@xxxxxxxxxx; Iverlin Wang (王苳霖) <Iverlin.Wang@xxxxxxxxxxxx>; Boy Wu (吳勃誼) <Boy.Wu@xxxxxxxxxxxx>; Seiya Wang (王迺君) <seiya.wang@xxxxxxxxxxxx>; Dengjun Su (苏邓军) <Dengjun.Su@xxxxxxxxxxxx>; Win Yeh (葉昌倫) <Win.Yeh@xxxxxxxxxxxx>; Sowell Peng (彭首偉) <Sowell.Peng@xxxxxxxxxxxx>; Richard-CC Yang (楊職銓) <Richard-CC.Yang@xxxxxxxxxxxx>
Subject: RE: [BUG] Kernel panic when using Hibernation on kernel 6.1.25

Hi Pavel,

Hardware: Arm Cortex A55, 1GB RAM, eMMC 8G.

Error log:
[ 1362.985700] sh: notify_die from die+0x144/0x5f0
[ 1362.985700] sh: die from die_kernel_fault+0x138/0x148
[ 1362.985700] sh: die_kernel_fault from __do_kernel_fault.part.0+0x5c/0xac
[ 1362.985700] sh: __do_kernel_fault.part.0 from do_translation_fault+0xbc/0xe0
[ 1362.985700] sh: do_translation_fault from do_DataAbort+0x44/0x1d0
[ 1362.985700] sh: do_DataAbort from __dabt_svc+0x4c/0x80
[ 1362.985700] sh: Exception stack(0xc1e8df38 to 0xc1e8df80)
[ 1362.985700] sh: df20: c3000000 00000000
[ 1362.985700] sh: df40: c3001000 ea157ffc ea158000 c17639a8 c17bd548 edceb060 c184b540 c17bd5a0
[ 1362.985700] sh: df60: 00007fd7 ed76c000 00000000 c1e8df88 c0ebe4dc c0ebe4e4 800001d3 ffffffff
[ 1362.985700] sh: __dabt_svc from safe_copy_page+0x20/0x4c
[ 1362.985700] sh: safe_copy_page from swsusp_save+0x580/0x5ac
[ 1362.985700] sh: swsusp_save from arch_save_image+0x8/0x74
[ 1362.985700] sh: arch_save_image from cpu_suspend_abort+0x0/0x18

We compared kernel 5.4 with kernel 6.1.25 and found that on 5.4 hibernation does not save the reserved areas, because pfn_valid() rejects those page frames.

Therefore, we added the following workaround to skip the reserved memory regions.

The regions to skip are the gaps between the System RAM ranges reported by "cat /proc/iomem":
40000000-42fbffff : System RAM
40008000-410fffff : Kernel code
41200000-4144a25f : Kernel data
43100000-4402ffff : System RAM
45140000-593fffff : System RAM
59401000-5940ffff : System RAM
5941f000-594effff : System RAM
59501180-595fffff : System RAM
59640000-7fffffff : System RAM
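
As a rough illustration of how the skip regions follow from this listing, the gaps between consecutive System RAM entries can be computed mechanically. This is a userspace sketch, not part of the workaround itself. The struct and function names are hypothetical, and the table simply restates the System RAM lines above (end addresses inclusive, as /proc/iomem prints them).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One "System RAM" entry; end is inclusive, as printed by /proc/iomem. */
struct ram_range {
	uint64_t start;
	uint64_t end;
};

/* A hole between two RAM entries; end is exclusive. */
struct ram_hole {
	uint64_t start;
	uint64_t end;
};

/* The System RAM lines from the listing above, in ascending order. */
static const struct ram_range system_ram[] = {
	{ 0x40000000, 0x42fbffff },
	{ 0x43100000, 0x4402ffff },
	{ 0x45140000, 0x593fffff },
	{ 0x59401000, 0x5940ffff },
	{ 0x5941f000, 0x594effff },
	{ 0x59501180, 0x595fffff },
	{ 0x59640000, 0x7fffffff },
};

/*
 * Record the hole between each pair of consecutive RAM ranges.
 * The input must be sorted by start address.
 * Returns the number of holes written (at most max_holes).
 */
static size_t find_ram_holes(const struct ram_range *ram, size_t n,
			     struct ram_hole *holes, size_t max_holes)
{
	size_t count = 0;

	for (size_t i = 1; i < n && count < max_holes; i++) {
		uint64_t hole_start = ram[i - 1].end + 1;

		assert(ram[i].start >= ram[i - 1].start); /* sorted input */
		if (ram[i].start > hole_start) {
			holes[count].start = hole_start;
			holes[count].end = ram[i].start;
			count++;
		}
	}
	return count;
}
```

Run over the seven entries above, this yields six holes, and they are the same six address ranges that the hard-coded checks in the workaround use.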

static bool pfn_is_reserved(unsigned long pfn)
{
	phys_addr_t phys = __pfn_to_phy

	if (phys >= 0x42fc0000 && phys < 0x43100000)
		return true;
	if (phys >= 0x44030000 && phys < 0x45140000)
		return true;
	if (phys >= 0x59400000 && phys < 0x59401000)
		return true;
	if (phys >= 0x59410000 && phys < 0x5941f000)
		return true;
	if (phys >= 0x594f0000 && phys < 0x59501180)
		return true;
	if (phys >= 0x59600000 && phys < 0x59640000)
		return true;
	return false;
}

static struct page *saveable_page(struct zone *zone, unsigned long pfn)
{
	struct page *page;

	if (!pfn_valid(pfn))
		return NULL;

	if (pfn_is_reserved(pfn))
		return NULL;
	........
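
The same check could also be written as a table lookup, which may be easier to keep in sync with /proc/iomem. The following is a standalone userspace model rather than a kernel patch. It assumes 4 KiB pages, and MODEL_PAGE_SHIFT, struct phys_hole, reserved_holes and pfn_in_holes() are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Physical address range to skip; end is exclusive. */
struct phys_hole {
	uint64_t start;
	uint64_t end;
};

/* Same six ranges as the hard-coded checks in the workaround. */
static const struct phys_hole reserved_holes[] = {
	{ 0x42fc0000, 0x43100000 },
	{ 0x44030000, 0x45140000 },
	{ 0x59400000, 0x59401000 },
	{ 0x59410000, 0x5941f000 },
	{ 0x594f0000, 0x59501180 },
	{ 0x59600000, 0x59640000 },
};

/* Return true if the first byte of the page frame lies inside any hole. */
static bool pfn_in_holes(unsigned long pfn, const struct phys_hole *holes,
			 size_t n)
{
	uint64_t phys = (uint64_t)pfn << MODEL_PAGE_SHIFT;

	assert(holes != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (phys >= holes[i].start && phys < holes[i].end)
			return true;
	}
	return false;
}
```

One detail this makes visible: 0x59501180 is not page aligned, so the page starting at 0x59501000 is treated as reserved even though its upper part belongs to a System RAM range.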


Do you know of any patches that can fix this problem?

Thanks

-----Original Message-----
From: Pavel Machek <pavel@xxxxxx>
Sent: Thursday, June 13, 2024 6:52 PM
To: Jia-hao Bai (白家豪) <Jia-hao.Bai@xxxxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx; rafael@xxxxxxxxxx; Iverlin Wang (王苳霖) <Iverlin.Wang@xxxxxxxxxxxx>; Boy Wu (吳勃誼) <Boy.Wu@xxxxxxxxxxxx>; Seiya Wang (王迺君) <seiya.wang@xxxxxxxxxxxx>; Dengjun Su (苏邓军) <Dengjun.Su@xxxxxxxxxxxx>; Win Yeh (葉昌倫) <Win.Yeh@xxxxxxxxxxxx>; Sowell Peng (彭首偉) <Sowell.Peng@xxxxxxxxxxxx>; Richard-CC Yang (楊職銓) <Richard-CC.Yang@xxxxxxxxxxxx>
Subject: Re: [BUG] Kernel panic when using Hibernation on kernel 6.1.25

Hi!

> I encountered a kernel panic when using hibernation on kernel version 6.1.25. Below are the details of the issue:
>
> **Description:**
> When I enable CONFIG_HIBERNATION and assign a specific partition for hibernation resuming and perform a specific operation, the system crashes with a kernel panic.
>
> CONFIG_HIBERNATION=y
> CONFIG_PM_STD_PARTITION="/dev/mmcblk0p16"
>
> **Steps to Reproduce:**
> 1. Set printk to level 8: `echo 8 > /proc/sys/kernel/printk`
> 2. Set up the swap partition: `mkswap /dev/mmcblk0p16`
> 3. Enable the swap partition: `swapon -p -3 /dev/mmcblk0p16`
> 4. Configure hibernation resuming settings: `echo "/dev/mmcblk0p16" > /sys/power/resume`
> 5. Configure hibernation mode: `echo reboot > /sys/power/disk`
> 6. Perform the operation: `echo disk > /sys/power/state`
> 7. Observe the kernel panic
>
>
> **Expected Behavior:**
> The operation should complete successfully without causing a kernel panic.
>
> **Actual Behavior:**
> The system crashes with a kernel panic.
>
> **Environment:**
> - Kernel version: 6.1.25
> - Distribution: Yocto 4.0 32bit/Kernel 6.1.25 32bit
> - Hardware: Arm Cortex A55, 1GB RAM

We'd need to know way more about the hardware. Also, testing with the latest mainline would be useful.

Best regards,
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.