Re: [PATCH v7 2/2] ARM hibernation / suspend-to-disk

From: TonyHo
Date: Wed Mar 19 2014 - 23:03:32 EST


Hi,
I'm trying to add hibernation support to the Freescale imx6Q (4x Cortex-A9), using the patches from:
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-December/036055.html
and the TuxOnIce patches:
http://tuxonice.nigelcunningham.com.au/downloads/all/tuxonice-for-linux-3.0-2012-10-23.patch.bz2
The kernel version is 3.0.35, and the SoC has 4 cores.
When I try to create the hibernation snapshot image, the kernel reports a "NULL pointer dereference" on entering the "swsusp_arch_suspend" function. This function comes from the first patch link.
The function is called in tuxonice_builtin.c; the code snippet is shown below:
int toi_lowlevel_builtin(void)
{
	...
	save_processor_state();
	printk(KERN_ERR "Will Swsusp_arch_suspend\n");
	error = swsusp_arch_suspend();
	printk(KERN_ERR "Done Swsusp_arch_suspend\n");
	if (error)
		printk(KERN_ERR "Error %d hibernating\n", error);
	...
}

1. I tried removing the "error = swsusp_arch_suspend()" call, and the "NULL pointer dereference" went away, so I can confirm this function is the problem. But I can't identify which statement causes the NULL pointer dereference. The detailed log is at the end of this mail.

2. The "[ 23.058386] ...20%...40% " line in the log below also confuses me: it seems the cache writing does not complete, since the 60% and 80% marks are missing.

3. When 'replace the swsusp with tuxonice' is not selected and I use 'echo disk>/sys/power/state' to hibernate, the same problem occurs.
So the problem might be caused by the patch from the first link. The Freescale official BSP kernel version is 3.0.35, so the latest suspend-to-disk patches are not suitable. Has anyone tried hibernation on an SMP Cortex-A9?

I look forward to any help.

Detailed logs:
/ # echo >/sys/power/tuxonice/do_hibernate
[ 22.191954] TuxOnIce 3.2, with support for usm, compression, block i/o, swap storage, file storage, userui.
[ 22.202953] Initiating a hibernation cycle.
[ 22.207481] Failed to launch userspace program '/usr/local/sbin/tuxoniceui_text': Error -2
[ 22.215774] Launch userspace program failed.
[ 22.220090] Starting other threads.
[ 22.223537] Freezing processes & syncing filesystems.
[ 22.228793] Stopping fuse filesystems.
[ 22.232549] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 22.782219] Stopping normal filesystems.
[ 22.800634] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 22.867306] Preparing Image. Try 1.
[ 22.925873] Restarting normal filesystems.
[ 22.957534] Stopping fuse filesystems.
[ 22.961311] Freezing user space processes ... (elapsed 0.00 seconds) done.
[ 22.968803] Stopping normal filesystems.
[ 22.975835] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 23.049812] Starting to save the image..
[ 23.053750] Writing caches...
[ 23.058386] ...20%...40%
[ 23.061313] ====>> iofinish+iobase=55P,iobarmax=8758P
[ 23.066467] ====>> iofinish+iobase=220KB,iobarmax=35032KB
[ 23.071982] ====>> toi_compress_bytes_in=225280B,toi_compress_bytes_out=106086B
[ 23.106327] Waited for i/o due to synchronous I/O 4 times.
[ 23.111848] Doing atomic copy/restore.
[ 23.115602] Enter the Hibernation [imx6_hibernation_begin]
[ 23.122407] udc suspend begins
[ 23.126080] add wake up source irq 51
[ 23.129910] PM: freeze of devices complete after 8.805 msecs
[ 23.136256] PM: late freeze of devices complete after 0.677 msecs
[ 23.142372] Enter the Hibernation [imx6_hibernation_pre_snapshot]
[ 23.148471] Disabling non-boot CPUs ...
[ 23.153364] CPU1: shutdown
[ 23.157274] CPU2: shutdown
[ 23.160823] CPU3: shutdown
[ 23.163908] Will Swsusp_arch_suspend
[ 23.421375] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 23.429477] pgd = a8b3c000
[ 23.432186] [00000000] *pgd=3816c831, *pte=00000000, *ppte=00000000
[ 23.438500] Internal error: Oops: 80000007 [#1] PREEMPT SMP
[ 23.444076] Modules linked in:
[ 23.447150] CPU: 0 Not tainted (3.0.35 #56)
[ 23.451687] PC is at 0x0
[ 23.454223] LR is at 0x0
[ 23.456759] pc : [<00000000>] lr : [<00000000>] psr: 60000093
[ 23.456764] sp : a8b27e90 ip : 00000002 fp : 00000000
[ 23.468251] r10: 804ee340 r9 : 00000000 r8 : 00000000
[ 23.473479] r7 : 804aa4b0 r6 : 00000000 r5 : 804edd28 r4 : 804ed674
[ 23.480010] r3 : 804e4478 r2 : 804cf814 r1 : 00008000 r0 : 00000000
[ 23.486543] Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 23.493942] Control: 10c53c7d Table: 38b3c04a DAC: 00000015
[ 23.499690]
[ 23.499692] SP: 0xa8b27e10:
[ 23.503972] 7e10 33322020 3336312e 5d383039 00000020 00000000 00000001 00000000 00025b3e
[ 23.512266] 7e30 00000000 0000040f 00000007 00000000 804aa4b0 80033bb4 00000000 00008000
[ 23.520558] 7e50 804cf814 804e4478 804ed674 804edd28 00000000 804aa4b0 00000000 00000000
[ 23.528850] 7e70 804ee340 00000000 00000002 a8b27e90 00000000 00000000 60000093 ffffffff
[ 23.537141] 7e90 00000001 804ed674 804edd28 00000000 804aa4b0 00000000 804ee340 800a700c
[ 23.545435] 7eb0 60000093 804ed684 804edd28 8009fd94 000021ff 00000037 00000004 00000003
[ 23.553726] 7ed0 a8b26000 00000000 804aa000 804ed684 00000000 804ee340 00000000 800a028c
[ 23.562020] 7ef0 8009e848 804cf9c8 a8bee000 00000001 00000001 00000000 a825f3f8 804cfa2c
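
One way to narrow this down is to decode the fault status word printed after "Oops:". The sketch below is only an illustration, assuming the bit layout used by the 32-bit ARM Linux fault handler (bit 31 as a Linux-internal "prefetch abort" flag, bit 11 as the write flag, bits 10 and 3:0 as the fault status). The names decode_fsr, fsr_info and status_name are hypothetical helpers and are not part of any patch in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the value printed after "Oops:" on 32-bit ARM.
 * These macro names are local to this sketch. */
#define FSR_PREFETCH (UINT32_C(1) << 31) /* fault came from an instruction fetch */
#define FSR_WRITE    (UINT32_C(1) << 11) /* fault came from a write access */
#define FSR_FS4      (UINT32_C(1) << 10) /* bit 4 of the fault status */
#define FSR_FS3_0    UINT32_C(0xf)       /* bits 3:0 of the fault status */

struct fsr_info {
	bool prefetch;   /* true: the CPU tried to execute from the address */
	bool write;      /* true: the CPU tried to store to the address */
	unsigned status; /* 5-bit fault status code */
};

/* Split the raw status word into its fields. */
struct fsr_info decode_fsr(uint32_t fsr)
{
	struct fsr_info info;

	info.prefetch = (fsr & FSR_PREFETCH) != 0;
	info.write = (fsr & FSR_WRITE) != 0;
	info.status = (unsigned)((fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6));
	return info;
}

/* Only the two translation faults are named here; other codes are
 * listed in the ARM Architecture Reference Manual. */
const char *status_name(unsigned status)
{
	switch (status) {
	case 0x5:
		return "section translation fault";
	case 0x7:
		return "page translation fault";
	default:
		return "other";
	}
}

/* Usage example with the value from the log above (Oops: 80000007). */
void check_imx6_oops(void)
{
	struct fsr_info info = decode_fsr(UINT32_C(0x80000007));

	assert(info.prefetch);       /* instruction fetch, not a data access */
	assert(!info.write);
	assert(info.status == 0x7);  /* page translation fault */
}
```

Under these assumptions, 80000007 decodes as an instruction fetch from an unmapped page. Together with "PC is at 0x0" and "LR is at 0x0", that would point to a branch to address 0 (for example a call through a NULL function pointer or a corrupted return address) rather than a data load from NULL.
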

On 03/19/2014 11:44 PM, Ezequiel Garcia wrote:
On Mar 17, Sebastian Capella wrote:
[..]
Thanks, I've added it like this in arch/arm/Kconfig. I'm sure you
know, but this way also takes care of the CPU_FEROCEON in the default
list since SUSPEND_POSSIBLE already contains it.

config ARCH_HIBERNATION_POSSIBLE
	bool
	depends on MMU
	default y if ARCH_SUSPEND_POSSIBLE

Does this look ok?

I applied this change on top of your patches and tested it on a Kirkwood
Openblocks A6 board, using a resume=/dev/sda2 kernel parameter (iow, without
any U-Boot assistance to resume). Seems to work fine (as you can see here
http://sprunge.us/BJRV). I guess you can add a:

Tested-by: Ezequiel Garcia <ezequiel.garcia@xxxxxxxxxxxxxxxxxx>

On the other hand, this board has no pm_power_off() support, which means
kernel_halt() is called after kernel_power_off().

I'm not sure if a NULL pm_power_off() is supported, but this makes my kernel
crash in a reboot notifier that's called twice (first in kernel_power_off
and then in kernel_halt):

Unable to handle kernel paging request at virtual address 00100104
pgd = df634000
[00100104] *pgd=1f5c3831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT ARM
CPU: 0 PID: 565 Comm: sh Not tainted 3.14.0-rc6-00002-g06da70a-dirty #24
task: df4d5440 ti: df666000 task.ti: df666000
PC is at led_trigger_unregister+0x3c/0xcc
LR is at led_trigger_unregister+0x20/0xcc
pc : [<c0273090>] lr : [<c0273074>] psr: 60000093
sp : df667e50 ip : 00100100 fp : 000ab294
r10: 00000002 r9 : 00000000 r8 : c0459a6c
r7 : c046c87c r6 : c046c964 r5 : c03ee198 r4 : c046c964
r3 : 00200200 r2 : 00100100 r1 : 00200200 r0 : c046c8e4
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
Control: 0005397f Table: 1f634000 DAC: 00000015
Process sh (pid: 565, stack limit = 0xdf6661c0)
Stack: (0xdf667e50 to 0xdf668000)
7e40: 00000000 c046c964 c03ee198 00000000
7e60: 00000000 c0273ae0 c0273ab4 c046c94c ffffffff c0036698 c04546c8 ffffffff
7e80: 00000000 00000002 c0446800 c04550e8 df5270c8 c0036b24 00000000 c034cce0
7ea0: c03c4b18 c04550e8 df666018 00000000 c0476f60 c0036b54 00000000 c04550e8
7ec0: df5270c8 c00378ec 00000001 c034cc64 c0455288 c0044cac 0000006b df4270d8
7ee0: df53f480 00000005 00000004 df53f480 00000005 c0042dc4 00000005 df4270d8
7f00: df53f480 00000005 df667f80 df53f480 df5270c0 c01a2bf4 00000005 c0108dd0
7f20: c0108d8c 00000000 00000000 c010c090 00000000 00000000 df4dd280 000acb10
7f40: df667f80 00000005 00000000 00000005 00000000 c00af2b8 fffffff6 c0019d9c
7f60: 00000003 00000000 00000000 df4dd280 000acb10 00000000 00000005 c00af448
7f80: 00000000 00000000 00200200 000aa8b0 00000001 000acb10 00000004 c0009424
7fa0: df666000 c00092c0 000aa8b0 00000001 00000001 000acb10 00000005 00000000
7fc0: 000aa8b0 00000001 000acb10 00000004 00000020 000ab2a8 000ab274 000ab294
7fe0: 00000005 befbd738 0000e1f0 b6edeb4c 60000010 00000001 1fffd831 1fffdc31
[<c0273090>] (led_trigger_unregister) from [<c0273ae0>] (heartbeat_reboot_notifier+0x2c/0x40)
[<c0273ae0>] (heartbeat_reboot_notifier) from [<c0036698>] (notifier_call_chain+0x48/0x9c)
[<c0036698>] (notifier_call_chain) from [<c0036b24>] (__blocking_notifier_call_chain+0x48/0x60)
[<c0036b24>] (__blocking_notifier_call_chain) from [<c0036b54>] (blocking_notifier_call_chain+0x18/0x20)
[<c0036b54>] (blocking_notifier_call_chain) from [<c00378ec>] (kernel_halt+0x14/0x58)
[<c00378ec>] (kernel_halt) from [<c034cc64>] (power_down+0x8c/0xac)
[<c034cc64>] (power_down) from [<c0044cac>] (hibernate+0x1a8/0x1ec)
[<c0044cac>] (hibernate) from [<c0042dc4>] (state_store+0xac/0xb8)
[<c0042dc4>] (state_store) from [<c01a2bf4>] (kobj_attr_store+0x14/0x20)
[<c01a2bf4>] (kobj_attr_store) from [<c0108dd0>] (sysfs_kf_write+0x44/0x48)
[<c0108dd0>] (sysfs_kf_write) from [<c010c090>] (kernfs_fop_write+0xb4/0x14c)
[<c010c090>] (kernfs_fop_write) from [<c00af2b8>] (vfs_write+0xac/0x188)
[<c00af2b8>] (vfs_write) from [<c00af448>] (SyS_write+0x3c/0x78)
[<c00af448>] (SyS_write) from [<c00092c0>] (ret_fast_syscall+0x0/0x2c)
Code: e3a03602 e2822c01 e2833c02 e59f7084 (e58c1004)
---[ end trace 72dd5ccae5489f38 ]---
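
The fault address 00100104 and the register values 00100100 and 00200200 look like the kernel's list poison values, which would fit a list entry being deleted a second time when the notifier runs again. The user-space sketch below only illustrates that pattern and one possible guard against it. It is not the heartbeat trigger code and not a proposed fix; struct node, struct trigger, list_del_poison and trigger_unregister_once are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked list, modelled on the kernel's list_head. */
struct node {
	struct node *next;
	struct node *prev;
};

/* Marker values stored in a deleted entry; never dereferenced here. */
#define POISON1 ((struct node *)0x00100100)
#define POISON2 ((struct node *)0x00200200)

struct trigger {
	struct node link;
	bool registered; /* guard so a second unregister becomes a no-op */
};

void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_front(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink and poison. Calling this twice on the same entry would write
 * through POISON1, which is the kind of store that faults above. */
void list_del_poison(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = POISON1;
	n->prev = POISON2;
}

void trigger_register(struct trigger *t, struct node *head)
{
	list_add_front(&t->link, head);
	t->registered = true;
}

/* Returns true only the first time; later calls do nothing. */
bool trigger_unregister_once(struct trigger *t)
{
	if (!t->registered)
		return false;
	list_del_poison(&t->link);
	t->registered = false;
	return true;
}

/* Usage example: the notifier body runs twice, as in the trace above. */
void demo_double_call(void)
{
	struct node head;
	struct trigger t = { { NULL, NULL }, false };

	list_init(&head);
	trigger_register(&t, &head);
	assert(trigger_unregister_once(&t));  /* first call unlinks */
	assert(!trigger_unregister_once(&t)); /* second call is skipped */
	assert(head.next == &head && head.prev == &head);
}
```

Whether the right place for such a guard is the notifier itself or the caller that ends up running the reboot notifier chain twice is a separate design question that this sketch does not try to answer.
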

