Re: [PATCH][RESEND] Revert "PM: ACPI: reboot: Use S5 for reboot"

From: Josef Bacik
Date: Wed Mar 17 2021 - 12:26:28 EST


On 3/16/21 10:50 PM, Kai-Heng Feng wrote:
> Hi,
>
> On Wed, Mar 17, 2021 at 10:17 AM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>>
>> This reverts commit d60cd06331a3566d3305b3c7b566e79edf4e2095.
>>
>> This patch causes a panic when rebooting my Dell Poweredge r440. I do
>> not have the full panic log as it's lost at that stage of the reboot and
>> I do not have a serial console. Reverting this patch makes my system
>> able to reboot again.
>
> But this patch also helps many HP laptops, so maybe we should figure
> out what's going on on Poweredge r440.
> Does it also panic on shutdown?


Sure, I'll test whatever is needed to get it fixed. But I just wasted 3 days bisecting and lost a weekend of performance testing on btrfs because of this regression, so until you figure out how it broke, it needs to be reverted so that other people don't have to work out why reboot suddenly isn't working.

Running "halt" has the same effect with and without your patch: it gets to "system halted" and just sits there without powering off. I'm not entirely sure why that is, but there's no panic.

The panic itself is lost, but I can see there's an NMI, and I have the RIP:

(gdb) list *('mwait_idle_with_hints.constprop.0'+0x4b)
0xffffffff816dabdb is in mwait_idle_with_hints (./arch/x86/include/asm/current.h:15).
10
11 DECLARE_PER_CPU(struct task_struct *, current_task);
12
13 static __always_inline struct task_struct *get_current(void)
14 {
15 return this_cpu_read_stable(current_task);
16 }
17
18 #define current get_current()
19

<mwait_idle_with_hints.constprop.0>: jmp 0xffffffff936dac02 <mwait_idle_with_hints.constprop.0+0x72>
<mwait_idle_with_hints.constprop.0+0x2>: nopl (%rax)
<mwait_idle_with_hints.constprop.0+0x5>: jmp 0xffffffff936dabac <mwait_idle_with_hints.constprop.0+0x1c>
<mwait_idle_with_hints.constprop.0+0x7>: nopl (%rax)
<mwait_idle_with_hints.constprop.0+0xa>: mfence
<mwait_idle_with_hints.constprop.0+0xd>: mov %gs:0x17bc0,%rax
<mwait_idle_with_hints.constprop.0+0x16>: clflush (%rax)
<mwait_idle_with_hints.constprop.0+0x19>: mfence
<mwait_idle_with_hints.constprop.0+0x1c>: xor %edx,%edx
<mwait_idle_with_hints.constprop.0+0x1e>: mov %rdx,%rcx
<mwait_idle_with_hints.constprop.0+0x21>: mov %gs:0x17bc0,%rax
<mwait_idle_with_hints.constprop.0+0x2a>: monitor %rax,%rcx,%rdx
<mwait_idle_with_hints.constprop.0+0x2d>: mov (%rax),%rax
<mwait_idle_with_hints.constprop.0+0x30>: test $0x8,%al
<mwait_idle_with_hints.constprop.0+0x32>: jne 0xffffffff936dabdb <mwait_idle_with_hints.constprop.0+0x4b>
<mwait_idle_with_hints.constprop.0+0x34>: jmpq 0xffffffff936dabd0 <mwait_idle_with_hints.constprop.0+0x40>
<mwait_idle_with_hints.constprop.0+0x39>: verw 0x9f9fec(%rip) # 0xffffffff940d4bbc
<mwait_idle_with_hints.constprop.0+0x40>: mov $0x1,%ecx
<mwait_idle_with_hints.constprop.0+0x45>: mov %rdi,%rax
<mwait_idle_with_hints.constprop.0+0x48>: mwait %rax,%rcx
<mwait_idle_with_hints.constprop.0+0x4b>: mov %gs:0x17bc0,%rax
<mwait_idle_with_hints.constprop.0+0x54>: lock andb $0xdf,0x2(%rax)
<mwait_idle_with_hints.constprop.0+0x59>: lock addl $0x0,-0x4(%rsp)
<mwait_idle_with_hints.constprop.0+0x5f>: mov (%rax),%rax
<mwait_idle_with_hints.constprop.0+0x62>: test $0x8,%al
<mwait_idle_with_hints.constprop.0+0x64>: je 0xffffffff936dac01 <mwait_idle_with_hints.constprop.0+0x71>
<mwait_idle_with_hints.constprop.0+0x66>: andl $0x7fffffff,%gs:0x6c93cf7f(%rip) # 0x17b80
<mwait_idle_with_hints.constprop.0+0x71>: retq
<mwait_idle_with_hints.constprop.0+0x72>: mov %gs:0x17bc0,%rax
<mwait_idle_with_hints.constprop.0+0x7b>: lock orb $0x20,0x2(%rax)
<mwait_idle_with_hints.constprop.0+0x80>: mov (%rax),%rax
<mwait_idle_with_hints.constprop.0+0x83>: test $0x8,%al
<mwait_idle_with_hints.constprop.0+0x85>: jne 0xffffffff936dabdb <mwait_idle_with_hints.constprop.0+0x4b>
<mwait_idle_with_hints.constprop.0+0x87>: jmpq 0xffffffff936dab95 <mwait_idle_with_hints.constprop.0+0x5>
<mwait_idle_with_hints.constprop.0+0x8c>: nopl 0x0(%rax)

0x4b is the instruction right after the mwait, which means we're panicking in current_clr_polling(), where we do clear_thread_flag(TIF_POLLING_NRFLAG). Thanks,

Josef
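
For anyone matching the disassembly above to the source, here is a minimal userspace sketch (not kernel code) of how a thread flag's bit number maps to the byte-wide operands seen in the listing: "lock andb $0xdf,0x2(%rax)" and "lock orb $0x20,0x2(%rax)" for TIF_POLLING_NRFLAG, and "test $0x8,%al" for TIF_NEED_RESCHED. The bit numbers 21 and 3 are an assumption taken from arch/x86/include/asm/thread_info.h in kernels of this era, so check them against the tree being debugged. The helper names are made up for illustration.

```c
#include <stdint.h>

/*
 * Assumed bit numbers from arch/x86/include/asm/thread_info.h in kernels
 * around this version. Verify against the tree you are actually debugging.
 */
#define TIF_NEED_RESCHED    3
#define TIF_POLLING_NRFLAG  21

/* Hypothetical helper: byte of the little-endian flags word holding bit nr. */
static inline unsigned int flag_byte_offset(unsigned int nr)
{
	return nr / 8;
}

/* Hypothetical helper: mask that selects bit nr within that byte. */
static inline uint8_t flag_byte_mask(unsigned int nr)
{
	return (uint8_t)(1u << (nr % 8));
}

/* Hypothetical helper: immediate for a byte-wide AND that clears bit nr. */
static inline uint8_t flag_clear_imm(unsigned int nr)
{
	return (uint8_t)~flag_byte_mask(nr);
}
```

With those assumed bit numbers, TIF_POLLING_NRFLAG lands in byte 2 with mask 0x20 (clear immediate 0xdf), and TIF_NEED_RESCHED lands in byte 0 with mask 0x08. That is consistent with the instructions at +0x54, +0x7b and +0x30 in the listing.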