Re: power-off delay/hang due to commit 6d25be57 (mainline)

From: Stephen Berman
Date: Mon Jun 15 2020 - 11:41:30 EST


On Mon, 15 Jun 2020 16:51:30 +0200 Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:

> On 2020-06-15 09:58:00 [+0200], Stephen Berman wrote:
>> Ok, sorry, I had misunderstood, but now I've looked at the
>> documentation. I had in fact already done `echo t >
>> /proc/sysrq-trigger' in an xterm (as root) and there was no output.
>> Later, after booting kernel 5.1.0 because of the message flooding with
>> 5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as
>> root) and the only output was: `[ <timestamp>] sysrq: Show State'. Is
>> this expected? (In /proc/sys/kernel/sysrq there is `1'.) I couldn't
>> try it in a 5.6.4 virtual tty because of the message flooding (see
>> below).
>
> If you do this "t" then there should be a lot of output on your console.
> If you do this from an xterm then you can see the output after typing
> "dmesg". The output should appear also in your system log.

Ah, ok, I do see it in the log, it looks basically the same as the call
trace I posted upthread. I wonder why there was no such output in the
console, could there be some setting that suppresses it even though
/proc/sys/kernel/sysrq has `1'?

>> The kernel log shows 305 of these messages in the 4 minutes and 17
>> seconds between the start of klogd and when I rebooted.
>
> Okay, this is a lot.
>
>> > The interesting part is to see if there is a acpi_os_execute() adding a
>> > specific event multiple times which does not complete. Maybe at runtime,
>> > maybe at shutdown time. If that is the case then ignoring this specific
>> > event might fix the shutdown problem. With all this information so far,
>> > I don't see a relation with this problem and the commitâ
>>
>> In those 3 minutes and 8 seconds there were 5 "Adding
>> acpi_ev_asynch_enable_gpe" messages at the beginning, then 185 "Adding
>> acpi_ev_notify_dispatch" messages, which kept coming until I rebooted.
>
> 5 acpi_ev_asynch_enable_gpe() in three minutes since boot isn't much.
> 185 of "Adding acpi_ev_notify_dispatch" is a lot.
>
> But to be clear, only
> acpi_os_execute(1109) Adding acpi_ev_notify_dispatch" messages?
> No
> "acpi_os_execute_deferred_notify() Start/End acpi_ev_notify_dispatch"?
>
> The syslog should have captured the log on disk.

There were in fact 99 "Start .* acpi_ev_notify_dispatch" messages and 98
"End .* acpi_ev_notify_dispatch" messages. Here's the last of them
before I rebooted:

Jun 15 08:58:25 strobe-jhalfs kernel: [ 193.315014] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 ffff8d7aa99a66e0 <ffff8d7aabb38480>
Jun 15 08:58:25 strobe-jhalfs kernel: [ 193.319929] acpi_os_execute_deferred_notify(853) End ffff8d7aa8758d00 acpi_ev_notify_dispatch+0x0/0x55(ffff8d7aa84e7eb0)
Jun 15 08:58:25 strobe-jhalfs kernel: [ 193.321242] acpi_os_execute_deferred_notify(851) Start ffff8d7aa8758a80 acpi_ev_notify_dispatch+0x0/0x55(ffff8d7aa84e70a0)
Jun 15 08:58:26 strobe-jhalfs kernel: [ 194.339017] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 ffff8d7aa99a6730 <ffff8d7aabb384c0>
Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.363005] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 ffff8d7aa99a6780 <ffff8d7aabb38500>
Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.367978] acpi_os_execute_deferred_notify(853) End ffff8d7aa8758a80 acpi_ev_notify_dispatch+0x0/0x55(ffff8d7aa84e70a0)
Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.369234] acpi_os_execute_deferred_notify(851) Start ffff8d7aa8758240 acpi_ev_notify_dispatch+0x0/0x55(ffff8d7aa84e7a00)
Jun 15 08:58:28 strobe-jhalfs kernel: [ 196.387017] acpi_os_execute(1109) Adding acpi_ev_notify_dispatch+0x0/0x55 ffff8d7aa99a68c0 <ffff8d7aabb38540>
Jun 15 08:58:28 strobe-jhalfs kernel: [ 196.388508] acpi_os_execute_deferred_notify(853) End ffff8d7aa8758240 acpi_ev_notify_dispatch+0x0/0x55(ffff8d7aa84e7a00)
Jun 15 08:58:28 strobe-jhalfs kernel: [ 196.390128] acpi_os_execute_deferred_notify(851) Start ffff8d7aa8758b80 acpi_ev_notify_dispatch+0x0/0x55(ffff8d7aa84e7460)
Jun 15 08:58:29 strobe-jhalfs kernel: [ 197.208714] wlan1: deauthenticating from 7c:ff:4d:08:df:22 by local choice (Reason: 3=DEAUTH_LEAVING)
Jun 15 08:58:29 strobe-jhalfs kernel: [ 197.232214] ip (4614) used greatest stack depth: 11272 bytes left
Jun 15 08:58:29 strobe-jhalfs kernel: Kernel logging (proc) stopped.
Jun 15 08:58:29 strobe-jhalfs kernel: Kernel log daemon terminating.

> I attached a modified acpi_dbg.patch. Please enable:
> - CONFIG_ACPI_DEBUG=y
>
> Looking at your 5.1 you have tracing enabled (hope it still is).
>
> The attached patch will dump the date into the tracing buffer, so you
> console should remain "clean". Once it records 300 of those "requests
> for acpi_ev_notify_dispatch" it will stop recording.
> After 4-5 minutes please do
> cat /sys/kernel/debug/tracing/trace > file.txt
>
> compress it and send it.

Will do as soon as I can, hopefully later today or tomorrow. Thanks.

Steve Berman