Re: Suspend-resume failure on Intel Eagle Lake Core2Duo

From: Tomi Sarvela
Date: Mon Jul 31 2017 - 03:22:34 EST


On 28/07/17 19:26, Thomas Gleixner wrote:
On Fri, 28 Jul 2017, Tomi Sarvela wrote:
On 28/07/17 17:50, Thomas Gleixner wrote:
On Fri, 28 Jul 2017, Tomi Sarvela wrote:
On 28/07/17 17:13, Thomas Gleixner wrote:
On Fri, 28 Jul 2017, Tomi Sarvela wrote:
On 28/07/17 16:15, Thomas Gleixner wrote:
Another question. Is the machine completely dead or not?

Completely dead. Powerled is on, so host isn't shut down.

So that means it does not even power the machine down. That's what I
expected least.

Serial or network if don't give any signs of life.

Patch applies cleanly but still getting the same error:

Sorry for the noise. I'm an idiot trying to do 10 things at once. This
time
it actually compiles and links.

If the machine does still not powerdown with this applied, then please
redo
the 'platform' test and grab the trace for that one.

This patch fixes the issue. Below is the dmesg from the testrun (sorry for
the spam, we're primarily testing i915 issues).

Can you please retrieve the trace data from:

/sys/kernel/debug/tracing/trace

and provide that. The dmesg does not help much.

Right, here you go.

Thanks for providing the data. Just to be sure, that data was from a real
suspend, not the 'platform' test, right?

If so, that does not make any sense at all. The patch merily changes the
enable/resume path and adds the debug trace printks which have no influence
on the disable logic. But you said that the machine does not power off in
the bad case. That does not make any sense at all as the enable logic is
not involved at all in the suspend path.

Did you change anything else compared to the tests before ?

I did check that the problem persisted in linus-HEAD before testing your patch. The testing was done in order (reading from console logs I happen to still have in one window):

- reboot to patched kernel

[ 0.000000] Linux version 4.13.0-rc2+ (testrunner@elk) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #5 SMP PREEMPT Fri Jul 28 17:15:47 EEST 2017
[ 0.000000] Command line: BOOT_IMAGE=/boot/tsa.efi root=/dev/sda1 console=ttyS0,115200n8 console=tty0 intel_iommu=igfx_off drm.debug=0xe nmi_watchdog=panic,auto panic=1 softdog.soft_panic=1 scsi_mod.use_blk_mq=0 rootwait ro 3

- "real" 15sec suspend test through IGT/piglit and rtcwake

$ ./scripts/run-tests.sh -vt igt@gem_exec_suspend@basic-s3 -x devices

- dmesg to go with suspend

[ 1189.597665] Suspended for 14.825 seconds
[ 1189.597665] Delta way too big! 18446743991837909721 ts=18446744056274518328 write stamp = 64436608607
If you just came from a suspend/resume,
please switch to the trace global clock:
echo global > /sys/kernel/debug/tracing/trace_clock
[ 1189.597665] ------------[ cut here ]------------
[ 1189.597665] WARNING: CPU: 0 PID: 1332 at kernel/trace/ring_buffer.c:2647 rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] Modules linked in: snd_hda_codec_realtek i915 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core coretemp snd_pcm e1000e lpc_ich mei_me ptp mei pps_core
[ 1189.597665] CPU: 0 PID: 1332 Comm: rtcwake Tainted: G U 4.13.0-rc2+ #5
[ 1189.597665] Hardware name: Hewlett-Packard HP Compaq 8000 Elite CMT PC/3647h, BIOS 786G7 v01.13 07/20/2011
[ 1189.597665] task: ffff88010dce9880 task.stack: ffffc900000c8000
[ 1189.597665] RIP: 0010:rb_handle_timestamp.isra.32+0x71/0x80
[ 1189.597665] RSP: 0018:ffffc900000cbab0 EFLAGS: 00010082
[ 1189.597665] RAX: 00000000000000e0 RBX: ffffc900000cbad0 RCX: 0000000000000004
[ 1189.597665] RDX: 0000000080000004 RSI: 0000000000000082 RDI: 00000000ffffffff
[ 1189.597665] RBP: ffffc900000cbac0 R08: 0000000000000000 R09: 00000000000000e0
[ 1189.597665] R10: ffffc900000cbbe0 R11: 00000000000ebc08 R12: ffff880117008890
[ 1189.597665] R13: ffff8801170088b0 R14: 00000000000003e8 R15: 0000000000000005
[ 1189.597665] FS: 00007f93d957e700(0000) GS:ffff88011bc00000(0000) knlGS:0000000000000000
[ 1189.597665] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1189.597665] CR2: 000000f00255b068 CR3: 00000000d9063000 CR4: 00000000000406f0
[ 1189.597665] Call Trace:
[ 1189.597665] ring_buffer_lock_reserve+0x1fa/0x350
[ 1189.597665] trace_vbprintk+0xdc/0x260
[ 1189.597665] ? __irq_disable+0x1b/0xc0
[ 1189.597665] __trace_bprintk+0x4a/0x60
[ 1189.597665] ? preempt_count_add+0x9e/0xb0
[ 1189.597665] __irq_disable+0x3f/0xc0
[ 1189.597665] irq_disable+0x17/0x20
[ 1189.597665] __disable_irq_nosync+0x59/0x70
[ 1189.597665] disable_hardirq+0x11/0x30
[ 1189.597665] hpet_msi_resume+0x85/0xd0
[ 1189.597665] clockevents_tick_resume+0x14/0x20
[ 1189.597665] tick_resume_local+0x32/0x60
[ 1189.597665] tick_resume+0x13/0x20
[ 1189.597665] timekeeping_resume+0x149/0x1a0
[ 1189.597665] syscore_resume+0x4b/0x190
[ 1189.597665] ? syscore_resume+0x4b/0x190
[ 1189.597665] suspend_devices_and_enter+0x6b9/0x810
[ 1189.597665] pm_suspend+0x367/0x540
[ 1189.597665] state_store+0x7e/0xf0
[ 1189.597665] kobj_attr_store+0xf/0x20
[ 1189.597665] sysfs_kf_write+0x37/0x40
[ 1189.597665] kernfs_fop_write+0x110/0x1a0
[ 1189.597665] __vfs_write+0x28/0x130
[ 1189.597665] ? __this_cpu_preempt_check+0x13/0x20
[ 1189.597665] ? __sb_start_write+0x55/0xe0
[ 1189.597665] vfs_write+0xb6/0x1a0
[ 1189.597665] SyS_write+0x46/0xb0
[ 1189.597665] entry_SYSCALL_64_fastpath+0x17/0x98
[ 1189.597665] RIP: 0033:0x7f93d90ac8f0
[ 1189.597665] RSP: 002b:00007ffd80a9ca88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 1189.597665] RAX: ffffffffffffffda RBX: 000000f00255a050 RCX: 00007f93d90ac8f0
[ 1189.597665] RDX: 0000000000000004 RSI: 000000f00255a060 RDI: 0000000000000007
[ 1189.597665] RBP: 00007f93d9375b00 R08: 000000f002557d90 R09: 00007f93d957e700
[ 1189.597665] R10: 00007f93d9375b58 R11: 0000000000000246 R12: 00007f93d9375b58
[ 1189.597665] R13: 00007f93d9375b58 R14: 000000000000270f R15: 0000000000001010
[ 1189.597665] Code: f0 48 8b 73 08 85 c0 48 8b 13 48 c7 c0 88 9a a5 81 49 c7 c0 1a 3b a8 81 4c 0f 44 c0 48 8b 0f 48 c7 c7 10 9b a5 81 e8 c0 c8 fa ff <0f> ff eb a7 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 56
[ 1189.597665] ---[ end trace 7d99ac836f161565 ]---

- printing out the trace from /sys/kernel/debug/tracing/trace


Tomi
--
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo