Re: v4.20-rc1: list_del corruption on thinkpad x220

From: Joonas Lahtinen
Date: Wed Nov 21 2018 - 06:20:02 EST


+ Chris

Quoting Pavel Machek (2018-11-08 19:58:03)
> Hi!
>
> My machine locked hard (thinkpad x220). After reboot, I found this in
> syslog:
>
> Sounds like memory corruption..? Does not sound easy to debug.

Were you doing something GPU-intensive when you experienced the hard hang?

And if so, have you been able to hit the issue more than once? At this
point it doesn't look like anything we've hit previously, so it would be
great to have some more insight into how we could reproduce it.

There's a similar one for nouveau in Bugzilla, but it seems like a genuine
memory corruption (1 bit flipped):

https://bugs.freedesktop.org/show_bug.cgi?id=84880
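
A quick way to check for a single-bit flip is to XOR the expected and
observed pointer values and count the bits that differ. A minimal Python
sketch, using the two values from the list_del message in the quoted log
(the helper name differing_bits is made up for illustration):

```python
def differing_bits(expected: int, observed: int) -> list[int]:
    """Return the positions (0 = least significant) of the bits that differ."""
    diff = expected ^ observed
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Values from "prev->next should be ..., but was ..." in the quoted log.
expected = 0xffff8801742b8178
observed = 0xffffc9000192fec8

bits = differing_bits(expected, observed)
if len(bits) == 1:
    print(f"single-bit flip at bit {bits[0]}")
else:
    print(f"{len(bits)} bits differ, not a single-bit flip")
```

For these two values more than one bit differs, so this does not look
like the single-bit flip seen in the nouveau report.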

Any extra information would be of use :)

Regards, Joonas

PS. Could you open a bug in Bugzilla? It'll help to collect the
information in one consolidated place:

https://01.org/linuxgraphics/documentation/how-report-bugs

>
> ...otoh, it still looks like an address, so maybe it is "just" a race in
> the GPU drivers?
>
> Any ideas?
> Pavel
>
> Nov 8 18:35:01 duo CRON[28511]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> Nov 8 18:42:57 duo kernel: list_del corruption. prev->next should be ffff8801742b8178, but was ffffc9000192fec8
> Nov 8 18:42:57 duo kernel: ------------[ cut here ]------------
> Nov 8 18:42:57 duo kernel: kernel BUG at /data/fast/l/k/lib/list_debug.c:53!
> Nov 8 18:42:57 duo kernel: invalid opcode: 0000 [#1] SMP PTI
> Nov 8 18:42:57 duo kernel: CPU: 2 PID: 1082 Comm: i915/signal:1 Not tainted 4.20.0-rc1+ #3
> Nov 8 18:42:57 duo kernel: Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
> Nov 8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
> Nov 8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
> Nov 8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
> Nov 8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
> Nov 8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
> Nov 8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
> Nov 8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
> Nov 8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
> Nov 8 18:42:57 duo kernel: FS: 0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
> Nov 8 18:42:57 duo kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Nov 8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
> Nov 8 18:42:57 duo kernel: Call Trace:
> Nov 8 18:42:57 duo kernel: intel_breadcrumbs_signaler+0x162/0x330
> Nov 8 18:42:57 duo kernel: kthread+0x116/0x150
> Nov 8 18:42:57 duo kernel: ? intel_engine_wakeup+0x40/0x40
> Nov 8 18:42:57 duo kernel: ? kthread_park+0x90/0x90
> Nov 8 18:42:57 duo kernel: ret_from_fork+0x35/0x40
> Nov 8 18:42:57 duo kernel: Modules linked in:
> Nov 8 18:42:57 duo kernel: ---[ end trace 2f8da183a56f80f6 ]---
> Nov 8 18:42:57 duo kernel: RIP: 0010:__list_del_entry_valid+0x8e/0x90
> Nov 8 18:42:57 duo kernel: Code: 66 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 90 74 5e 85 e8 53 88 d1 ff 0f 0b 48 89 fe 31 c0 48 c7 c7 c8 74 5e 85 e8 40 88 d1 ff <0f> 0b 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 48 8b 32 48
> Nov 8 18:42:57 duo kernel: RSP: 0000:ffffc9000196be78 EFLAGS: 00210086
> Nov 8 18:42:57 duo kernel: RAX: 0000000000000054 RBX: ffff8801742b8178 RCX: 0000000000000000
> Nov 8 18:42:57 duo kernel: RDX: 0000000000000000 RSI: ffff88019e2a53d8 RDI: ffff88019e2a53d8
> Nov 8 18:42:57 duo kernel: RBP: ffffc9000196be78 R08: ffff880196e2cd10 R09: 0000000000000000
> Nov 8 18:42:57 duo kernel: R10: 00000000e7684eb9 R11: 3863656632393101 R12: ffffc9000196bec8
> Nov 8 18:42:57 duo kernel: R13: ffff88019707e000 R14: ffff8801742b8080 R15: ffffc9000192fdd0
> Nov 8 18:42:57 duo kernel: FS: 0000000000000000(0000) GS:ffff88019e280000(0000) knlGS:0000000000000000
> Nov 8 18:42:57 duo kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Nov 8 18:42:57 duo kernel: CR2: 00000000ed2bf000 CR3: 000000000581e001 CR4: 00000000000606a0
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html