Re: GPF in __d_lookup_rcu after hibernate

From: Al Viro
Date: Sat Mar 19 2016 - 15:32:20 EST


On Sat, Mar 19, 2016 at 12:46:04PM -0400, Johan Hovold wrote:
> Hi,
>
> After updating to 4.4.5 I keep hitting a GPF in __d_lookup_rcu after
> resuming from suspend-to-disk:

[snip]

Hash chain leading into 0x000b0000000a0000, which is not a valid pointer.

> 4.4.6 has the same problem as can be seen above, and I just discovered I
> had saved a log from 4.3.4 which also appears to suffer from this:

Hash chain leading into 0x009b0000009a0000, which is not a valid pointer.

> When this happens only a forced cold boot appears to make the machine
> usable again (further look-ups keep failing).
>
> Looking at these logs now I realised I had reloaded the uvcvideo module so
> to be sure I just suspended without touching that module and hit another
> GFP when relaunching firefox after resume:

> [ 2741.976850] general protection fault: 0000 [#1] PREEMPT SMP
> [ 2741.976957] Modules linked in: intel_rapl iosf_mbi uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core
> [ 2741.977159] CPU: 3 PID: 8199 Comm: firefox Not tainted 4.4.6 #130
> [ 2741.977256] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 940X3G/NP940X3G-K03SE, BIOS P02ACJ.101.130926.dg 09/26/2013
> [ 2741.977425] task: ffff8801edcdaf00 ti: ffff8800d41b0000 task.ti: ffff8800d41b0000
> [ 2741.977544] RIP: 0010:[<ffffffff8119d840>] [<ffffffff8119d840>] kernfs_iop_follow_link+0x70/0x1a0
> [ 2741.977696] RSP: 0018:ffff8800d41b3e98 EFLAGS: 00010286
> [ 2741.977780] RAX: ffff8801edcdaf00 RBX: ffff88021687e4b0 RCX: 0000000000000000
> [ 2741.977893] RDX: 008f0000008e0000 RSI: 0000000000000000 RDI: ffffffff81c324e0
> [ 2741.978007] RBP: ffff8800d53b6000 R08: ffffffff81ab6c29 R09: ffffea000354edc0
> [ 2741.978120] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800d53b6000
> [ 2741.978233] R13: 00007ffd3ccaed80 R14: ffff8800da802870 R15: 008f0000008e0000
> [ 2741.978347] FS: 00007ff343d65780(0000) GS:ffff88021fb80000(0000) knlGS:0000000000000000
> [ 2741.978475] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2741.978573] CR2: 00007ff343d3f070 CR3: 00000000da8be000 CR4: 00000000001406e0
> [ 2741.978665] Stack:
> [ 2741.978698] ffff8800d41b3ed8 ffff8802148f98c0 00007ff3429314d0 0000000000000063
> [ 2741.978804] 00007ffd3ccaed80 ffff8802148f98c0 00007ff3429314d0 ffffffff81147926
> [ 2741.978908] 00000000ffffff9c 00000000ffffffea 0000000000004000 00000000ffffff9c
> [ 2741.979012] Call Trace:
> [ 2741.979047] [<ffffffff81147926>] ? generic_readlink+0x56/0x70
> [ 2741.979126] [<ffffffff8114366d>] ? SyS_readlinkat+0x8d/0x100
> [ 2741.979203] [<ffffffff817a0b57>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
> [ 2741.979289] Code: 24 c3 81 4c 89 e5 48 8b 58 08 4c 8b 70 40 e8 38 17 60 00 48 83 7b 08 00 74 35 4d 8b 7e 08 4c 89 fa eb 08 48 39 da 74 2b 48 89 ca <48> 8b 4a 08 48 85 c9 75 ef 48 39 d3 74 1a c7 45 00 2e 2e 2f 00
> [ 2741.979650] RIP [<ffffffff8119d840>] kernfs_iop_follow_link+0x70/0x1a0
> [ 2741.979723] RSP <ffff8800d41b3e98>
> [ 2742.000022] ---[ end trace 264065d347b49d9e ]---
>
> I found the "fs: NULL deref in atime_needs_update" thread
>
> https://lkml.kernel.org/r/20160228170133.GM17997@xxxxxxxxxxxxxxxxxx
>
> after a quick search but can't say if its related.

Shouldn't be. It's already past the pathname resolution; by the time
we hit generic_readlink(), we are through with pathname handling.

Hard to tell without your .config, but at a guess that's
while (kn->parent && base != kn)
kn = kn->parent;
in kernfs_get_target_path() running into kn equal to 0x008f0000008e0000,
which is not a valid pointer.

Note that all of those are of the same pattern:
00 00 N 00 00 00 N+1 00
where a pointer should've been. In these traces we'd seen N equal to 0xa,
0x9a and 0x8e. Hell knows what it is, but the patterns are too similar to
be a coincidence; it's the same kind of memory corruption. Have it hit
a dentry and you've got yourself a persistent oops in dcache hash chain
traversals.

FWIW, it might be a single table of that form, with the previous pointer
in the chain corrupted so it points into it. Hell knows... AFAICS,
by that point the previous addresses are already lost, both in __d_lookup_rcu()
and kernfs_get_target_path() cases.

Might make sense to trawl for such values in the suspended images. Or,
if you really can reproduce it quickly enough, try to add
u64 v = (u64) node->next;

if (unlikely((v >> 32) == (u32)v + 0x10000)) {
int i;
printk(KERN_ERR "buggered at %p", dentry);
for (i = 0; i < sizeof(struct dentry)/4; i++)
printk(" %08x", ((int *)dentry)[i]);
printk("\n");
}
just before seqretry: label in __d_lookup_rcu() and try to trigger that thing.
Might give a better idea of what's going on...