Re: Dcache oops

From: Al Viro
Date: Fri Jun 03 2016 - 16:08:23 EST

On Fri, Jun 03, 2016 at 02:35:41PM -0400, Oleg Drokin wrote:

> >> [ 2642.364383] BUG: unable to handle kernel paging request at ffff880113f82000
> >> [ 2642.365014] IP: [<ffffffff817f87d4>] bad_gs+0xd1d/0x1ba9
> >
> > *ow*
> > Could you dump your vmlinux (and somewhere on anonftp?
> > This 'bad_gs' is there simply because it's one of the few labels in
> > .fixup - to say anything useful we'll need to find out where we'd
> > really come from.
> I see.
> vmlinux with debug symbols:

ffffffff817f87cd:  48 8d 0a          lea    (%rdx),%rcx
ffffffff817f87d0:  48 83 e1 f8       and    $0xfffffffffffffff8,%rcx
ffffffff817f87d4:  4c 8b 01          mov    (%rcx),%r8
ffffffff817f87d7:  8d 0a             lea    (%rdx),%ecx
ffffffff817f87d9:  83 e1 07          and    $0x7,%ecx
ffffffff817f87dc:  c1 e1 03          shl    $0x3,%ecx
ffffffff817f87df:  49 d3 e8          shr    %cl,%r8
ffffffff817f87e2:  e9 9b b3 a4 ff    jmpq   ffffffff81243b82 <__d_lookup+0x132>

Aha... It's load_unaligned_zeropad() from dentry_string_cmp(), hitting
a genuinely unmapped address. That sends it into the fixup, where it tries
to load the aligned word containing the address in question, in the hope
that the fault came from an attempt to cross into the next page. No such
luck: the address was aligned in the first place (it's in %rdx,
0xffff880113f82000), so the fixup retries the very same load and we still
oops.
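
To make the arithmetic in that fixup concrete, here is a minimal userspace
sketch of what the instructions above compute. It is not the kernel's code;
the function names are made up for illustration, and it only does the address
math, never dereferencing anything. It assumes a little-endian 64-bit machine,
as on x86-64.

```c
#include <stdint.h>

/* Hypothetical helper: the address the fixup reloads from, i.e. the
 * faulting address rounded down to an 8-byte boundary
 * (mirrors "and $0xfffffffffffffff8,%rcx"). */
uint64_t fixup_aligned_addr(uint64_t addr)
{
	return addr & ~(uint64_t)7;
}

/* Hypothetical helper: how many bits the reloaded word is shifted right,
 * i.e. byte offset within the word times 8
 * (mirrors "and $0x7,%ecx; shl $0x3,%ecx"). */
unsigned fixup_shift_bits(uint64_t addr)
{
	return (unsigned)(addr & 7) << 3;
}

/* Hypothetical helper: the value the fixup hands back, given the aligned
 * word it managed to load. On little-endian the shift drops the bytes
 * below addr and zero-fills the top (mirrors "shr %cl,%r8"). */
uint64_t fixup_value(uint64_t aligned_word, uint64_t addr)
{
	return aligned_word >> fixup_shift_bits(addr);
}

/* The retry only touches a different location than the original load
 * when the original address was not already 8-byte aligned. */
int fixup_retry_differs(uint64_t addr)
{
	return fixup_aligned_addr(addr) != addr;
}
```

For the address in this oops, fixup_aligned_addr(0xffff880113f82000) is the
address itself and the shift is 0, so fixup_retry_differs() is false: the
retry is the same load and faults again. For something like 0x1ffd the retry
would instead read from 0x1ff8 and shift right by 40 bits.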

The unexpected part is that the unmapped address did *NOT* come from a dentry;
it's the .name of the qstr we were looking for. And your call chain was
__d_lookup() <- d_lookup() <- lookup_open(), so in lookup_open() it was

Can the same thing be reproduced (with NFS fix) on v4.6, ede4090, 7f427d3,