Re: BUG: unable to handle kernel paging request at ffffe8ff7fc00001

From: Thomas Gleixner
Date: Mon Nov 16 2015 - 03:54:19 EST


On Sun, 15 Nov 2015, Linus Torvalds wrote:
> On Sun, Nov 15, 2015 at 2:28 PM, Kyle Sanderson <kyle.leet@xxxxxxxxx> wrote:
> > [] BUG: unable to handle kernel paging request at ffffe8ff7fc00001
> > [] IP: [<ffffffff810a174f>] kstat_irqs+0x4f/0x90
> > [] CPU: 2 PID: 1078 Comm: usage.pl Not tainted 4.1.7-hardened-r1 #1
> > [] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 1.0b 04/21/2015
> RSI: 000060f700000001
> > [] Call Trace:
> > [] [<>] kstat_irqs_usr+0x1e/0x40

> The code ends up being
>
> mov 0x48(%r13),%rsi
> mov __per_cpu_offset(,%rcx,8),%rcx
> add (%rsi,%rcx,1),%ebx <-- trapping instruction
>
> which is just the
>
> sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>
> part of kstat_irqs().
>
> Your registers being
>
> RSI: 000060f700000001
> RCX: ffff88087fc00000
>
> and it's RSI that makes no sense - RCX looks like a real kernel
> pointer. So it looks like it's the "desc->kstat_irqs" thing that is
> for some reason garbage.
>
> I don't see any sane possible reason this would happen, though.
> Thomas, does this look like anything you've seen before?

No. What's strange is that this explodes while reading
/proc/interrupts, yet it did not happen when interrupt accounting took
place.

Though this looks like memory corruption, and the descriptor might
belong to an interrupt which fired only at boot time, i.e. before the
corruption happened.

No idea how to decode that. Kyle, is that reproducible?
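
For what it's worth, the fault address is at least consistent with the
registers quoted above. A minimal userspace sketch, assuming that
per_cpu_ptr() boils down to "base pointer plus per-CPU offset". The
OOPS_* macros and both helpers are made up for illustration, they are
not kernel API:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the oops quoted above. */
#define OOPS_RSI   UINT64_C(0x000060f700000001) /* desc->kstat_irqs */
#define OOPS_RCX   UINT64_C(0xffff88087fc00000) /* __per_cpu_offset[cpu] */
#define OOPS_FAULT UINT64_C(0xffffe8ff7fc00001) /* address in the BUG line */

/*
 * Hypothetical model of per_cpu_ptr(): the per-CPU base plus the
 * offset of the given CPU, with the usual 64-bit wraparound.
 */
static inline uint64_t percpu_addr(uint64_t base, uint64_t cpu_offset)
{
	return base + cpu_offset;
}

/*
 * The counter is read with a 32-bit add, so a sane address for it
 * would normally be 4-byte aligned. Hypothetical helper.
 */
static inline bool counter_aligned(uint64_t addr)
{
	return (addr & 3) == 0;
}
```

percpu_addr(OOPS_RSI, OOPS_RCX) gives OOPS_FAULT, so the trapping
access is exactly RSI + RCX. counter_aligned(OOPS_RSI) is false: the
low bit is set, which would be unusual for a pointer to a 4-byte
counter and fits the theory that desc->kstat_irqs got scribbled on.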

Thanks,

tglx
