Re: 2.6.17-rc2-mm1

From: Andi Kleen
Date: Wed May 03 2006 - 02:49:37 EST


On Wednesday 03 May 2006 08:47, Jan Beulich wrote:
> >>> Andi Kleen <ak@xxxxxxx> 02.05.06 22:09 >>>
> >On Tuesday 02 May 2006 22:00, Martin Bligh wrote:
> >
> >> > Index: linux/arch/x86_64/kernel/traps.c
> >> > ===================================================================
> >> > --- linux.orig/arch/x86_64/kernel/traps.c
> >> > +++ linux/arch/x86_64/kernel/traps.c
> >> > @@ -238,6 +238,7 @@ void show_trace(unsigned long *stack)
> >> > HANDLE_STACK (stack < estack_end);
> >> > i += printk(" <EOE>");
> >> > stack = (unsigned long *) estack_end[-2];
> >> > + printk("new stack %lx (%lx %lx %lx %lx %lx)\n", stack, estack_end[0], estack_end[-1], estack_end[-2], estack_end[-3], estack_end[-4]);
> >> > continue;
> >> > }
> >> > if (irqstack_end) {
> >>
> >> Thanks for running this Andy:
> >>
> >> http://test.kernel.org/abat/30183/debug/console.log
> >
> >
> ><EOE>new stack 0 (0 0 0 10082 10)
>
> Looks like <rubbish> <SS> <RSP> <RFLAGS> <CS> to me, ...

Hmm, right.

> >Hmm weird. There isn't anything resembling an exception frame at the top of the
> >stack. No idea how this could happen.
>
> ... which is a valid frame where the stack pointer was corrupted before the exception occurred. One more printed item
> (or rather, starting items at estack_end[-1]) would allow at least seeing what RIP this came from.

Andy, can you add that please and check?
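
For reference, a rough sketch of the layout Jan is describing, assuming the hardware frame sits at the very top of the exception stack so that estack_end[-1] is SS and estack_end[-5] is RIP. It is a plain user-space illustration, not kernel code; struct hw_frame, decode_hw_frame() and print_hw_frame() are made-up names.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the five words the CPU pushes on a
 * 64-bit exception. Not a structure from traps.c. */
struct hw_frame {
	uint64_t rip, cs, rflags, rsp, ss;
};

/* estack_end points one word past the last word of the exception stack.
 * SS is pushed first, so it ends up at index -1; RIP is pushed last of
 * the five and ends up at index -5. Assumes the frame really is at the
 * top of the stack. */
static struct hw_frame decode_hw_frame(const uint64_t *estack_end)
{
	struct hw_frame f;

	assert(estack_end != NULL);
	f.ss     = estack_end[-1];
	f.rsp    = estack_end[-2];
	f.rflags = estack_end[-3];
	f.cs     = estack_end[-4];
	f.rip    = estack_end[-5];
	return f;
}

/* Print the decoded frame in push order, RIP last. */
static void print_hw_frame(const struct hw_frame *f)
{
	printf("SS %" PRIx64 " RSP %" PRIx64 " RFLAGS %" PRIx64
	       " CS %" PRIx64 " RIP %" PRIx64 "\n",
	       f->ss, f->rsp, f->rflags, f->cs, f->rip);
}
```

With the values from the log above this gives SS 0, RSP 0, RFLAGS 10082 and CS 10; the one word the debug printk does not reach is estack_end[-5], the RIP.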

Also, worst case, one could dump the last branch pointers. AMD unfortunately only has four;
on Intel, with 16, it's easier.

I can provide a patch for that if needed.

> This actually points out another weakness of that code: if you pick up a mis-aligned stack pointer then the conditions
> in both the exception and interrupt stack invocations of HANDLE_STACK() won't prevent you from accessing an item
> crossing a page boundary, and hence potentially faulting.

Yes it probably should check for that.
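
Something along these lines, perhaps. This is only a user-space sketch of the check, not a patch against HANDLE_STACK(); stack_word_ok() is a made-up name, and it assumes the stack bounds are known as a half-open range [start, end).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if one 8-byte word at addr can be read
 * without leaving [start, end). A mis-aligned addr is rejected outright,
 * since a word starting there could straddle a page boundary even when
 * addr itself is inside the range. */
static bool stack_word_ok(uintptr_t addr, uintptr_t start, uintptr_t end)
{
	assert(start <= end);

	if (addr & (sizeof(uint64_t) - 1))
		return false;	/* not word-aligned */

	return addr >= start && addr < end &&
	       end - addr >= sizeof(uint64_t);
}
```

For a stack covering 0x1000..0x2000, address 0x1ff8 passes while 0x1ffc is rejected, even though both satisfy a plain "stack < end" comparison.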

> Similarly, obtaining an entirely bad stack pointer anywhere in
> that code will result in a fault. I guess the stack reads should really be done using get_user() or some other code
> having recovery attached.

That can cause recursive exceptions. I'm a bit paranoid with that.

-Andi