Re: crash in entry.S restore_all, 2.6.12-rc2, x86, PAGEALLOC

From: Ingo Molnar
Date: Thu Apr 07 2005 - 03:17:24 EST



* Stas Sergeev <stsp@xxxxxxxx> wrote:

> ENTRY(sysenter_entry)
> movl TSS_sysenter_esp0(%esp),%esp
> sysenter_past_esp:
> - sti
> pushl $(__USER_DS)
> pushl %ebp
> + sti

ah, yes, sysenter. SYSENTER creates a degenerate 'small' stackframe with
an esp0 that is missing the 5 entry words relative to the normal entry
(int80 or irq) esp0 stackframe. These 5 words are: xss, esp, eflags,
xcs, eip. The sysenter code sets them up manually.

now if an interrupt hits at this point, it will set up a 'same privilege
level' stackframe, which has eip/xcs/eflags, i.e. no esp/xss. If upon
irq-return we then examine the stack due to your patch, it will be an
incorrect stackframe -> kaboom.

your patch doesnt remove the condition, it only removes the crash,
because it adds the 2 words space that is needed - but the information
relied on by your irq-return test is still bogus. At this point i'd
suggest to remove the ESP patch altogether.

the correct solution is to always let the sysenter path set up a full
and correct stackframe, before allowing preemption (see the attached
patch). This was a nasty bug in the waiting. (I have not made this
conditional on CONFIG_PREEMPT, to keep it simple and because the impact
to irq latency is small and predictable. There's no runtime overhead.)

so i think with the help of Stas the mystery has been fully explained
and solved. Linus?

Ingo

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

--- linux/arch/i386/kernel/entry.S.orig
+++ linux/arch/i386/kernel/entry.S
@@ -179,12 +179,17 @@ need_resched:
ENTRY(sysenter_entry)
movl TSS_sysenter_esp0(%esp),%esp
sysenter_past_esp:
- sti
+ #
+ # irqs are disabled: set up an entry stackframe without
+ # allowing irqs to potentially preempt us with an
+ # incomplete entry frame!
+ #
pushl $(__USER_DS)
pushl %ebp
pushfl
pushl $(__USER_CS)
pushl $SYSENTER_RETURN
+ sti

/*
* Load the potential sixth argument from user stack.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/