Re: crash in entry.S restore_all, 2.6.12-rc2, x86, PAGEALLOC

From: Linus Torvalds
Date: Thu Apr 07 2005 - 09:48:53 EST




On Thu, 7 Apr 2005, Andrew Morton wrote:
>
> Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > --- linux/arch/i386/kernel/entry.S.orig
> > +++ linux/arch/i386/kernel/entry.S
> > @@ -179,12 +179,17 @@ need_resched:
> > ENTRY(sysenter_entry)
> > movl TSS_sysenter_esp0(%esp),%esp
> > sysenter_past_esp:
> > - sti
> > + #
> > + # irqs are disabled: set up an entry stackframe without
> > + # allowing irqs to potentially preempt us with an
> > + # incomplete entry frame!
> > + #
> > pushl $(__USER_DS)
> > pushl %ebp
> > pushfl
> > pushl $(__USER_CS)
> > pushl $SYSENTER_RETURN
> > + sti
> >
>
> Well that bites the big one.
>>
> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0xc015c4ba in __find_get_block (bdev=0xc18cd988, block=2818614, size=4096) at fs/buffer.c:1371
> 1371 BUG_ON(irqs_disabled());

Yes. With the change in place "entry.S" will always save the wrogn EIP, so
we'll return with interrupts disabled.

Your suggested patch is pretty horrid, though. It's sufficient to enable
interrupts after the two first pushes, since that has already now set up
enough of a kernel stack that any subsequent interrupt will always have at
least a "fake" SS/ESP (ie some values there, although not anything
relevant).

So the sysenter sequence might as well look like

pushl $(__USER_DS)
pushl %ebp
sti
pushfl
..

which actually does three protected pushes thanks to the one-instruction
"interrupt shadow" after an sti.

Another alternative (and to some degree a maybe a better one) is to not
use "pushfl" at all in the sysenter sequence, but instead just push a
fixed default value. At that point, the "sti" can stay where it was.

After all, I very strongly suspect that we don't actually really _care_ if
eflags stays the same over a system call, and I could see that some
dynamic CPU's might prefer not having to push an eflags value that just
got changed by the "sti", so it _might_ save several cycles to avoid that
dependency, and also obviously avoid a subtle dependency on a sw level
that the previous patch clearly introduced.

Anybody willing to time it? ;)

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/