Re: [PATCH 1/1] x86/fpu: math_state_restore() should not blindly disable irqs

From: Linus Torvalds
Date: Fri Mar 06 2015 - 12:33:52 EST


On Thu, Mar 5, 2015 at 11:58 PM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> math_state_restore() was historically called with irqs disabled,
> because that's how the hardware generates the trap, and also because
> back in the days it was possible for it to be an asynchronous
> interrupt and interrupt handlers run with irqs off.
>
> These days it's always an instruction trap, and furthermore it does
> inevitably complex things such as memory allocation and signal
> processing, which is not done with irqs disabled.
>
> So keep irqs enabled.

I agree with the "keep irqs enabled".

However, I do *not* agree with the actual patch, which doesn't do that at all.
> @@ -844,8 +844,9 @@ void math_state_restore(void)
> {
> struct task_struct *tsk = current;
>
> + local_irq_enable();
> +

There's a big difference between "keep interrupts enabled" (ok) and
"explicitly enable interrupts in random contexts" (*NOT* ok).

So get rid of the "local_irq_enable()" entirely, and replace it with a

WARN_ON_ONCE(irqs_disabled());

and let's just fix the cases where this actually gets called with
interrupts off. In particular, let's just make the
device_not_available thing use a trap gate, not an interrupt gate. And
then remove the "conditional_sti()" stuff.

IOW, I think the starting point should be something like the attached
(which doesn't do the WARN_ON_ONCE() - it should be added for
debugging).

*NOT* some kind of "let's just enable interrupts blindly" approach.

This is completely untested, of course. But I don't see why we should
use an interrupt gate for this. Is there anything in
"exception_enter()" that requires it?

Linus
arch/x86/kernel/traps.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9d2073e2ecc9..f045ac026ff1 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -836,16 +836,12 @@ asmlinkage __visible void __attribute__((weak)) smp_threshold_interrupt(void)
*
* Careful.. There are problems with IBM-designed IRQ13 behaviour.
* Don't touch unless you *really* know how it works.
- *
- * Must be called with kernel preemption disabled (eg with local
- * local interrupts as in the case of do_device_not_available).
*/
void math_state_restore(void)
{
struct task_struct *tsk = current;

if (!tsk_used_math(tsk)) {
- local_irq_enable();
/*
* does a slab alloc which can sleep
*/
@@ -856,7 +852,6 @@ void math_state_restore(void)
do_group_exit(SIGKILL);
return;
}
- local_irq_disable();
}

/* Avoid __kernel_fpu_begin() right after __thread_fpu_begin() */
@@ -884,18 +879,13 @@ do_device_not_available(struct pt_regs *regs, long error_code)
if (read_cr0() & X86_CR0_EM) {
struct math_emu_info info = { };

- conditional_sti(regs);
-
info.regs = regs;
math_emulate(&info);
exception_exit(prev_state);
return;
}
#endif
- math_state_restore(); /* interrupts still off */
-#ifdef CONFIG_X86_32
- conditional_sti(regs);
-#endif
+ math_state_restore();
exception_exit(prev_state);
}
NOKPROBE_SYMBOL(do_device_not_available);
@@ -959,7 +949,7 @@ void __init trap_init(void)
set_system_intr_gate(X86_TRAP_OF, &overflow);
set_intr_gate(X86_TRAP_BR, bounds);
set_intr_gate(X86_TRAP_UD, invalid_op);
- set_intr_gate(X86_TRAP_NM, device_not_available);
+ set_trap_gate(X86_TRAP_NM, device_not_available);
#ifdef CONFIG_X86_32
set_task_gate(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS);
#else