Re: x86 memcpy performance

From: Borislav Petkov
Date: Mon Aug 15 2011 - 11:30:04 EST

On Mon, 15 August, 2011 4:59 pm, Andy Lutomirski wrote:
>>> So what is the reason we cannot use sse_memcpy in interrupt context.
>>> (fpu registers not saved ? )
>> Because, AFAICT, when we handle an #NM exception while running
>> sse_memcpy in an IRQ handler, we might need to allocate FPU save state
>> area, which in turn, can sleep. Then, we might get another IRQ while
>> sleeping and we should be deadlocked.
>> But let me stress on the "AFAICT" above, someone who actually knows the
>> FPU code should correct me if I'm missing something.
> I don't think you ever get #NM as a result of kernel_fpu_begin, but you
> can certainly have problems when kernel_fpu_begin nests by accident.
> There's irq_fpu_usable() for this.
> (irq_fpu_usable() reads cr0 sometimes and I suspect it can be slow.)

Oh, I didn't know about irq_fpu_usable(), thanks.

But irq_fpu_usable() still checks !in_interrupt(), which means that we
don't want to run SSE instructions in IRQ context. OTOH, we are still
fine when running with CR0.TS set. So what happens when we get an #NM
as a result of executing an FPU instruction in an IRQ handler? We will
have to do init_fpu() on the current task if it hasn't used math yet
and do the slab allocation of the FPU context area (I'm looking at
math_state_restore(), btw).
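
To make the condition concrete, here is a rough userspace model of the
check as I read it. This is only a sketch: the struct and the model_*
names are made up for illustration, and the get_irq_regs()/user_mode()
terms are my assumption about what the helper looks at besides
in_interrupt() and CR0.TS, so check it against the real source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the state the check consults; not a kernel type. */
struct fpu_ctx_model {
	bool in_interrupt;     /* what in_interrupt() would report */
	bool have_irq_regs;    /* assumed: get_irq_regs() returned non-NULL */
	bool interrupted_user; /* assumed: user_mode(regs) of interrupted context */
	bool cr0_ts_set;       /* CR0.TS is set */
};

/*
 * Model of the decision: FPU use is considered safe outside interrupt
 * context, or when the interrupted context cannot own live kernel FPU
 * state (no regs, or user mode), or when CR0.TS is set.
 */
bool model_irq_fpu_usable(const struct fpu_ctx_model *c)
{
	return !c->in_interrupt || !c->have_irq_regs ||
	       c->interrupted_user || c->cr0_ts_set;
}

/* A few cases from the discussion above, checked with assert(). */
void model_example_cases(void)
{
	struct fpu_ctx_model process_ctx = { .in_interrupt = false };
	struct fpu_ctx_model irq_over_kernel = {
		.in_interrupt = true, .have_irq_regs = true,
	};
	struct fpu_ctx_model irq_with_ts = {
		.in_interrupt = true, .have_irq_regs = true, .cr0_ts_set = true,
	};

	assert(model_irq_fpu_usable(&process_ctx));      /* not in IRQ: fine */
	assert(!model_irq_fpu_usable(&irq_over_kernel)); /* IRQ over kernel, TS clear */
	assert(model_irq_fpu_usable(&irq_with_ts));      /* IRQ, but CR0.TS set */
}
```

The model only covers whether the check lets us in. It says nothing
about the #NM path itself, which is where the allocation question comes
from.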