Re: x86 memcpy performance

From: Andy Lutomirski
Date: Mon Aug 15 2011 - 10:59:38 EST

On 08/15/2011 10:55 AM, Borislav Petkov wrote:
On Mon, 15 August, 2011 3:27 pm, melwyn lobo wrote:
Was on a vacation for last two days. Thanks for the good insights into
the issue.
Ingo, unfortunately the data we have is on a soon to be released
platform and strictly confidential at this stage.

Boris, thanks for the patch. On seeing your patch:
+void *__sse_memcpy(void *to, const void *from, size_t len)
+ unsigned long src = (unsigned long)from;
+ unsigned long dst = (unsigned long)to;
+ void *p = to;
+ int i;
+ if (in_interrupt())
+ return __memcpy(to, from, len)
So what is the reason we cannot use sse_memcpy in interrupt context.
(fpu registers not saved ? )

Because, AFAICT, when we handle an #NM exception while running
sse_memcpy in an IRQ handler, we might need to allocate FPU save state
area, which in turn, can sleep. Then, we might get another IRQ while
sleeping and we should be deadlocked.

But let me stress on the "AFAICT" above, someone who actually knows the
FPU code should correct me if I'm missing something.

I don't think you ever get #NM as a result of kernel_fpu_begin, but you can certainly have problems when kernel_fpu_begin nests by accident. There's irq_fpu_usable() for this.

(irq_fpu_usable() reads cr0 sometimes and I suspect it can be slow.)

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at