Re: x86 memcpy performance
From: Andrew Lutomirski
Date: Mon Aug 15 2011 - 15:12:05 EST
On Mon, Aug 15, 2011 at 2:49 PM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Mon, 15 August, 2011 7:04 pm, Andrew Lutomirski wrote:
>>> Or, if we want to use SSE stuff in the kernel, we might think of
>>> allocating its own FPU context(s) and handle those...
>> I'm thinking of having a stack of FPU states to parallel irq stacks
>> and IST stacks.
> ... I'm guessing with the same nesting as hardirqs? Making FPU
> instructions usable in irq contexts too.
>> It gets a little hairy when code inside kernel_fpu_begin traps for a
>> non-irq non-IST reason, though.
> How does that happen? You're in the kernel with preemption disabled and
> TS cleared, what would cause the #NM? I think that if you need to switch
> context, you simply "push" the current FPU context, allocate a new one
> and clts as part of the FPU context switching, no?
Not #NM, but page faults can happen too (even just accessing vmalloc space).
>> Fortunately, those are rare and all of the EX_TABLE users could mark
>> xmm regs as clobbered (except for copy_from_user...).
> Well, copy_from_user... does a bunch of rep; movsq - if the SSE version
> shows reasonable speedup there, we might need to make those work too.
I'm a little surprised that SSE beats fast string operations, but I
guess benchmarking always wins.
>> Keeping kernel_fpu_begin non-preemptable makes it less bad because the
>> extra FPU state can be per-cpu and not per-task.
>> This is extra fun on 32 bit, which IIRC doesn't have IST stacks.
>> The major speedup will come from saving state in kernel_fpu_begin but
>> not restoring it until the code in entry_??.S restores registers.
> But you'd need to save each kernel FPU state when nesting, no?
Yes. But we don't nest that much, and the save/restore isn't all that
expensive. And we don't have to save/restore unless kernel entries
nest and both entries try to use kernel_fpu_begin at the same time.
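As a rough illustration of the nesting described above, here is a small
userspace model in C. It is a sketch only, not kernel code: struct cpu_model,
model_kernel_fpu_begin, model_kernel_fpu_end and MAX_NEST are all names made
up for this example, and a plain byte array stands in for the real FPU
registers. It assumes one save slot per nesting level on each CPU, and it
restores eagerly in the "end" call, so it does not model deferring the
restore to the entry code.

```c
#include <assert.h>
#include <string.h>

#define MAX_NEST 4           /* hypothetical nesting limit for this sketch */
#define FPU_STATE_BYTES 512  /* size of an FXSAVE area; XSAVE areas can be larger */

/* Stand-in for a saved FPU register image. */
struct fpu_state {
    unsigned char regs[FPU_STATE_BYTES];
};

/* Model of one CPU: its live FPU registers plus a per-CPU stack of save slots. */
struct cpu_model {
    struct fpu_state live;             /* what the "hardware" currently holds */
    struct fpu_state saved[MAX_NEST];  /* one slot per nesting level */
    int depth;                         /* number of active kernel FPU sections */
};

/*
 * Enter a kernel FPU section: push the current register contents so the
 * interrupted context (user code or an outer kernel FPU section) can be
 * restored later. Returns -1 if the per-CPU stack is full.
 */
int model_kernel_fpu_begin(struct cpu_model *cpu)
{
    if (cpu->depth >= MAX_NEST)
        return -1;
    cpu->saved[cpu->depth++] = cpu->live;
    return 0;
}

/* Leave a kernel FPU section: pop and restore the interrupted context's state. */
void model_kernel_fpu_end(struct cpu_model *cpu)
{
    assert(cpu->depth > 0);
    cpu->live = cpu->saved[--cpu->depth];
}
```

In this model a nested save only happens when an inner entry actually calls
model_kernel_fpu_begin while an outer section is active. An interrupt that
never touches the FPU pushes nothing, and because the stack belongs to the
CPU rather than to a task, it only works if the section cannot be preempted.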
This whole project may take a while. The code in there is a
poorly-documented mess, even after Hans' cleanups. (It's a lot worse
without them, though.)