Re: [PATCH 0/15] x86/xsaves: Optimize xstate context switch by xsaves/xrstors

From: Andy Lutomirski
Date: Mon May 26 2014 - 17:40:33 EST


On Mon, May 26, 2014 at 1:13 PM, Yu, Fenghua <fenghua.yu@xxxxxxxxx> wrote:
>> From: Andy Lutomirski [mailto:luto@xxxxxxxxxxxxxx]
>> On 05/26/2014 10:01 AM, Fenghua Yu wrote:
>> > From: Fenghua Yu <fenghua.yu@xxxxxxxxx>
>> >
>> > With ever growing extended state registers (xstate) on x86
>> > processors, kernel needs to cope with issue of growing memory space
>> > occupied by xstate. The xsave area is holding more and more xstate
>> > registers, growing from legacy FP and SSE to AVX, AVX2, AVX-512, MPX,
>> > and Intel PT.
>> >
>> > The recently introduced compacted format of xsave area saves xstates
>> > only for enabled states. This patch set saves the xsave area space
>> > per process in compacted format by xsaves/xrstors instructions.
>>
>> Are we going to want to encourage userspace to do something like
>> sticking vzeroupper right before each syscall to make any
>> xsaves/xrstors faster?
>
> This patch set allows compacted format in kernel and standard format
> in user space. This works fine for both kernel and user applications.

My question is purely about optimization: if userspace does a blocking
system call, will it be significantly faster if userspace zeros out as
much of the extended state as possible before doing the system call?

I think I tried this once with xsaveopt and decided that it didn't
make much of a difference.
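
For concreteness, here is a minimal sketch of the userspace pattern in
question: clear the upper YMM state with vzeroupper, then make a blocking
system call. It assumes x86-64 with GCC or Clang. The helper names
clear_upper_ymm() and sleep_ms_after_clearing() are made up for
illustration, and nanosleep() only stands in for any blocking call.
Whether the extra instruction buys anything measurable is exactly the
open question, so treat this as an illustration, not a recommendation.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical helper: zero the upper 128 bits of every YMM register if
 * the CPU and OS support AVX. Returns 1 if vzeroupper ran, 0 otherwise.
 */
static int clear_upper_ymm(void)
{
#if defined(__x86_64__)
    if (__builtin_cpu_supports("avx")) {
        /* The clobber list tells the compiler that every vector
         * register may have been modified by the instruction. */
        __asm__ volatile("vzeroupper"
                         :
                         :
                         : "xmm0", "xmm1", "xmm2", "xmm3",
                           "xmm4", "xmm5", "xmm6", "xmm7",
                           "xmm8", "xmm9", "xmm10", "xmm11",
                           "xmm12", "xmm13", "xmm14", "xmm15");
        return 1;
    }
#endif
    return 0;
}

/*
 * Hypothetical helper: clear the upper YMM state, then block in the
 * kernel for ms milliseconds. Returns whatever nanosleep() returns.
 */
static int sleep_ms_after_clearing(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;

    clear_upper_ymm();            /* drop upper YMM state first */
    return nanosleep(&ts, NULL);  /* blocking system call */
}
```

Answering the performance question would need a real benchmark, for
example timing many blocking calls with and without the
clear_upper_ymm() step on the same machine.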

--Andy