Re: [PATCH RFC 0/3] x86/fpu: defer FPU state loading until return to userspace

From: Ingo Molnar
Date: Tue Oct 18 2016 - 03:58:29 EST



* riel@xxxxxxxxxx <riel@xxxxxxxxxx> wrote:

> These patches defer FPU state loading until return to userspace.
>
> This has the advantage of not clobbering the FPU state of one task
> with that of another, when that other task only stays in kernel mode.
>
> It also allows us to skip the FPU restore in kernel_fpu_end(), which
> will help tasks that do multiple invocations of kernel_fpu_begin/end
> without returning to userspace, for example KVM VCPU tasks.
>
> We could also skip the restore of the KVM VCPU guest FPU state at
> guest entry time, if it is still valid, but I have not implemented
> that yet.
>
> The code that loads FPU context directly into registers from user
> space memory, or saves directly to user space memory, is wrapped
> in a retry loop, that ensures the FPU state is correctly set up
> at the start, and verifies that it is still valid at the end.
>
> I have stress tested these patches with various FPU test programs,
> and things seem to survive.
>
> However, I have not found any good test suites that mix FPU
> use and signal handlers. Close scrutiny of these patches would
> be appreciated.

BTW, for the next version it would be nice to also have a benchmark that shows
the advantages (and proves that it's not causing measurable overhead elsewhere).

Either an FPU-aware extension to 'perf bench sched' or a separate 'perf bench fpu'
suite would be nice.
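
As a rough illustration of the kind of measurement such a benchmark might
make, here is a user-space sketch in the spirit of the 'perf bench sched pipe'
ping-pong, with optional floating-point work on each round trip. It is not
part of perf; the names fpu_pingpong_bench() and touch_fpu() are hypothetical,
and whether this workload actually exercises the paths changed by the patches
is an assumption (the two tasks would likely need to be pinned to one CPU to
force a context switch on every iteration).

```c
/* Hypothetical sketch, not part of 'perf bench': two processes ping-pong
 * one byte over a pair of pipes, optionally doing floating-point work on
 * every iteration, and the parent times the whole exchange. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* One unit of floating-point work. The volatile access keeps the compiler
 * from optimizing the arithmetic away. */
static void touch_fpu(volatile double *acc)
{
	*acc = *acc * 1.000001 + 0.5;
}

/* Returns elapsed nanoseconds for 'iterations' round trips, or -1 on error. */
long long fpu_pingpong_bench(int iterations, int use_fpu)
{
	int to_child[2], to_parent[2];
	volatile double acc = 1.0;
	struct timespec start, end;
	char byte = 0;
	int status;
	pid_t pid;

	if (pipe(to_child) < 0)
		return -1;
	if (pipe(to_parent) < 0) {
		close(to_child[0]);
		close(to_child[1]);
		return -1;
	}

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: wait for a byte, optionally use the FPU, echo it back. */
		close(to_child[1]);
		close(to_parent[0]);
		for (int i = 0; i < iterations; i++) {
			if (read(to_child[0], &byte, 1) != 1)
				_exit(1);
			if (use_fpu)
				touch_fpu(&acc);
			if (write(to_parent[1], &byte, 1) != 1)
				_exit(1);
		}
		_exit(0);
	}

	/* Parent: drive the ping-pong and time it. */
	close(to_child[0]);
	close(to_parent[1]);
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < iterations; i++) {
		if (use_fpu)
			touch_fpu(&acc);
		if (write(to_child[1], &byte, 1) != 1)
			break;
		if (read(to_parent[0], &byte, 1) != 1)
			break;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	close(to_child[1]);
	close(to_parent[0]);

	/* A child that did not exit cleanly means the run was incomplete. */
	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	return (end.tv_sec - start.tv_sec) * 1000000000LL +
	       (end.tv_nsec - start.tv_nsec);
}
```

Comparing fpu_pingpong_bench(n, 1) against fpu_pingpong_bench(n, 0) on
patched and unpatched kernels would give a first-order view of any difference;
a real 'perf bench' suite would also want CPU pinning, repeated runs and
kernel-side FPU users in the mix.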

Thanks,

Ingo