Re: [RFC PATCH 0/3] kernel: add support for 256-bit IO access
From: Andy Lutomirski
Date: Wed Mar 21 2018 - 11:46:31 EST
On Wed, Mar 21, 2018 at 6:32 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>> And even if you ignore that "maintenance problems down the line" issue
>> ("we can fix them when they happen") I don't want to see games like
>> this, because I'm pretty sure it breaks the optimized xsave by tagging
>> the state as being dirty.
>
> That's true - and it would penalize the context switch cost of the affected task
> for the rest of its lifetime, as I don't think there's much that clears XINUSE
> other than a FINIT, which is rarely done by user-space.
>
>> So no. Don't use vector stuff in the kernel. It's not worth the pain.
>
> I agree, but:
>
>> The *only* valid use is pretty much crypto, and even there it has had issues.
>> Benchmarks use big arrays and/or dense working sets etc to "prove" how good the
>> vector version is, and then you end up in situations where it's used once per
>> fairly small packet for an interrupt, and it's actually much worse than doing it
>> by hand.
>
> That's mainly because the XSAVE/XRESTOR done by kernel_fpu_begin()/end() is so
> expensive, so this argument is somewhat circular.
If we do the deferred restore, then the XSAVE/XRSTOR happens at most
once per kernel entry, which isn't so bad IMO. Also, with PTI, kernel
entries are already so slow that this will be mostly in the noise :(
--Andy