Re: [RFC PATCH] x86/arch_prctl: Add ARCH_SET_XCR0 to mask XCR0 per-thread
From: Dave Hansen
Date: Mon Jun 18 2018 - 11:04:18 EST
On 06/18/2018 07:42 AM, Keno Fischer wrote:
>> There's no way this is even close to viable until it has been made to
>> cope with holes.
>
> Ok, it seems to me the first decision is whether it should be allowed
> to have the compacted format (with holes) in an in-memory xstate
> image. Doing so seems possible, but would require more invasive
> changes to the kernel, so I'm gonna ignore it for now.
>
> If we want to keep the in-memory xstate layout constant after boot
> time, I see four ways to do that for this feature.
>
> 1) Set up XCR0 in a different place, so that the kernel xsaves/xrstors
> use an XCR0 that matches the layout the kernel expects.
... and do XCR0 writes before every XSAVES/XRSTORS, including in the
context-switch path?
> 2) Use xsaveopt when this particular thread flag is set and have the
> kernel be able to deal with non-compressed xstate images, even
> if xsaves is available.
What about if there is supervisor state in place?
> 3) What's in this patch now, but fix up the xstate after saving it.
That's _maybe_ viable. But, it's going to slow things down quite a bit
and has to be done with preempt disabled.
> 4) Catch the fault thrown by xsaves/xrestors in this situation, update
> XCR0, redo the xsaves/restores, put XCR0 back and continue
> execution after the faulting instruction.
I'm worried about the kernel pieces that go digging in the XSAVE data
getting confused more than the hardware getting confused.
> Option 1) seems easiest, but it would also require adding code
> somewhere on the common path, which I assume is a no-go ;).
Yeah, that would be horribly slow.
You then also have to be really careful with interrupts and preempt when
you're in a funky XCR0 configuration.
> Option 3) seems doable entirely in the slow path for this feature.
> If we xsaves with the modified XCR0, we can then fix up the memory
> image to match the expected layout. Similarly, we could xrestors
> in the slow path (from standard layout) and rely on the
> `fpregs_state_valid` mechanism to prevent the fault.
... with one more modification. You need two buffers _anyway_ if you do
this. So you would probably just XSAVE to a new buffer and then convert
that back to the "real" buffer in the thread struct.
> At least currently, it is my understanding that `xfeatures_mask` only has
> user features, am I mistaken about that?
We're slowing adding supervisor support. I think accounting for
supervisor features is a requirement for any new XSAVE code.