Re: [PATCH v5 09/19] KVM:x86: Make guest supervisor states as non-XSAVE managed

From: Yang, Weijiang
Date: Tue Aug 29 2023 - 03:06:56 EST


On 8/29/2023 5:00 AM, Dave Hansen wrote:
On 8/10/23 08:15, Paolo Bonzini wrote:
On 8/10/23 16:29, Dave Hansen wrote:
What actual OSes need this support?
I think Xen could use it when running nested.  But KVM cannot expose
support for CET in CPUID, and at the same time fake support for
MSR_IA32_PL{0,1,2}_SSP (e.g. inject a #GP if it's ever written to a
nonzero value).

I suppose we could invent our own paravirtualized CPUID bit for
"supervisor IBT works but supervisor SHSTK doesn't".  Linux could check
that but I don't think it's a good idea.

So... do, or do not.  There is no try. :)
Ahh, that makes sense. This is needed for implementing the
*architecture*, not because some OS actually wants to _do_ it.

...
In a perfect world, we'd just allocate space for CET_S in the KVM
fpstates.  The core kernel fpstates would have
XSTATE_BV[13]==XCOMP_BV[13]==0.  An XRSTOR of the core kernel fpstates
would just set CET_S to its init state.
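
The XRSTOR behaviour described here can be modelled one component at a time. What follows is a minimal sketch under simplified compacted-format assumptions, not the kernel's or the hardware's actual logic. The enum, the function name, and all parameters are hypothetical, and the bit index is passed in rather than hard-coded.

```c
#include <stdint.h>

/* Hypothetical outcome of XRSTOR/XRSTORS for one state component. */
enum demo_xrstor_action {
    DEMO_XRSTOR_SKIP,   /* component not requested, left untouched */
    DEMO_XRSTOR_INIT,   /* component set to its init state */
    DEMO_XRSTOR_LOAD,   /* component loaded from the buffer */
    DEMO_XRSTOR_FAULT   /* header is inconsistent */
};

/*
 * Simplified model: rfbm is the requested-feature bitmap (EDX:EAX
 * masked by the enabled features), xstate_bv and xcomp_bv are the
 * header fields of the save area, bit is the component index.
 */
static enum demo_xrstor_action
demo_xrstor_action(uint64_t rfbm, uint64_t xstate_bv,
                   uint64_t xcomp_bv, unsigned int bit)
{
    uint64_t mask = 1ULL << bit;

    /* Data marked present but no space reserved in the buffer. */
    if ((xstate_bv & mask) && !(xcomp_bv & mask))
        return DEMO_XRSTOR_FAULT;

    /* Component not requested by the caller. */
    if (!(rfbm & mask))
        return DEMO_XRSTOR_SKIP;

    /* Requested, but the header says "not in use": init state. */
    if (!(xstate_bv & mask))
        return DEMO_XRSTOR_INIT;

    return DEMO_XRSTOR_LOAD;
}
```

With both header bits clear for a component, as in the core kernel fpstates case above, a request for that component yields DEMO_XRSTOR_INIT in this model.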
Yep.  I don't think it's a lot of work to implement.  The basic idea as
you point out below is something like

#define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA
#define XFEATURE_MASK_USER_OPTIONAL \
    (XFEATURE_MASK_USER_DYNAMIC | XFEATURE_MASK_CET_KERNEL)

where XFEATURE_MASK_USER_DYNAMIC is used for xfd-related tasks
(including the ARCH_GET_XCOMP_SUPP arch_prctl) but everything else uses
XFEATURE_MASK_USER_OPTIONAL.

KVM would enable the feature by hand when allocating the guest fpstate.
Disabled features would be cleared from EDX:EAX when calling
XSAVE/XSAVEC/XSAVES.
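
As a rough illustration of the idea above, the sketch below composes the two masks, enables the optional feature by hand for a guest fpstate only, and derives the EDX:EAX instruction mask from the per-fpstate feature set. Everything prefixed demo_ or DEMO_ is hypothetical, the bit positions are assumed values for illustration, and no real XSAVE instruction is issued.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define DEMO_BIT_CET_KERNEL   12
#define DEMO_BIT_XTILE_DATA   18

#define DEMO_MASK_CET_KERNEL  (1ULL << DEMO_BIT_CET_KERNEL)
#define DEMO_MASK_XTILE_DATA  (1ULL << DEMO_BIT_XTILE_DATA)

/* Mirrors the two masks proposed in the thread. */
#define DEMO_MASK_USER_DYNAMIC  DEMO_MASK_XTILE_DATA
#define DEMO_MASK_USER_OPTIONAL \
    (DEMO_MASK_USER_DYNAMIC | DEMO_MASK_CET_KERNEL)

/* Hypothetical stand-in for the per-buffer feature bookkeeping. */
struct demo_fpstate {
    uint64_t xfeatures;
};

/* The 64-bit request mask split the way XSAVE* expects it. */
struct demo_rfbm {
    uint32_t eax;
    uint32_t edx;
};

/* Default buffers carry everything supported except optional features. */
static uint64_t demo_default_features(uint64_t supported)
{
    return supported & ~DEMO_MASK_USER_OPTIONAL;
}

/* Guest buffers get the supervisor CET bit enabled by hand. */
static void demo_enable_guest_features(struct demo_fpstate *fps,
                                       uint64_t supported)
{
    fps->xfeatures |= supported & DEMO_MASK_CET_KERNEL;
}

/* Features absent from the fpstate never reach EDX:EAX. */
static struct demo_rfbm demo_xsave_mask(const struct demo_fpstate *fps)
{
    struct demo_rfbm m;

    m.eax = (uint32_t)(fps->xfeatures & 0xffffffffULL);
    m.edx = (uint32_t)(fps->xfeatures >> 32);
    return m;
}
```

In this model a normal task buffer never sets the optional bit in EAX, while a guest buffer prepared with demo_enable_guest_features() does.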
OK, so let's _try_ this perfect-world solution. KVM fpstates get
fpstate->xfeatures[13] set, but no normal task fpstates have that bit
set. Most of the infrastructure should be there to handle this without
much fuss because it _should_ be looking at generic things like
fpstate->size and fpstate->xfeatures.

But who knows what trouble this will turn up. It could get nasty and
not worth it, but we should at least try it.

Thanks for the clarification, Dave!
I'm moving in that direction...