Re: [PATCH v7 02/26] x86/fpu/xstate: Refine CET user xstate bit enabling

From: Maxim Levitsky
Date: Wed Dec 13 2023 - 08:32:02 EST


On Wed, 2023-12-13 at 17:30 +0800, Yang, Weijiang wrote:
> On 12/8/2023 11:15 PM, Maxim Levitsky wrote:
> > On Fri, 2023-12-08 at 22:57 +0800, Yang, Weijiang wrote:
> > > On 12/6/2023 11:57 PM, Maxim Levitsky wrote:
> > > > On Wed, 2023-12-06 at 09:03 +0800, Yang, Weijiang wrote:
> > > > > On 12/5/2023 5:53 PM, Maxim Levitsky wrote:
> > > > > > On Fri, 2023-12-01 at 14:51 +0800, Yang, Weijiang wrote:
> > > > > > > On 12/1/2023 1:26 AM, Maxim Levitsky wrote:
> > > > > > > > On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
> > > > > > > > > Remove XFEATURE_CET_USER entry from dependency array as the entry doesn't
> > > > > > > > > reflect true dependency between CET features and the user xstate bit.
> > > > > > > > > Enable the bit in fpu_kernel_cfg.max_features when either SHSTK or IBT is
> > > > > > > > > available.
> > > > > > > > >
> > > > > > > > > Both user mode shadow stack and indirect branch tracking features depend
> > > > > > > > > on XFEATURE_CET_USER bit in XSS to automatically save/restore user mode
> > > > > > > > > xstate registers, i.e., IA32_U_CET and IA32_PL3_SSP whenever necessary.
> > > > > > > > >
> > > > > > > > > Note, the issue, i.e., CPUID only enumerates IBT but no SHSTK is resulted
> > > > > > > > > from CET KVM series which synthesizes guest CPUIDs based on userspace
> > > > > > > > > settings,in real world the case is rare. In other words, the exitings
> > > > > > > > > dependency check is correct when only user mode SHSTK is available.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Yang Weijiang <weijiang.yang@xxxxxxxxx>
> > > > > > > > > Reviewed-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > > > > > > > > Tested-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > > > > > > > > ---
> > > > > > > > > arch/x86/kernel/fpu/xstate.c | 9 ++++++++-
> > > > > > > > > 1 file changed, 8 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > > > > > > > > index 73f6bc00d178..6e50a4251e2b 100644
> > > > > > > > > --- a/arch/x86/kernel/fpu/xstate.c
> > > > > > > > > +++ b/arch/x86/kernel/fpu/xstate.c
> > > > > > > > > @@ -73,7 +73,6 @@ static unsigned short xsave_cpuid_features[] __initdata = {
> > > > > > > > > [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT,
> > > > > > > > > [XFEATURE_PKRU] = X86_FEATURE_OSPKE,
> > > > > > > > > [XFEATURE_PASID] = X86_FEATURE_ENQCMD,
> > > > > > > > > - [XFEATURE_CET_USER] = X86_FEATURE_SHSTK,
> > > > > > > > > [XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE,
> > > > > > > > > [XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE,
> > > > > > > > > };
> > > > > > > > > @@ -798,6 +797,14 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
> > > > > > > > > fpu_kernel_cfg.max_features &= ~BIT_ULL(i);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > + /*
> > > > > > > > > + * CET user mode xstate bit has been cleared by above sanity check.
> > > > > > > > > + * Now pick it up if either SHSTK or IBT is available. Either feature
> > > > > > > > > + * depends on the xstate bit to save/restore user mode states.
> > > > > > > > > + */
> > > > > > > > > + if (boot_cpu_has(X86_FEATURE_SHSTK) || boot_cpu_has(X86_FEATURE_IBT))
> > > > > > > > > + fpu_kernel_cfg.max_features |= BIT_ULL(XFEATURE_CET_USER);
> > > > > > > > > +
> > > > > > > > > if (!cpu_feature_enabled(X86_FEATURE_XFD))
> > > > > > > > > fpu_kernel_cfg.max_features &= ~XFEATURE_MASK_USER_DYNAMIC;
> > > > > > > > >
> > > > > > > > I am curious:
> > > > > > > >
> > > > > > > > Any reason why my review feedback was not applied even though you did agree
> > > > > > > > that it is reasonable?
> > > > > > > My apology! I changed the patch per you feedback but found XFEATURE_CET_USER didn't
> > > > > > > work before sending out v7 version, after a close look at the existing code:
> > > > > > >
> > > > > > > for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
> > > > > > > unsigned short cid = xsave_cpuid_features[i];
> > > > > > >
> > > > > > > /* Careful: X86_FEATURE_FPU is 0! */
> > > > > > > if ((i != XFEATURE_FP && !cid) || !boot_cpu_has(cid))
> > > > > > > fpu_kernel_cfg.max_features &= ~BIT_ULL(i);
> > > > > > > }
> > > > > > >
> > > > > > > With removal of XFEATURE_CET_USER entry from xsave_cpuid_features, actually
> > > > > > > above check will clear the bit from fpu_kernel_cfg.max_features.
> > > > > > Are you sure about this? If we remove the XFEATURE_CET_USER from the xsave_cpuid_features,
> > > > > > then the above loop will not touch it - it loops only over the items in the xsave_cpuid_features
> > > > > > array.
> > > > > No, the code is a bit tricky, the actual array size is XFEATURE_XTILE_DATA( ie, 18) + 1, those xfeature bits not listed in init code leave a blank entry with xsave_cpuid_features[i] == 0, so for the blank elements, the loop hits (i != XFEATURE_FP && !cid) then the relevant xfeature bit for i is cleared in fpu_kernel_cfg.max_features. I had the same illusion at first when I replied your comments in v6, and modified the code as you suggested but found the issue during tests. Please double check it.
> > > > Oh I see now. IMHO the current code is broken, or at least it violates the
> > > > 'Clear XSAVE features that are disabled in the normal CPUID' comment, because
> > > > it also clears all xfeatures which have no CPUID bit in the table (except FPU,
> > > > for which we have a workaround).
> > > >
> > > >
> > > > How about we do this instead:
> > > >
> > > > for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
> > > > unsigned short cid = xsave_cpuid_features[i];
> > > >
> > > > if (cid && !boot_cpu_has(cid))
> > > > fpu_kernel_cfg.max_features &= ~BIT_ULL(i);
> > > > }
> > > I think existing code is more reasonable, the side-effect of current code, i.e., masking out
> > > the unclaimed xfeature bits, sanitizes the bits in max_features at the first place, then following calculations derived from it become reasonable too.
> >
> > I strongly disagree with that. Kernel already removes all features bits which it knows nothing about.
> >
> > There is no need to also remove the xfeatures that it knows about but knows nothing about a CPUID bit.
> > For such features the kernel needs either to accept it (like FPU) or remove the feature from set of supported features.
>
> Let me involve Chang, the author of the code in question.
>
> Hi, Chang,
> In commit 70c3f1671b0c ("x86/fpu/xstate: Prepare XSAVE feature table for gaps in state component numbers"),
> you modified the loop as below:
> for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
> - if (!boot_cpu_has(xsave_cpuid_features[i]))
> + unsigned short cid = xsave_cpuid_features[i];
> +
> + /* Careful: X86_FEATURE_FPU is 0! */
> + if ((i != XFEATURE_FP && !cid) || !boot_cpu_has(cid))
> fpu_kernel_cfg.max_features &= ~BIT_ULL(i);
> }
>
> IMHO the change resulted functional change of the loop, i.e., before that only it only clears the bits without CPUIDs,
> but after the change, the side-effect of the loop will clear the bits of blank entries ( where xsave_cpuid_features[i] == 0 )
> since the loop hits (i != XFEATURE_FP && !cid), is it intended or something else?
>
>
100% agree.

Best regards,
Maxim Levitsky