Re: [PATCH v6 06/25] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size

From: Sean Christopherson
Date: Tue Oct 24 2023 - 13:07:19 EST


On Fri, Sep 15, 2023, Weijiang Yang wrote:
> On 9/15/2023 1:40 AM, Dave Hansen wrote:
> > On 9/13/23 23:33, Yang Weijiang wrote:
> > > --- a/arch/x86/kernel/fpu/xstate.c
> > > +++ b/arch/x86/kernel/fpu/xstate.c
> > > @@ -1636,9 +1636,17 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
> > > /* Calculate the resulting kernel state size */
> > > mask = permitted | requested;
> > > - /* Take supervisor states into account on the host */
> > > + /*
> > > + * Take supervisor states into account on the host. And add
> > > + * kernel dynamic xfeatures to guest since guest kernel may
> > > + * enable corresponding CPU features and the xstate registers
> > > + * need to be saved/restored properly.
> > > + */
> > > if (!guest)
> > > mask |= xfeatures_mask_supervisor();
> > > + else
> > > + mask |= fpu_kernel_dynamic_xfeatures;

This looks wrong. Per commit 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor
states in XSTATE permissions"), mask at this point only contains user features,
which somewhat unintuitively doesn't include CET_USER (I get that they're MSRs
and thus supervisor state, it's just the name that's odd).

IIUC, the "dynamic" features contain only CET_KERNEL, whereas
xfeatures_mask_supervisor() contains PASID, CET_USER, and CET_KERNEL. PASID
isn't virtualized by KVM, but doesn't that mean CET_USER will get dropped/lost
if userspace requests AMX/XTILE enabling?

The existing code also seems odd, but I might be missing something. Won't the
kernel drop PASID if the guest requests AMX/XTILE? I'm not at all familiar with
what PASID state is managed via XSAVE, so I've no idea if that's an actual problem
or just an oddity.
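
To make the concern concrete, here is a standalone userspace model of the hunk
above. It is a sketch, not kernel code: the bit positions are what I believe
asm/fpu/types.h uses, the contents of the two masks are my reading of the series
(see the IIUC above), and perm_size_mask() is a made-up name for the part of
__xstate_request_perm() that builds the mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed xfeature bit positions (believed to match asm/fpu/types.h). */
#define XF_BIT(n)      (1ULL << (n))
#define XF_PASID       XF_BIT(10)
#define XF_CET_USER    XF_BIT(11)
#define XF_CET_KERNEL  XF_BIT(12)
#define XF_XTILE_DATA  XF_BIT(18)

/* Assumed contents of the two masks, per my reading of the series. */
static const uint64_t supervisor_mask     = XF_PASID | XF_CET_USER | XF_CET_KERNEL;
static const uint64_t kernel_dynamic_mask = XF_CET_KERNEL;

/*
 * Hypothetical helper: models only the mask that the patched
 * __xstate_request_perm() feeds into xstate_calculate_size().
 */
static uint64_t perm_size_mask(uint64_t permitted, uint64_t requested, bool guest)
{
	/* Per 781c64bfcb73, this holds user features only at this point. */
	uint64_t mask = permitted | requested;

	if (!guest)
		mask |= supervisor_mask;     /* host: all supervisor states */
	else
		mask |= kernel_dynamic_mask; /* guest: only CET_KERNEL is added */

	return mask;
}
```

With guest == true and XTILE_DATA requested, the model's result has CET_KERNEL
set but neither CET_USER nor PASID. With guest == false, all three supervisor
bits are set. Whether the missing bits matter in practice depends on code outside
this hunk.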

> > > ksize = xstate_calculate_size(mask, compacted);
> > Heh, you changed the "guest" naming in "fpu_kernel_dynamic_xfeatures"
> > but didn't change the logic.
> >
> > As it's coded at the moment *ALL* "fpu_kernel_dynamic_xfeatures" are
> > guest xfeatures. So, they're different in name only.

...

> > Would there ever be any reason for KVM to be on a system which supports a
> > dynamic kernel feature but where it doesn't get enabled for guest use, or
> > at least shouldn't have the FPU space allocated?
>
> I haven't heard of that kind of usage for other features so far. CET
> supervisor xstate is the only dynamic kernel feature now, and I'm not sure
> whether other CPU features with supervisor xstate would one day share the
> handling logic that CET uses.

There are definitely scenarios where CET will not be exposed to KVM guests, but
I don't see any reason to make the guest FPU space dynamically sized for CET.
It's what, 40 bytes?
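
For reference, a quick sanity check of that number, assuming it counts both CET
components as the SDM lays them out (two 64-bit MSRs for CET_U, three for CET_S).
The struct and field names below are illustrative only, not the kernel's
definitions:

```c
#include <stdint.h>

/* Illustrative layout of the CET_U xstate component (assumed: 2 MSRs). */
struct cet_user_sketch {
	uint64_t u_cet;    /* IA32_U_CET */
	uint64_t pl3_ssp;  /* IA32_PL3_SSP */
};

/* Illustrative layout of the CET_S xstate component (assumed: 3 MSRs). */
struct cet_kernel_sketch {
	uint64_t pl0_ssp;  /* IA32_PL0_SSP */
	uint64_t pl1_ssp;  /* IA32_PL1_SSP */
	uint64_t pl2_ssp;  /* IA32_PL2_SSP */
};
```

That is 16 bytes for the user component plus 24 for the supervisor component,
before any alignment padding the XSAVE layout may add.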

I would much prefer to avoid the whole "dynamic" thing and instead make CET
explicitly guest-only. E.g. fpu_kernel_guest_only_xfeatures? Or, even better
if it doesn't cause weirdness elsewhere, a dedicated fpu_guest_cfg. For me at
least, an fpu_guest_cfg would make it easier to understand what is going on.
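
Very roughly, and purely as a userspace sketch of the idea, something along
these lines. Everything here is hypothetical: the struct is a cut-down stand-in
for a kernel config struct, guest_cfg_from_host() does not exist anywhere, and
the bit positions are assumed. The point is only that the guest features would
be computed once, up front, instead of being patched into the mask at
permission-request time.

```c
#include <stdint.h>

/* Assumed xfeature bit positions. */
#define XF_CET_USER    (1ULL << 11)
#define XF_CET_KERNEL  (1ULL << 12)

/* Hypothetical, cut-down config: features only, sizes omitted. */
struct fpu_cfg_sketch {
	uint64_t max_features;
	uint64_t default_features;
};

/*
 * Hypothetical: derive a dedicated guest config from the host config by
 * adding the guest-only features, leaving the host config untouched.
 */
static struct fpu_cfg_sketch guest_cfg_from_host(struct fpu_cfg_sketch host,
						 uint64_t guest_only)
{
	struct fpu_cfg_sketch guest = host;

	guest.max_features     |= guest_only;
	guest.default_features |= guest_only;
	return guest;
}
```

Guest FPU allocation would then size itself from the guest config, so CET_KERNEL
never needs to show up in the host's config and CET_USER is not lost for guests.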