Re: [PATCH v6 19/25] KVM: VMX: Emulate read and write to CET MSRs

From: Sean Christopherson
Date: Thu Nov 02 2023 - 19:58:55 EST


On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> On Wed, 2023-11-01 at 09:31 -0700, Sean Christopherson wrote:
> > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > > Add emulation interface for CET MSR access. The emulation code is split
> > > > into common part and vendor specific part. The former does common check
> > > > for MSRs and reads/writes directly from/to XSAVE-managed MSRs via the
> > > > helpers while the latter accesses the MSRs linked to VMCS fields.
> > > >
> > > > Signed-off-by: Yang Weijiang <weijiang.yang@xxxxxxxxx>
> > > > ---
> >
> > ...
> >
> > > > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > > > + case MSR_KVM_SSP:
> > > > + if (host_msr_reset && kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> > > > + break;
> > > > + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
> > > > + return 1;
> > > > + if (index == MSR_KVM_SSP && !host_initiated)
> > > > + return 1;
> > > > + if (is_noncanonical_address(data, vcpu))
> > > > + return 1;
> > > > + if (index != MSR_IA32_INT_SSP_TAB && !IS_ALIGNED(data, 4))
> > > > + return 1;
> > > > + break;
> > > Once again I'll prefer to have an ioctl for setting/getting SSP, this will
> > > make the above code simpler (e.g there will be no need to check that write
> > > comes from the host/etc).
> >
> > I don't think an ioctl() would be simpler overall, especially when factoring in
> > userspace. With a synthetic MSR, we get the following quite cheaply:
> >
> > 1. Enumerating support to userspace.
> > 2. Save/restore of the value, e.g. for live migration.
> > 3. Vendor hooks for propagating values to/from the VMCS/VMCB.
> >
> > For an ioctl(),
> > #1 would require a capability, #2 (and #1 to some extent) would
> > require new userspace flows, and #3 would require new kvm_x86_ops hooks.
> >
> > The synthetic MSR adds a small amount of messiness, as does bundling
> > MSR_IA32_INT_SSP_TAB with the other shadow stack MSRs. The bulk of the mess comes
> > from the need to allow userspace to write '0' when KVM enumerated supported to
> > userspace.
>
> Let me put it this way - all hacks start like that, and in this case this is API/ABI hack
> so we will have to live with it forever.

Eh, I don't view it as a hack, at least the kind of hack that has a negative
connotation. KVM effectively has ~240 MSR indices reserved for whatever KVM
wants. The only weird thing about this one is that it's not accessible from the
guest. Which I agree is quite weird, but from a code perspective I think it
works quite well.

> Once there is a precedent, trust me there will be 10s of new 'fake' msrs added, and the
> interface will become one big mess.

That suggests MSRs aren't already one big mess. :-) I'm somewhat joking, but also
somewhat serious. I really don't think that adding one oddball synthetic MSR is
going to meaningfully move the needle on the messiness of MSRs.

Hmm, there probably is a valid slippery slope argument though. As you say, at
some point, enough state will get shoved into hardware that KVM will need an ever
growing number of synthetic MSRs to keep pace.

> As I suggested, if you don't want to add new capability/ioctl and vendor
> callback per new x86 arch register, then let's implement
> KVM_GET_ONE_REG/KVM_SET_ONE_REG and then it will be really easy to add new
> regs without confusing users, and without polluting msr namespace with msrs
> that don't exist.

I definitely don't hate the idea of KVM_{G,S}ET_ONE_REG, what I don't want is to
have an entirely separate path in KVM for handling the actual get/set.

What if we combine the approaches? Add KVM_{G,S}ET_ONE_REG support so that the
uAPI can use completely arbitrary register indices without having to worry about
polluting the MSR space and making MSR_KVM_SSP ABI.

Ooh, if we're clever, I bet we can extend KVM_{G,S}ET_ONE_REG to also work with
existing MSRs, GPRs, and other stuff, i.e. not force userspace through the funky
KVM_SET_MSRS just to set one reg, and not force a RMW of all GPRs just to set
RIP or something. E.g. use bits 39:32 of the id to encode the register class,
bits 31:0 to hold the index within a class, and reserve bits 63:40 for future
usage.

Then for KVM-defined registers, we can route them internally as needed, e.g. we
can still define MSR_KVM_SSP so that internal it's treated like an MSR, but its
index isn't ABI and so can be changed at will. And future KVM-defined registers
wouldn't _need_ to be treated like MSRs, i.e. we could route registers through
the MSR APIs if and only if it makes sense to do so.

Side topic, why on earth is the data value of kvm_one_reg "addr"?