Re: [RFC PATCH 1/2] KVM: x86: Introduce KVM_{G,S}ET_ONE_REG uAPIs support

From: Sean Christopherson
Date: Wed Sep 11 2024 - 10:38:08 EST


On Wed, Sep 11, 2024, Nikolas Wipper wrote:
> On Thu May 9, 2024 at 09:54 AM UTC+0200, Yang Weijiang wrote:
> > Enable KVM_{G,S}ET_ONE_REG uAPIs so that userspace can access HW MSR or
> > KVM synthetic MSR through it.
> >
> > In CET KVM series [*], KVM "steals" an MSR from PV MSR space and accesses
> > it via KVM_{G,S}ET_MSRs uAPIs, but the approach pollutes PV MSR space
> > and hides the difference between synthetic MSRs and normal HW defined MSRs.
> >
> > Now carve out a separate room in KVM-customized MSR address space for
> > synthetic MSRs. The synthetic MSRs are not exposed to userspace via
> > KVM_GET_MSR_INDEX_LIST, instead userspace complies with KVM's setup and
> > composes the uAPI params. KVM synthetic MSR indices start from 0 and
> > increase linearly. Userspace caller should tag MSR type correctly in
> > order to access intended HW or synthetic MSR.
> >
> > [*]:
> > https://lore.kernel.org/all/20240219074733.122080-18-weijiang.yang@xxxxxxxxx/
> >
> > Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > Signed-off-by: Yang Weijiang <weijiang.yang@xxxxxxxxx>
>
> Having this API, and specifically having a definite kvm_one_reg structure
> for x86 registers, would be interesting for register pinning/intercepts.
> With one_reg for x86 the API could be platform agnostic and possibly even
> replace MSR filters for x86.

I don't follow. MSR filters let userspace intercept accesses for a variety of
reasons, whereas these APIs simply provide a way to read/write a register value
that is stored in KVM. I don't see how this could replace MSR filters.
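
To make the distinction concrete, here is a rough userspace sketch of what a
ONE_REG access amounts to: a plain get or set of a value through the existing
struct kvm_one_reg and the KVM_{G,S}ET_ONE_REG ioctls, with no interception
involved. The ex_* helper names are hypothetical, and the register id is left
to the caller because the x86 encoding is exactly what is under discussion.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: read one register into *val.
 * Returns 0 on success, -1 with errno set on failure.
 */
static inline int ex_get_one_reg(int vcpu_fd, uint64_t id, uint64_t *val)
{
	struct kvm_one_reg reg = {
		.id   = id,                       /* encoded register id */
		.addr = (uint64_t)(uintptr_t)val, /* userspace buffer for the value */
	};

	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}

/* Hypothetical helper: write one register from val. */
static inline int ex_set_one_reg(int vcpu_fd, uint64_t id, uint64_t val)
{
	struct kvm_one_reg reg = {
		.id   = id,
		.addr = (uint64_t)(uintptr_t)&val,
	};

	return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
}
```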

> I do have a couple of questions about these patches.
>
> > ---
> > arch/x86/include/uapi/asm/kvm.h | 10 ++++++
> > arch/x86/kvm/x86.c | 62 +++++++++++++++++++++++++++++++++
> > 2 files changed, 72 insertions(+)
> >
> > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
> > index ef11aa4cab42..ca2a47a85fa1 100644
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -410,6 +410,16 @@ struct kvm_xcrs {
> > __u64 padding[16];
> > };
> >
> > +#define KVM_X86_REG_MSR (1 << 2)
> > +#define KVM_X86_REG_SYNTHETIC_MSR (1 << 3)
>
> Why is this a bitfield? As opposed to just counting up?

Hmm, good question. This came from my initial sketch[*], and it would seem that I
had something specific in mind since starting at (1 << 2) is oddly specific, but
for the life of me I can't remember what the plan was. Best guess is that I was
leaving space for '0' and '1' to be regs and sregs? But that still doesn't
explain/justify using a bitfield.

[*] https://lore.kernel.org/all/ZjLE7giCsEI4Sftp@xxxxxxxxxx
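
For illustration only (not from the patch, all EX_* names are hypothetical):
with an 8-bit type field, one-hot values leave room for at most eight register
types, while plain enumeration allows 256. Nothing in the proposed uAPI ORs
types together, so the one-hot form does not appear to buy anything.

```c
#include <assert.h>
#include <stdint.h>

/* Width of the '__u8 type' field in the RFC's struct kvm_x86_reg_id. */
#define EX_TYPE_BITS 8

/* One-hot encoding, as in the RFC: every type consumes a whole bit. */
#define EX_ONEHOT_MSR            (1u << 2)
#define EX_ONEHOT_SYNTHETIC_MSR  (1u << 3)

/* Enumerated encoding, as suggested in the review. */
#define EX_ENUM_MSR              2u
#define EX_ENUM_SYNTHETIC_MSR    3u

/* Number of distinct types each scheme can express in the 8-bit field. */
static inline unsigned int ex_onehot_capacity(void)
{
	return EX_TYPE_BITS;
}

static inline unsigned int ex_enum_capacity(void)
{
	return 1u << EX_TYPE_BITS;
}

/* Both encodings of the types defined so far still fit in a __u8. */
static_assert(EX_ONEHOT_SYNTHETIC_MSR <= UINT8_MAX, "type must fit in __u8");
static_assert(EX_ENUM_SYNTHETIC_MSR <= UINT8_MAX, "type must fit in __u8");
```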

>
> #define KVM_X86_REG_MSR 2
> #define KVM_X86_REG_SYNTHETIC_MSR 3
>
> > +
> > +struct kvm_x86_reg_id {
> > + __u32 index;
> > + __u8 type;
> > + __u8 rsvd;
> > + __u16 rsvd16;
> > +};
>
> This struct is opposite to what other architectures do, where they have
> an architecture ID in the upper 32 bits, and the lower 32 bits actually
> identify the register. This would probably make sense for x86 too, to
> avoid conflicts with other IDs (I think MIPS core registers can have IDs
> with the lower 32 bits all zero) so that the IDs are actually unique,
> right?

It's not the opposite, it's just missing fields for the arch and the size. Ugh,
the size is unaligned. That's annoying. Something like this?

struct kvm_x86_reg_id {
	__u32 index;
	__u8 type;
	__u8 rsvd;
	__u8 rsvd4:4;
	__u8 size:4;
	__u8 x86;
};

Though looking at this with fresh eyes, I don't think the above structure should
be exposed to userspace. Userspace will only ever want to encode a register; the
exact register may not be hardcoded, but I would expect the type to always be
known ahead of time, if not outright hardcoded. The struct is really only useful
for the kernel, e.g. to easily switch on the type, extract the index, etc.
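
A rough sketch of that kernel-side use, written with shifts and masks instead
of C bitfields (whose layout is implementation-defined). It assumes the field
positions implied by the struct above on a little-endian build: index in bits
0-31, type in bits 32-39, size in bits 52-55, arch in bits 56-63. The EX_*
constants mirror the generic KVM_REG_* values from include/uapi/linux/kvm.h so
the snippet stands alone; struct ex_x86_reg and ex_decode_reg_id() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the generic ONE_REG id layout in include/uapi/linux/kvm.h. */
#define EX_KVM_REG_ARCH_MASK   0xff00000000000000ULL
#define EX_KVM_REG_X86         0x2000000000000000ULL
#define EX_KVM_REG_SIZE_SHIFT  52
#define EX_KVM_REG_SIZE_MASK   0x00f0000000000000ULL

/* Hypothetical decoded view of a 64-bit register id. */
struct ex_x86_reg {
	uint32_t index;     /* e.g. the MSR index */
	uint8_t  type;      /* register type, assumed to sit in bits 32-39 */
	uint8_t  size_log2; /* log2 of the register size in bytes */
	int      is_x86;    /* non-zero if the arch field says x86 */
};

static inline struct ex_x86_reg ex_decode_reg_id(uint64_t id)
{
	struct ex_x86_reg reg;

	reg.index     = (uint32_t)id;
	reg.type      = (uint8_t)(id >> 32);
	reg.size_log2 = (uint8_t)((id & EX_KVM_REG_SIZE_MASK) >> EX_KVM_REG_SIZE_SHIFT);
	reg.is_x86    = (id & EX_KVM_REG_ARCH_MASK) == EX_KVM_REG_X86;

	return reg;
}

/* The arch id occupies only the top byte of the 64-bit id. */
static_assert((EX_KVM_REG_X86 & ~EX_KVM_REG_ARCH_MASK) == 0,
	      "arch id must live in the top byte");
```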

As annoying as it can be for a human to decipher the final value, the arm64/riscv
approach of providing builders is probably the way to go, though I think x86 can
be much simpler (less stuff to encode).

Oh! Another thing I think we should do is make KVM_{G,S}ET_ONE_REG 64-bit only
so that we don't have to deal with 32-bit vs. 64-bit GPRs. 32-bit userspace
would need to manually encode the register id, but I have no problem making life
difficult for such setups. Or KVM could reject the ioctl for .compat_ioctl(),
but that seems unnecessary.

E.g. since IIUC switch() and if() statements are off-limits in uapi headers...

#define KVM_X86_REG_TYPE_MSR 2ull

#define KVM_X86_REG_TYPE_SIZE(type) \
({ \
	__u64 type_size = type; \
	\
	type_size |= type == KVM_X86_REG_TYPE_MSR ? KVM_REG_SIZE_U64 : \
		     type == KVM_X86_REG_TYPE_SYNTHETIC_MSR ? KVM_REG_SIZE_U64 : \
		     0; \
	type_size; \
})

#define KVM_X86_REG_ENCODE(type, index) \
(KVM_REG_X86 | KVM_X86_REG_TYPE_SIZE(type) | index)

#define KVM_X86_REG_MSR(index) KVM_X86_REG_ENCODE(KVM_X86_REG_TYPE_MSR, index)
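
A standalone variant of the builders that can be compiled and checked outside
the kernel tree, as a sketch under a few assumptions: the type is shifted into
bits 32-39 so that it cannot collide with the 32-bit index, the synthetic MSR
type is taken to be 3 (following the "counting up" suggestion earlier in the
thread), and a plain conditional expression replaces the GNU statement
expression. The EX_* names are hypothetical, and the two KVM_REG_* values are
copied from include/uapi/linux/kvm.h.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from include/uapi/linux/kvm.h so the sketch is self-contained. */
#define EX_KVM_REG_X86       0x2000000000000000ULL
#define EX_KVM_REG_SIZE_U64  0x0030000000000000ULL

#define EX_X86_REG_TYPE_MSR            2ULL
#define EX_X86_REG_TYPE_SYNTHETIC_MSR  3ULL	/* assumed value */

/* Type in bits 32-39 (assumed placement), ORed with the size for that type. */
#define EX_X86_REG_TYPE_SIZE(type)					\
	(((uint64_t)(type) << 32) |					\
	 ((type) == EX_X86_REG_TYPE_MSR ? EX_KVM_REG_SIZE_U64 :	\
	  (type) == EX_X86_REG_TYPE_SYNTHETIC_MSR ? EX_KVM_REG_SIZE_U64 : \
	  0))

#define EX_X86_REG_ENCODE(type, index)					\
	(EX_KVM_REG_X86 | EX_X86_REG_TYPE_SIZE(type) | (uint32_t)(index))

#define EX_X86_REG_MSR(index)						\
	EX_X86_REG_ENCODE(EX_X86_REG_TYPE_MSR, index)

/* 0xc0000080 is the architectural index of the EFER MSR. */
static_assert(EX_X86_REG_MSR(0xc0000080) == 0x20300002c0000080ULL,
	      "arch | size | type | index");
```

Because everything folds to an integer constant expression, the encoded id can
be used in case labels and static initializers, which a statement expression
would not allow.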