Re: [PATCH v13 016/113] KVM: TDX: x86: Add ioctl to get TDX systemwide parameters

From: Isaku Yamahata
Date: Wed Apr 05 2023 - 14:07:52 EST


On Mon, Apr 03, 2023 at 05:28:35PM +0300,
Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:

> On Mon, 3 Apr 2023 11:46:15 +0800
> Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
>
> > On 3/31/2023 8:44 PM, Zhi Wang wrote:
> > > On Thu, 30 Mar 2023 17:18:03 -0700
> > > Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:
> > >
> > >> On Wed, Mar 29, 2023 at 04:17:22PM -0700,
> > >> Isaku Yamahata <isaku.yamahata@xxxxxxxxx> wrote:
> > >>
> > >>> On Sat, Mar 25, 2023 at 10:43:06AM +0200,
> > >>> Zhi Wang <zhi.wang.linux@xxxxxxxxx> wrote:
> > >>>
> > >>>> On Sun, 12 Mar 2023 10:55:40 -0700
> > >>>> isaku.yamahata@xxxxxxxxx wrote:
> > >>>>
> > >>>> Does this have to be a new generic ioctl with a dedicated new x86_ops? SNP
> > >>>> does not use it at all and all the system-scoped ioctl of SNP going through
> > >>>> the CCP driver. So getting system-scope information of TDX/SNP will end up
> > >>>> differently.
> > >>>>
> > >>>> Any thought, Sean? Moving getting SNP system-wide information to
> > >>>> KVM dev ioctl seems not ideal and TDX does not have a dedicated driver like
> > >>>> CCP. Maybe make this ioctl TDX-specific? KVM_TDX_DEV_OP?
> > >>>
> > >>> We only need global parameters of the TDX module, and we don't interact with TDX
> > >>> module at this point. One alternative is to export those parameters via sysfs.
> > >>> Also the existence of the sysfs node indicates that the TDX module is
> > >>> loaded(initialized?) or not in addition to boot log. Thus we can drop system
> > >>> scope one.
> > >>> What do you think?
> > >>>
> > >
> > > I like this idea and the patch below, it feels right for me now. It would be nice
> > > if more folks can chime in and comment.
> >
> > SYSFS option requires CONFIG_SYSFS, which reqiures CONFIG_KVM_TDX to
> > select CONFIG_SYSFS.
> >
> > >>> Regarding to other TDX KVM specific ioctls (KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU,
> > >>> KVM_TDX_INIT_MEM_REGION, and KVM_TDX_FINALIZE_VM), they are specific to KVM. So
> > >>> I don't think it can be split out to independent driver.
> > >>
> > >
> > > They can stay in KVM as they are KVM-specific. SNP also has KVM-specific ioctls
> > > which wraps the SEV driver calls. At this level, both TDX and SNP go their specific
> > > implementation without more abstraction other than KVM_ENCRYPT_MEMORY_OP. Their
> > > strategies are aligned.
> > >
> > > The problem of the previous approach was the abstraction that no other implementation
> > > is using it. It is like, TDX wants a higher abstraction to cover both TDX and SNP,
> > > but SNP is not using it, which makes the abstraction looks strange.
> >
> > Note, before this TDX enabling series, KVM_MEMORY_ENCRYPT_OP is a VM
> > scope ioctl, that only serves for SEV and no other implementation uses
> > it. I see no reason why cannot introduce a new IOCTL in x86 KVM that
> > serves only one vendor.
> >
>
> My point is: time is different. When KVM_MEMORY_ENCRYPT_OP is there,
> there was *only* one vendor and SEV/SNP didn't know how the future vendor
> is going to use the ioctl. That is a reasonable case an generic ioctl can
> have one vendor to back up.
>
> The background here is: now another vendor is coming and there are going to
> be two vendors. The two vendors' flows are much clearer than early stage.
> Like, they know which flow is going to be used by each other.
>
> With these kept in mind, IMHO, it is not appropriate to introduce
> an generic ioctl that only one vendor is going to use, meanwhile
> we have already known another vendor is not going to use it.
>
> Defining a new userspace ABI is a serious thing and it is not an early
> stage anymore. Actually I think it is the best time to see how the
> code infrastructure should be re-purposed at this time.
>
> > We choose KVM_MEMORY_ENCRYPT_OP for TDX platform scope, just because we
> > reuse KVM_MEMORY_ENCRYPT_OP for TDX VM-scope and extend it to TDX vcpu
> > scope. It's just to avoid defining a new IOCTL number.
> >
> > We can rename it to KVM_GET_CC_CAPABILITIES, and even return different
> > capabilities based on VM type. And even, if SNP wants to use it, I think
> > it can wrap SNP driver calls inside this IOCTL?
> >
>
> I am not opposed to this option as it shows effort to improve it and it
> is constructive. But this needs to be figured out with AMD folks and
> maintainers. E.g. what should be the best CC ioctl scheme for KVM?
> vendor-specific or generic, which brings better benefit for the userspace,
> and less maintenance burden.
>
> Back to the reason why I think a vendor-specific sysinfo interface for TDX
> is necessary:
>
> 1) SEV driver has been there for quite some time. Unless people thinks an
> generic CC ioctl scheme is a way to go, then there will be motivation and
> efforts putting on it. The efforts is not only about wrapping SEV ioctls,
> it needs a systematic spec of generic CC ioctl scheme.
>
> 2) TDX doesn't have a driver like SEV and possibly not going to have one in
> the future. For those non-KVM related control flow of TDX in future, they
> can re-use this and stay away from KVM interface. (If vendor-specific
> scheme is the future direction.)
>
> > kvm.ko is special that it needs to serve two vendors. Sometime it's
> > unaviodable that an interface is only used by one vendor.
>
> I am afraid that in this case it is avoidable right?

We can make KVM_TDX_CAPABILITIES vm-scoped one so that devoce-scoped
KVM_EMORY_ENCRYPT_OP isn't needed. At least qemu is fine.

Do you think vm-scoped KVM_TDX_CAPABILITIES is fine?
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>