Re: [PATCH v10 3/9] KVM: Extend the memslot to support fd-based private memory

From: Jarkko Sakkinen
Date: Fri Jan 20 2023 - 18:43:06 EST


On Tue, Jan 10, 2023 at 05:14:32PM +0800, Chao Peng wrote:
> On Mon, Jan 09, 2023 at 07:32:05PM +0000, Sean Christopherson wrote:
> > On Fri, Jan 06, 2023, Chao Peng wrote:
> > > On Thu, Jan 05, 2023 at 11:23:01AM +0000, Jarkko Sakkinen wrote:
> > > > On Fri, Dec 02, 2022 at 02:13:41PM +0800, Chao Peng wrote:
> > > > > To make future maintenance easy, internally use a binary compatible
> > > > > alias struct kvm_user_mem_region to handle both the normal and the
> > > > > '_ext' variants.
> > > >
> > > > Feels bit hacky IMHO, and more like a completely new feature than
> > > > an extension.
> > > >
> > > > Why not just add a new ioctl? The commit message does not address
> > > > the most essential design here.
> > >
> > > Yes, people can always choose to add a new ioctl for this kind of change
> > > and the balance point here is we want to also avoid 'too many ioctls' if
> > > the functionalities are similar. The '_ext' variant reuses all the
> > > existing fields in the 'normal' variant and most importantly KVM
> > > internally can reuse most of the code. I certainly can add some words in
> > > the commit message to explain this design choice.
> >
> > After seeing the userspace side of this, I agree with Jarkko; overloading
> > KVM_SET_USER_MEMORY_REGION is a hack. E.g. the size validation ends up being
> > bogus, and userspace ends up abusing unions or implementing kvm_user_mem_region
> > itself.
>
> How is the size validation being bogus? I don't quite follow. Then we
> will use kvm_userspace_memory_region2 as the KVM internal alias, right?
> I see similar examples use different functions to handle different
> versions but it does look easier if we use alias for this function.
>
> >
> > It feels absolutely ridiculous, but I think the best option is to do:
> >
> > #define KVM_SET_USER_MEMORY_REGION2 _IOW(KVMIO, 0x49, \
> > struct kvm_userspace_memory_region2)
>
> Just interesting, is 0x49 a safe number we can use?
>
> >
> > /* for KVM_SET_USER_MEMORY_REGION2 */
> > struct kvm_user_mem_region2 {
> > __u32 slot;
> > __u32 flags;
> > __u64 guest_phys_addr;
> > __u64 memory_size;
> > __u64 userspace_addr;
> > __u64 restricted_offset;
> > __u32 restricted_fd;
> > __u32 pad1;
> > __u64 pad2[14];
> > }
> >
> > And it's consistent with other KVM ioctls(), e.g. KVM_SET_CPUID2.
>
> Okay, agree from KVM userspace API perspective this is more consistent
> with similar existing examples. I see several of them.
>
> I think we will also need a CAP_KVM_SET_USER_MEMORY_REGION2 for this new
> ioctl.

The current API in the patch set is trivial for C user space but for
any other more "constrained" language such as Rust a new ioctl would be
easier to adapt.

> >
> > Regarding the userspace side of things, please include Vishal's selftests in v11,
> > it's impossible to properly review the uAPI changes without seeing the userspace
> > side of things. I'm in the process of reviewing Vishal's v2[*], I'll try to
> > massage it into a set of patches that you can incorporate into your series.
>
> Previously I included Vishal's selftests in the github repo, but not
> include them in this patch series. It's OK for me to incorporate them
> directly into this series and review together if Vishal is fine.
>
> Chao
> >
> > [*] https://lore.kernel.org/all/20221205232341.4131240-1-vannapurve@xxxxxxxxxx

BR, Jarkko