Re: [PATCH v2 01/10] KVM: Document KVM_MAP_MEMORY ioctl

From: Edgecombe, Rick P
Date: Mon Apr 15 2024 - 19:27:38 EST


Nits only...

On Wed, 2024-04-10 at 15:07 -0700, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> Adds documentation of KVM_MAP_MEMORY ioctl. [1]
>
> It populates guest memory.  It doesn't do extra operations on the
> underlying technology-specific initialization [2].  For example,
> CoCo-related operations won't be performed.  Concretely for TDX, this API
> won't invoke TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific APIs
> are required for such operations.
>
> The key point is to adapt of vcpu ioctl instead of VM ioctl.

Not sure what you are trying to say here.

>   First,
> populating guest memory requires vcpu.  If it is VM ioctl, we need to pick
> one vcpu somehow.  Secondly, vcpu ioctl allows each vcpu to invoke this
> ioctl in parallel.  It helps to scale regarding guest memory size, e.g.,
> hundreds of GB.

I guess you are explaining why this is a vCPU ioctl instead of a VM ioctl. Is
this clearer:

Although the operation is sort of a VM operation, make the ioctl a vCPU ioctl
instead of a VM ioctl. Do this because a vCPU is needed internally for the fault
path anyway, and because... (I don't follow the second point).

>
> [1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@xxxxxxxxxx/
> [2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@xxxxxxxxxx/
>
> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> ---
> v2:
> - Make flags reserved for future use. (Sean, Michael)
> - Clarified the supposed use case. (Kai)
> - Dropped source member of struct kvm_memory_mapping. (Michael)
> - Change the unit from pages to bytes. (Michael)
> ---
>  Documentation/virt/kvm/api.rst | 52 ++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index f0b76ff5030d..6ee3d2b51a2b 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6352,6 +6352,58 @@ a single guest_memfd file, but the bound ranges must not overlap).
>  
>  See KVM_SET_USER_MEMORY_REGION2 for additional details.
>  
> +4.143 KVM_MAP_MEMORY
> +------------------------
> +
> +:Capability: KVM_CAP_MAP_MEMORY
> +:Architectures: none
> +:Type: vcpu ioctl
> +:Parameters: struct kvm_memory_mapping (in/out)
> +:Returns: 0 on success, < 0 on error
> +
> +Errors:
> +
> +  ========== =============================================================
> +  EINVAL     invalid parameters
> +  EAGAIN     The region is only processed partially.  The caller should
> +             issue the ioctl with the updated parameters when `size` > 0.
> +  EINTR      An unmasked signal is pending.  The region may be processed
> +             partially.
> +  EFAULT     The parameter address was invalid.  The specified region
> +             `base_address` and `size` was invalid.  The region isn't
> +             covered by KVM memory slot.
> +  EOPNOTSUPP The architecture doesn't support this operation. The x86 two
> +             dimensional paging supports this API.  the x86 kvm shadow mmu
> +             doesn't support it.  The other arch KVM doesn't support it.
> +  ========== =============================================================
> +
> +::
> +
> +  struct kvm_memory_mapping {
> +       __u64 base_address;
> +       __u64 size;
> +       __u64 flags;
> +  };
> +
> +KVM_MAP_MEMORY populates guest memory with the range, `base_address` in (L1)
> +guest physical address(GPA) and `size` in bytes.  `flags` must be zero.  It's
> +reserved for future use.  When the ioctl returns, the input values are updated
> +to point to the remaining range.  If `size` > 0 on return, the caller should
> +issue the ioctl with the updated parameters.
> +
> +Multiple vcpus are allowed to call this ioctl simultaneously.  It's not
> +mandatory for all vcpus to issue this ioctl.  A single vcpu can suffice.
> +Multiple vcpus invocations are utilized for scalability to process the
> +population in parallel.  If multiple vcpus call this ioctl in parallel, it may
> +result in the error of EAGAIN due to race conditions.
> +
> +This population is restricted to the "pure" population without triggering
> +underlying technology-specific initialization.  For example, CoCo-related
> +operations won't perform.  In the case of TDX, this API won't invoke
> +TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific uAPIs are required for
> +such operations.

Probably don't want to have TDX bits in here yet. Since it's talking about what
KVM_MAP_MEMORY is *not* doing, it can just be dropped.
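
For reference, here is how I read the "`size` > 0 on return" retry wording
from the userspace side. This is only a sketch under assumptions: the struct
is copied from the doc text above, map_fn is a hypothetical callback standing
in for ioctl(vcpu_fd, KVM_MAP_MEMORY, &m) (this sketch does not assume an
ioctl number), fake_map is a made-up stub that handles one 4K page per call,
and retrying on EINTR is my own choice rather than something the doc requires.

```c
#include <errno.h>
#include <stdint.h>

/* Layout copied from the proposed documentation. */
struct kvm_memory_mapping {
	uint64_t base_address;
	uint64_t size;
	uint64_t flags;
};

/* Hypothetical stand-in for ioctl(vcpu_fd, KVM_MAP_MEMORY, m). */
typedef int (*map_fn)(void *ctx, struct kvm_memory_mapping *m);

/* Reissue the call with the updated parameters until size reaches 0. */
static int map_range(map_fn fn, void *ctx, uint64_t gpa, uint64_t size)
{
	struct kvm_memory_mapping m = {
		.base_address = gpa,
		.size = size,
		.flags = 0,	/* must be zero, reserved for future use */
	};

	while (m.size > 0) {
		if (fn(ctx, &m) == 0)
			continue;
		/* Partial progress: retry.  Anything else is fatal. */
		if (errno != EAGAIN && errno != EINTR)
			return -errno;
	}
	return 0;
}

/* Made-up stub: maps one 4K page per call, fails past 'limit'. */
struct fake_state {
	unsigned int calls;
	uint64_t limit;
};

static int fake_map(void *ctx, struct kvm_memory_mapping *m)
{
	struct fake_state *st = ctx;
	uint64_t step = m->size < 4096 ? m->size : 4096;

	st->calls++;
	if (m->base_address >= st->limit) {
		errno = EFAULT;
		return -1;
	}
	m->base_address += step;
	m->size -= step;
	if (m->size > 0) {
		errno = EAGAIN;
		return -1;
	}
	return 0;
}
```

With the stub, mapping five pages takes five calls. A real caller would pass
a wrapper around the vCPU fd instead of fake_map.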

> +
> +
>  5. The kvm_run structure
>  ========================
>