Re: [PATCH v3 09/13] KVM: xen: automatically use the vcpu_info embedded in shared_info

From: Paul Durrant
Date: Mon Sep 18 2023 - 12:36:30 EST


On 18/09/2023 17:07, David Woodhouse wrote:
On Mon, 2023-09-18 at 14:41 +0000, Paul Durrant wrote:
From: Paul Durrant <pdurrant@xxxxxxxxxx>

The VMM should only need to set the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
attribute in response to a VCPUOP_register_vcpu_info hypercall. We can
handle the default case internally since we already know where the
shared_info page is. Modify get_vcpu_info_cache() to pass back a pointer
to the shared info pfn cache (and appropriate offset) for any of the
first 32 vCPUs if the attribute has not been set.

A VMM will be able to determine whether it needs to set up default
vcpu_info using the previously defined KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
Xen capability flag, which will be advertized in a subsequent patch.

Also update the KVM API documentation to describe the new behaviour.

Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
---
Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: David Woodhouse <dwmw2@xxxxxxxxxxxxx>
Cc: x86@xxxxxxxxxx

v3:
 - Add a note to the API documentation discussing vcpu_info copying.

v2:
 - Dispense with the KVM_XEN_HVM_CONFIG_DEFAULT_VCPU_INFO capability.
 - Add API documentation.
---
 Documentation/virt/kvm/api.rst | 22 +++++++++++++++-------
 arch/x86/kvm/xen.c             | 15 +++++++++++++++
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e9df4df6fe48..47bf3db74674 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5442,13 +5442,7 @@ KVM_XEN_ATTR_TYPE_LONG_MODE
 KVM_XEN_ATTR_TYPE_SHARED_INFO
   Sets the guest physical frame number at which the Xen shared_info
-  page resides. Note that although Xen places vcpu_info for the first
-  32 vCPUs in the shared_info page, KVM does not automatically do so
-  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
-  explicitly even when the vcpu_info for a given vCPU resides at the
-  "default" location in the shared_info page. This is because KVM may
-  not be aware of the Xen CPU id which is used as the index into the
-  vcpu_info[] array, so may know the correct default location.
+  page resides.
   Note that the shared_info page may be constantly written to by KVM;
   it contains the event channel bitmap used to deliver interrupts to
@@ -5564,12 +5558,26 @@ type values:
 KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
   Sets the guest physical address of the vcpu_info for a given vCPU.
+  The vcpu_info for the first 32 vCPUs defaults to the structures
+  embedded in the shared_info page.

The above is true only if KVM has KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA.
You kind of touch on that next, but perhaps the 'if the KVM_...'
condition should be moved up?

+  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
+  Xen capabilities then the VMM is not required to set this default
+  location; KVM will handle that internally. Otherwise this attribute
+  must be set for all vCPUs.
+
   As with the shared_info page for the VM, the corresponding page may be
   dirtied at any time if event channel interrupt delivery is enabled, so
   userspace should always assume that the page is dirty without relying
   on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
   the vcpu_info.
+  Note that, if the guest sets an explicit vcpu_info location in guest
+  memory then the VMM is expected to copy the content of the structure
+  embedded in the shared_info page to the new location. It is therefore
+  important that no event delivery is in progress at this time, otherwise
+  events may be missed.


That's difficult. It means tearing down all interrupts from passthrough
devices which are mapped via PIRQs, and also all IPIs.

So those don't honour event channel masking? That seems like a problem.


The IPI code *should* be able to fall back to just letting the VMM
handle the hypercall in userspace. But PIRQs are harder. I'd be happier
if our plan — handwavy though it may be — led to being able to use the
existing slow path for delivering interrupts by just *invalidating* the
cache. Maybe we *should* move the memcpy into the kernel, and let it
lock *both* the shinfo and new vcpu_info caches while it's doing the
copy? Given that that's the only valid transition, that shouldn't be so
hard, should it?


No, it just kind of oversteps the remit of the attribute... but I'll try adding it and see how messy it gets.

Paul

 KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
   Sets the guest physical address of an additional pvclock structure
   for a given vCPU. This is typically used for guest vsyscall support.
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 459f3ca4710e..660a808c0b50 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -491,6 +491,21 @@ static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
 static struct gfn_to_pfn_cache *get_vcpu_info_cache(struct kvm_vcpu *v, unsigned long *offset)
 {
+       if (!v->arch.xen.vcpu_info_cache.active && v->arch.xen.vcpu_id < MAX_VIRT_CPUS) {
+               struct kvm *kvm = v->kvm;
+
+               if (offset) {
+                       if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode)
+                               *offset = offsetof(struct shared_info,
+                                                  vcpu_info[v->arch.xen.vcpu_id]);
+                       else
+                               *offset = offsetof(struct compat_shared_info,
+                                                  vcpu_info[v->arch.xen.vcpu_id]);
+               }
+
+               return &kvm->arch.xen.shinfo_cache;
+       }
+
        if (offset)
                *offset = 0;