Re: [PATCH v11 2/4] arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS

From: gengdongjiu
Date: Thu Apr 12 2018 - 11:11:54 EST


Hi James,
Thanks for the comments.

2018-04-10 22:15 GMT+08:00, James Morse <james.morse@xxxxxxx>:
> Hi Dongjiu Geng,
>
> On 09/04/18 22:36, Dongjiu Geng wrote:
>> This new IOCTL exports user-invisible states related to SError.
>> Together with appropriate user space changes, it can inject
>> SError with specified syndrome to guest by setup kvm_vcpu_events
>> value.
>
>> Also it can support live migration.
>
> Could you explain what user-space is expected to do for this?
> (this is also relevant for snapshot-ing/suspending VMs)
Ok.

>
> It's probably worth noting that this solves an existing problem: KVM may
> make an
> SError pending, but user-space has no way to discover/migrate this.

if KVM make an SError pending, when user-space do migration, it get the
kvm_vcpu_events through KVM_GET_VCPU_EVENTS, then can find that pending status.
What are the things you're worried about?

>
>
>> diff --git a/Documentation/virtual/kvm/api.txt
>> b/Documentation/virtual/kvm/api.txt
>> index 8a3d708..45719b4 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -819,11 +819,13 @@ struct kvm_clock_data {
>>
>> Capability: KVM_CAP_VCPU_EVENTS
>> Extended by: KVM_CAP_INTR_SHADOW
>> -Architectures: x86
>> +Architectures: x86, arm, arm64
>> Type: vm ioctl
>> Parameters: struct kvm_vcpu_event (out)
>> Returns: 0 on success, -1 on error
>>
>> +X86:
>> +
>> Gets currently pending exceptions, interrupts, and NMIs as well as
>> related
>> states of the vcpu.
>>
>> @@ -865,15 +867,31 @@ Only two fields are defined in the flags field:
>> - KVM_VCPUEVENT_VALID_SMM may be set in the flags field to signal that
>> smi contains a valid state.
>>
>> +ARM, ARM64:
>> +
>> +Gets currently pending SError exceptions as well as related states of the
>> vcpu.
>> +
>> +struct kvm_vcpu_events {
>> + struct {
>> + __u8 serror_pending;
>> + __u8 serror_has_esr;
>> + /* Align it to 4 bytes */
>> + __u8 pad[2];
>> + __u64 serror_esr;
>> + } exception;
>> +};
>> +
>
> I'm not convinced we should change this struct from the layout/size x86 has.
> Its
> confusing for the documentation, is this API call really the same on all
> architectures?
>
> What if we want to add some future interrupt, NMI or related state? We've
> found
> ourselves needing to add this API, it seems odd to remove its other uses on
> x86.
> We can't put them back in the future.
>
> Having a different layout would force user-space to ifdef/duplicate any
> code
> that accesses this between architectures.
In x86 and arm64 user space code, the handling logic of
KVM_GET/SET_VCPU_EVENTS is in different ARCH folder, maybe it is not
necessary to share the handling code in the user space.

>
>
>
> The compiler will want that __u64 to be naturally aligned to 8-bytes, so
> your
> 4-byte padding still causes some secret compiler-padding to be inserted.
> Different versions of the compiler may put it in different places.
>
>
>> 4.32 KVM_SET_VCPU_EVENTS
>>
>> Capability: KVM_CAP_VCPU_EVENTS
>> Extended by: KVM_CAP_INTR_SHADOW
>> -Architectures: x86
>> +Architectures: x86, arm, arm64
>> Type: vm ioctl
>> Parameters: struct kvm_vcpu_event (in)
>> Returns: 0 on success, -1 on error
>>
>> +X86:
>> +
>> Set pending exceptions, interrupts, and NMIs as well as related states of
>> the
>> vcpu.
>>
>> @@ -894,6 +912,12 @@ shall be written into the VCPU.
>>
>> KVM_VCPUEVENT_VALID_SMM can only be set if KVM_CAP_X86_SMM is available.
>>
>> +ARM, ARM64:
>> +
>> +Set pending SError exceptions as well as related states of the vcpu.
>> +
>> +See KVM_GET_VCPU_EVENTS for the data structure.
>> +
>>
>> 4.33 KVM_GET_DEBUGREGS
>>
>
>
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h
>> b/arch/arm64/include/uapi/asm/kvm.h
>> index 9abbf30..855cc9a 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -39,6 +39,7 @@
>> #define __KVM_HAVE_GUEST_DEBUG
>> #define __KVM_HAVE_IRQ_LINE
>> #define __KVM_HAVE_READONLY_MEM
>> +#define __KVM_HAVE_VCPU_EVENTS
>>
>> #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
>>
>> @@ -153,6 +154,17 @@ struct kvm_sync_regs {
>> struct kvm_arch_memory_slot {
>> };
>>
>> +/* for KVM_GET/SET_VCPU_EVENTS */
>> +struct kvm_vcpu_events {
>> + struct {
>> + __u8 serror_pending;
>> + __u8 serror_has_esr;
>
>> + /* Align it to 4 bytes */
>> + __u8 pad[2];
>
> (padding noted above)
>
>
>> + __u64 serror_esr;
>> + } exception;
>> +};
>> +
>> /* If you need to interpret the index values, here is the key: */
>> #define KVM_REG_ARM_COPROC_MASK 0x000000000FFF0000
>> #define KVM_REG_ARM_COPROC_SHIFT 16
>
>
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 5c7f657..42e1222 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -277,6 +277,37 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu
>> *vcpu,
>> return -EINVAL;
>> }
>>
>> +int kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
>> + struct kvm_vcpu_events *events)
>> +{
>> + events->exception.serror_pending = (vcpu_get_hcr(vcpu) & HCR_VSE);
>> + events->exception.serror_has_esr =
>> + cpus_have_const_cap(ARM64_HAS_RAS_EXTN) &&
>> + (!!vcpu_get_vsesr(vcpu));
>
>> + events->exception.serror_esr = vcpu_get_vsesr(vcpu);
>
> This will return a stale ESR even if nothing is pending. On systems without
> the
> RAS extensions it will return 'ESR_ELx_ISV' if kvm_inject_vabt() has ever
> been
> called for this CPU.
>
> Could we hide this behind (pending && has_esr), setting it to 0 otherwise.
> This
> is just to avoid exposing the stale value.
Exactly, it is indeed.

>
>
>> +
>> + return 0;
>> +}
>
>> +int kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
>> + struct kvm_vcpu_events *events)
>> +{
>> + bool injected = events->exception.serror_pending;
>> + bool has_esr = events->exception.serror_has_esr;
>> +
>> + if (injected && has_esr) {
>> + if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
>> + return -EINVAL;
>> +
>> + kvm_set_sei_esr(vcpu, events->exception.serror_esr);
>> +
>> + } else if (injected) {
>> + kvm_inject_vabt(vcpu);
>
> Nit: looks like 'injected' is misnamed.

"injected" change to "pending"?

>
>
>> + }
>> +
>> + return 0;
>> +}
>> +
>> int __attribute_const__ kvm_target_cpu(void)
>> {
>> unsigned long implementor = read_cpuid_implementor();
>
>
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index 38c8a64..20e919a 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -82,6 +82,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm,
>> long ext)
>> break;
>> case KVM_CAP_SET_GUEST_DEBUG:
>> case KVM_CAP_VCPU_ATTRIBUTES:
>> + case KVM_CAP_VCPU_EVENTS:
>> r = 1;
>> break;
>> default:
>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 7e3941f..945655d 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -1051,6 +1051,27 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>> return -EFAULT;
>> return kvm_arm_vcpu_has_attr(vcpu, &attr);
>> }
>> + case KVM_GET_VCPU_EVENTS: {
>> + struct kvm_vcpu_events events;
>> +
>> + memset(&events, 0, sizeof(struct kvm_vcpu_events));
>
> sizeof(events) is the normal style here, it means if someone changes
> event's
> type we don't get any surprises...

Ok, thanks

>
>
>> + if (kvm_arm_vcpu_get_events(vcpu, &events))
>> + return -EINVAL;
>> +
>> + if (copy_to_user(argp, &events, sizeof(struct kvm_vcpu_events)))
>
> sizeof(events)
thanks

>
>
>> + return -EFAULT;
>> +
>> + return 0;
>> + }
>> + case KVM_SET_VCPU_EVENTS: {
>> + struct kvm_vcpu_events events;
>> +
>> + if (copy_from_user(&events, argp,
>> + sizeof(struct kvm_vcpu_events)))
>> + return -EFAULT;
>> +
>> + return kvm_arm_vcpu_set_events(vcpu, &events);
>> + }
>> default:
>> return -EINVAL;
>> }
>>
>
> Despite KVM_CAP_VCPU_EVENTS not being advertised on 32bit, any attempt to
> call
> it will still end up in here, but will always fail as the {g,s}et_events()
> calls
> always return -EINVAL. I don't think this will cause us any problems.
What are the things you're worried about?

>
>
> Thanks,
>
> James
> _______________________________________________
> kvmarm mailing list
> kvmarm@xxxxxxxxxxxxxxxxxxxxx
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>