Re: [PATCH v2] KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously

From: Radim KrÄmÃÅ
Date: Thu Sep 14 2017 - 12:52:44 EST


2017-09-14 03:54-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> qemu-system-x86-8600 [004] d..1 7205.687530: kvm_entry: vcpu 2
> qemu-system-x86-8600 [004] .... 7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e
> qemu-system-x86-8600 [004] .... 7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0
> qemu-system-x86-8600 [004] .... 7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e
> qemu-system-x86-8600 [004] .N.. 7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018
> kworker/4:2-7814 [004] .... 7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000
> qemu-system-x86-8600 [004] .... 7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018
> qemu-system-x86-8600 [004] d..1 7205.687711: kvm_entry: vcpu 2
>
> After running some memory intensive workload in guest, I catch the kworker
> which completes the GUP too quickly, and queues an "Page Ready" #PF exception
> after the "Page not Present" exception before the next vmentry as the above
> trace which will result in #DF injected to guest.

The #DF feature can bite us in other cases as well, e.g. when emulating
an instruction that throws #GP/#UD.

Can't we replace all non-#PF exceptions with the PV #PF?
Doing so should be wrong only for trap exceptions and we currently just
override them anyway, so we wouldn't regress. :)

> This patch fixes it by clearing the queue for "Page not Present" if "Page Ready"
> occurs before the next vmentry since the GUP has already got the required page
> and shadow page table has already been fixed by "Page Ready" handler.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim KrÄmÃÅ <rkrcmar@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> ---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -8653,15 +8661,26 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> kvm_del_async_pf_gfn(vcpu, work->arch.gfn);
> trace_kvm_async_pf_ready(work->arch.token, work->gva);
>
> - if ((vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) &&
> - !apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
> - fault.vector = PF_VECTOR;
> - fault.error_code_valid = true;
> - fault.error_code = 0;
> - fault.nested_page_fault = false;
> - fault.address = work->arch.token;
> - fault.async_page_fault = true;
> - kvm_inject_page_fault(vcpu, &fault);
> + if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) {
> + if (!apf_get_user(vcpu, &val)) {

I removed one indentation level when applying by merging these two
condition.

> + if (val == KVM_PV_REASON_PAGE_NOT_PRESENT &&
> + vcpu->arch.exception.pending &&
> + vcpu->arch.exception.nr == PF_VECTOR &&
> + !apf_put_user(vcpu, 0)) {
> + vcpu->arch.exception.pending = false;

We know that vcpu->arch.exception.injected is false here, but I cleared
it too for safety, thanks.

> + vcpu->arch.exception.nr = 0;
> + vcpu->arch.exception.has_error_code = false;
> + vcpu->arch.exception.error_code = 0;
> + } else if (!apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
> + fault.vector = PF_VECTOR;
> + fault.error_code_valid = true;
> + fault.error_code = 0;
> + fault.nested_page_fault = false;
> + fault.address = work->arch.token;
> + fault.async_page_fault = true;
> + kvm_inject_page_fault(vcpu, &fault);
> + }
> + }
> }
> vcpu->arch.apf.halted = false;
> vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> --
> 2.7.4
>