Re: [PATCH] x86: kvm: Avoid guest page table walk when gpa_available is set

From: Radim KrÄmÃÅ
Date: Tue Apr 25 2017 - 10:04:06 EST


2017-04-24 17:14-0500, Brijesh Singh:
>> > /* GPA available (AMD only) */
>> > bool gpa_available;
>> > + gpa_t gpa_val;
>>
>> Can't we pass this information through function parameters?
>>
>> (I'd rather avoid intractable variables.)
>>
>
> I also wanted to avoid adding yet another variable but we can't depend on
> cr2 parameters passed into x86_emulate_instruction().
>
> The x86_emulate_instruction() function is called from two places:
>
> 1) handling the page-fault.
> pf_interception [svm.c]
> kvm_mmu_page_fault [mmu.c]
> x86_emulate_instruction [x86.c]
>
> 2) completing the IO/MMIO's from previous instruction decode
> kvm_arch_vcpu_ioctl_run
> complete_emulated_io
> emulate_instruction
> x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)
>
> In #1, we are guaranteed that cr2 variable will contain a valid GPA but
> in #2, CR2 is set to zero.

We are setting up the completion in #1 x86_emulate_instruction(), where
the gpa (cr2) is available, so we could store the value while arming
vcpu->arch.complete_userspace_io.

emulator_read_write_onepage() already saves gpa in frag->gpa, which is
then passed into complete_emulated_mmio -- isn't that mechanism
sufficient?

>> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> > @@ -4159,6 +4159,10 @@ static int handle_exit(struct kvm_vcpu *vcpu)
>> >
>> > vcpu->arch.gpa_available = (exit_code == SVM_EXIT_NPF);
>> >
>> > + /* On #NPF, exit_info_2 contain a valid GPA */
>> > + if (vcpu->arch.gpa_available)
>> > + vcpu->arch.gpa_val = svm->vmcb->control.exit_info_2;
>>
>> How is vcpu->arch.gpa_val used between here and the NPF handler?
>>
>
> handle_exit [svm.c]
> pf_interception [svm.c]
> /* it invokes the fault handler with CR2 = svm->vmcb->control.exit_info_2 */
> kvm_mmu_page_fault [mmu.c]
> x86_emulate_instruction [x86.c]
> emulator_read_write_onepage [x86.c]
> /*
> *this is where we walk the guest page table to translate
> * a GVA to GPA. If gpa_available is set then we use the
> * gpa_val instead of walking the pgtable.
> */

pf_interception is the NPF exit handler -- please move the setting
there, at least. handle_exit() is a hot path that shouldn't contain
code that isn't applicable to all exits.

Btw. we need some other guarantees to declare it as GPA (cr2 is GPA in
NPT exits, but might not be in other) ... isn't arch.mmu.direct_map a
condition we are interested in?

The other code uses it to interpret cr2 directly as gpa, so we might be
able to avoid setting the arch.gpa_available in a hot path too.

>> > @@ -5675,8 +5673,14 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>> > }
>> >
>> > restart:
>> > - /* Save the faulting GPA (cr2) in the address field */
>> > - ctxt->exception.address = cr2;
>> > + /*
>> > + * Save the faulting GPA (cr2) in the address field
>> > + * NOTE: If gpa_available is set then gpa_val will contain a valid GPA
>> > + */
>> > + if (vcpu->arch.gpa_available)
>> > + ctxt->exception.address = vcpu->arch.gpa_val;
>> > + else
>> > + ctxt->exception.address = cr2;
>>
>> And related, shouldn't vcpu->arch.gpa_val be in cr2?
>>
>
> See my previous comment. In some cases CR2 may be set to zero
> (e.g when completing the instruction from previous io/mmio page-fault).
>
> If we are decide to add the gpa_val then we can remove above if
> statement from x86_emulate_instruction() and update emulator_read_write_onepage
> to use the vcpu->arch.gpa_val instead of exception->address.

Yeah, that would be nicer than setting exception->address at a random
place.

We could also pass the value as cr2 in all cases if it made something
better.

> if (vcpu->arch.gpa_available &&
> emulator_can_use_gpa(ctxt) &&
> (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK)) {
> gpa = vcpu=>arch.gpa_val;
> ...
> ...
> }

If at all possible, I'd like to have the gpa passed with other relevant
data, instead of having it isolated like this ... and we can't manage
that, then at least good benchmark results to excuse the bad code.

Thanks.