Re: [PATCH 3/3] KVM: x86: always stop emulation on page fault

From: Sean Christopherson
Date: Tue Aug 27 2019 - 10:50:33 EST

+Cc Peng Hao and Yi Wang

On Tue, Aug 27, 2019 at 01:07:09PM +0000, Jan Dakinevich wrote:
> inject_emulated_exception() returns true if and only if nested page
> fault happens. However, page fault can come from guest page tables
> walk, either nested or not nested. In both cases we should stop an
> attempt to read under RIP and give guest to step over its own page
> fault handler.
> Fixes: 6ea6e84 ("KVM: x86: inject exceptions produced by x86_decode_insn")
> Cc: Denis Lunev <den@xxxxxxxxxxxxx>
> Cc: Roman Kagan <rkagan@xxxxxxxxxxxxx>
> Cc: Denis Plotnikov <dplotnikov@xxxxxxxxxxxxx>
> Signed-off-by: Jan Dakinevich <jan.dakinevich@xxxxxxxxxxxxx>
> ---
> arch/x86/kvm/x86.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 93b0bd4..45caa69 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6521,8 +6521,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
>  			if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
>  						emulation_type))
>  				return EMULATE_DONE;
> -			if (ctxt->have_exception && inject_emulated_exception(vcpu))
> +			if (ctxt->have_exception) {
> +				inject_emulated_exception(vcpu);
>  				return EMULATE_DONE;
> +			}

Yikes, this patch and the previous have quite the sordid history.

The non-void return from inject_emulated_exception() was added by commit

ef54bcfeea6c ("KVM: x86: skip writeback on injection of nested exception")

for the purpose of skipping writeback. At the time, the above blob in the
decode flow didn't exist.

Decode exception handling was added by commit

6ea6e84309ca ("KVM: x86: inject exceptions produced by x86_decode_insn")

but it was dead code even then. The patch discussion[1] even points out that
it was dead code, i.e. the change probably should have been reverted.

Peng Hao and Yi Wang later ran into what appears to be the same bug you're
hitting[2][3], and even had patches temporarily queued[4][5], but the
patches never made it to mainline as they broke kvm-unit-tests. Fun side
note, Radim even pointed out[4] the bug fixed by patch 1/3.

So, the patches look correct, but there's the open question of why the
hypercall test was failing for Paolo. I've tried to reproduce the #DF to
no avail.
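
For anyone skimming the thread, here is a rough userspace model of what the
hunk changes in the decode-failure path. It is only a sketch: every name
prefixed with model_ or MODEL_ is hypothetical, the fallthrough result stands
in for the EMULTYPE_SKIP check and handle_emulation_failure(), and
model_inject() assumes the return contract stated in the changelog (true
only for a nested page fault).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum model_result {
	MODEL_DONE,		/* stop emulation, let the guest take the fault */
	MODEL_FALLTHROUGH,	/* continue to the emulation-failure handling */
};

struct model_ctxt {
	bool have_exception;	/* decode produced an exception */
	bool nested_fault;	/* that exception is a nested page fault */
};

/* Assumed contract from the changelog: true only for a nested page fault. */
static bool model_inject(const struct model_ctxt *c)
{
	return c->nested_fault;
}

/* Before the patch: the result hinges on the inject return value. */
static enum model_result model_decode_fail_old(const struct model_ctxt *c)
{
	if (c->have_exception && model_inject(c))
		return MODEL_DONE;
	return MODEL_FALLTHROUGH;
}

/* After the patch: any pending exception stops emulation. */
static enum model_result model_decode_fail_new(const struct model_ctxt *c)
{
	if (c->have_exception) {
		(void)model_inject(c);
		return MODEL_DONE;
	}
	return MODEL_FALLTHROUGH;
}
```

In this model the two variants differ only when have_exception is set and
the fault is not nested, which is the case the changelog is describing.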


>  			if (emulation_type & EMULTYPE_SKIP)
>  				return EMULATE_FAIL;
>  			return handle_emulation_failure(vcpu, emulation_type);
> --
> 2.1.4