Re: [PATCH v2] x86/mce: fix wrong no-return-ip logic in do_machine_check()
From: Borislav Petkov
Date: Tue Feb 23 2021 - 04:44:07 EST
On Tue, Feb 23, 2021 at 10:27:55AM +0800, Aili Yao wrote:
> When a guest accesses an address with a UE error, it exits guest mode
> and the host does the recovery job. A SIGBUS is then sent to the VCPU
> and qemu catches the signal. The signal carries only the address and
> error level, not RIPV, so qemu assumes RIPV is cleared and injects
> the error into the guest OS.
Lemme see:

void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)

	/* If we get an action required MCE, it has been injected by KVM
	 * while the VM was running. An action optional MCE instead should
	 * be coming from the main thread, which qemu_init_sigbus identifies
	 * as the "early kill" thread.
	 */
	assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO);
	...
	kvm_mce_inject(cpu, paddr, code);
in that function:

	if (code == BUS_MCEERR_AR) {
		status |= MCI_STATUS_AR | 0x134;
		mcg_status |= MCG_STATUS_EIPV;
	} else {
		status |= 0xc0;
		mcg_status |= MCG_STATUS_RIPV;
	}
That looks like a valid RIP bit to me. Then cpu_x86_inject_mce() gets
that mcg_status and injects it into the guest.
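
For anyone who wants to poke at which mcg_status bits each signal code ends up
with, here is a minimal standalone sketch that mirrors only the mcg_status half
of the branch quoted above. It is not the qemu code: the sketch_* names and the
enum are hypothetical stand-ins for the BUS_MCEERR_* si_code values, and
starting from mcg_status = 0 is an assumption, since the initial value is not
shown in the quoted snippet. The two bit positions are the architectural
IA32_MCG_STATUS ones.

```c
#include <stdint.h>

/* Architectural IA32_MCG_STATUS bits. */
#define MCG_STATUS_RIPV (1ULL << 0)	/* restart IP valid */
#define MCG_STATUS_EIPV (1ULL << 1)	/* error IP valid */

/* Hypothetical stand-ins for the BUS_MCEERR_AR/AO si_code values. */
enum sketch_code { SKETCH_MCEERR_AR, SKETCH_MCEERR_AO };

/*
 * Mirrors only the mcg_status half of the quoted branch.
 * Assumption: mcg_status starts at 0 here; the real initial
 * value is not part of the quoted snippet.
 */
static uint64_t sketch_mcg_status(enum sketch_code code)
{
	uint64_t mcg_status = 0;

	if (code == SKETCH_MCEERR_AR)
		mcg_status |= MCG_STATUS_EIPV;
	else
		mcg_status |= MCG_STATUS_RIPV;

	return mcg_status;
}

/* Returns 1 if the RIPV bit is set for this code, 0 otherwise. */
static int sketch_ripv_set(enum sketch_code code)
{
	return (sketch_mcg_status(code) & MCG_STATUS_RIPV) != 0;
}
```

Calling sketch_ripv_set() once per code shows which of the two branches sets
the RIPV bit and which sets EIPV instead.
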
So I can't follow your claim - qemu does handle RIPV just fine, it
seems.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette