Re: [RFC] AMD VM crashing on deferred memory error injection

From: William Roche

Date: Thu Feb 12 2026 - 10:37:41 EST


On 2/11/26 17:34, Yazen Ghannam wrote:
On Wed, Feb 11, 2026 at 02:42:07AM +0100, William Roche wrote:
On 2/9/26 22:18, Yazen Ghannam wrote:
On Mon, Feb 09, 2026 at 04:08:19PM -0500, Yazen Ghannam wrote:
[...]
Linux:
arch/x86/kvm/x86.c : set_msr_mce()

Please note the comment:
"All CPUs allow writing 0 to MCi_STATUS MSRs to clear the MSR."

We should include the MCA_DESTAT register range here.

What do you think?

But before trying to update the set_msr_mce() function, I don't think
that KVM keeps track of an MSR_AMD64_SMCA_MCx_DESTAT set of registers.
I can see mce_banks (for ctl, status, addr and misc) and mci_ctl2_banks
locations in struct kvm_vcpu_arch, but I don't see a location for SMCA
banks like MCA_DESTAT MSRs.

So if we make KVM ignore this write instead of raising a #GP error,
would that be a valid solution?


Yes, I think so. And the details depend on how much of the platform
needs to be emulated.

Some ideas in increasing order of complexity:

1) Ignore this register write.

2) Do a basic validity check.
Allow "write 0 to MCA_DESTAT" and #GP for any other value.
Don't need to save MCA_DESTAT values.

3) Replicate the full platform behavior akin to MCi_STATUS.
Would need to save MCA_DESTAT values and do a "HWCR" bit check.

I really don't think we want #3. This would be useful for "register-based
error injection/simulation". But that use case wouldn't do much with the
MCA_DESTAT register without all the related Deferred error interrupt
infrastructure.

So I say the choice is between #1 and #2.
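
For illustration only, option 2 could look roughly like the user-space
sketch below. It is not KVM code: the sketch_* names are hypothetical,
and the MSR base and per-bank stride are assumed to match
MSR_AMD64_SMCA_MC0_DESTAT and the 0x10 spacing in msr-index.h, which
should be checked before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (see msr-index.h); hypothetical names for this sketch. */
#define SKETCH_MC0_DESTAT	0xc0002008u
#define SKETCH_BANK_STRIDE	0x10u

/* True if msr is the MCA_DESTAT register of a bank below nr_banks. */
static bool sketch_is_destat(uint32_t msr, unsigned int nr_banks)
{
	uint32_t off;

	if (msr < SKETCH_MC0_DESTAT)
		return false;

	off = msr - SKETCH_MC0_DESTAT;
	return (off % SKETCH_BANK_STRIDE) == 0 &&
	       (off / SKETCH_BANK_STRIDE) < nr_banks;
}

/*
 * Option 2: accept "write 0" and reject everything else.
 * Returns 0 when the write is accepted, 1 when the caller should
 * inject #GP. Nothing is stored, so reads need separate handling.
 */
static int sketch_set_destat(uint32_t msr, uint64_t data, unsigned int nr_banks)
{
	if (!sketch_is_destat(msr, nr_banks))
		return 1;

	return data == 0 ? 0 : 1;
}
```

Option 1 would be the same function returning 0 for any data value
once the MSR is recognized as a MCA_DESTAT register.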


Thinking more about the problem introduced by your commit, I realized
that only SMCA systems have MCA_DESTAT registers. So we should not
allow access to this register from a non-SMCA machine.
And a QEMU AMD VM is an example of a non-SMCA machine!

So in my opinion, modifying KVM to allow access to MCA_DESTAT is
clearly not the right move.

We should probably implement an entire SMCA stack for QEMU, but this
is another topic...
For the moment, Borislav Petkov was right when he said that KVM works
as advertised. The problem that your fix introduced is that it tries to
access SMCA-only registers on a non-SMCA machine.

Do you agree on this point?

If yes, then the correct change is to test whether we are on an SMCA
machine before accessing this register, like:

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 3f1dda355307..8664ba048a62 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -875,14 +875,18 @@ void amd_clear_bank(struct mce *m)
 {
 	amd_reset_thr_limit(m->bank);

-	/* Clear MCA_DESTAT for all deferred errors even those logged in MCA_STATUS. */
-	if (m->status & MCI_STATUS_DEFERRED)
-		mce_wrmsrq(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0);
-
-	/* Don't clear MCA_STATUS if MCA_DESTAT was used exclusively. */
-	if (m->kflags & MCE_CHECK_DFR_REGS)
-		return;
+	if (mce_flags.smca) {
+		/*
+		 * Clear MCA_DESTAT for all deferred errors even those
+		 * logged in MCA_STATUS.
+		 */
+		if (m->status & MCI_STATUS_DEFERRED)
+			mce_wrmsrq(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0);

+		/* Don't clear MCA_STATUS if MCA_DESTAT was used exclusively. */
+		if (m->kflags & MCE_CHECK_DFR_REGS)
+			return;
+	}
 	mce_wrmsrq(mca_msr_reg(m->bank, MCA_STATUS), 0);
 }
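
To sanity-check the control flow of the guarded version outside the
kernel, a small user-space model like the one below may help. It is
only a sketch: the sketch_* names are hypothetical, the flag values are
placeholders rather than the kernel definitions, and MSR writes are
replaced by counters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag values for this model, not the kernel definitions. */
#define SKETCH_STATUS_DEFERRED	(1ull << 44)
#define SKETCH_CHECK_DFR_REGS	(1u << 0)

/* Only the two fields the guarded function looks at. */
struct sketch_mce {
	uint64_t status;
	unsigned int kflags;
};

/* Counts the MSR writes the real function would have issued. */
struct sketch_writes {
	int destat;
	int status;
};

/* Mirrors the proposed amd_clear_bank() flow; smca stands in for mce_flags.smca. */
static struct sketch_writes sketch_clear_bank(const struct sketch_mce *m, bool smca)
{
	struct sketch_writes w = { 0, 0 };

	if (smca) {
		/* MCA_DESTAT is only touched on SMCA systems. */
		if (m->status & SKETCH_STATUS_DEFERRED)
			w.destat++;

		/* MCA_DESTAT was used exclusively: leave MCA_STATUS alone. */
		if (m->kflags & SKETCH_CHECK_DFR_REGS)
			return w;
	}

	w.status++;
	return w;
}
```

With smca set to false the model never counts a MCA_DESTAT write and
always clears MCA_STATUS, which is the behavior wanted for a non-SMCA
guest.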


I haven't noticed any other obvious non-SMCA limitation in the other
changes of this series, but if you can think of any other case, we can
probably fix all of them together.

If you agree with this change I can submit it as a formal PATCH.

Thanks in advance for your feedback.
William.