Re: [RFC] AMD VM crashing on deferred memory error injection

From: Yazen Ghannam

Date: Thu Feb 12 2026 - 14:31:31 EST


On Thu, Feb 12, 2026 at 04:36:47PM +0100, William Roche wrote:
> On 2/11/26 17:34, Yazen Ghannam wrote:
> > On Wed, Feb 11, 2026 at 02:42:07AM +0100, William Roche wrote:
> > > On 2/9/26 22:18, Yazen Ghannam wrote:
> > > > On Mon, Feb 09, 2026 at 04:08:19PM -0500, Yazen Ghannam wrote:
> > > > [...]
> > > > Linux:
> > > > arch/x86/kvm/x86.c : set_msr_mce()
> > > >
> > > > Please note the comment:
> > > > "All CPUs allow writing 0 to MCi_STATUS MSRs to clear the MSR."
> > > >
> > > > We should include the MCA_DESTAT register range here.
> > > >
> > > > What do you think?
> > >
> > > But before trying to update the set_msr_mce() function, I don't think
> > > that KVM keeps track of an MSR_AMD64_SMCA_MCx_DESTAT set of registers.
> > > I can see mce_banks (for ctl, status, addr and misc) and mci_ctl2_banks
> > > locations in struct kvm_vcpu_arch, but I don't see a location for SMCA
> > > banks like MCA_DESTAT MSRs.
> > >
> > > So if we make kvm ignore this update instead of raising a #GP error,
> > > would it be a valid solution ?
> > >
> >
> > Yes, I think so. And the details depend on how much of the platform
> > needs to be emulated.
> >
> > Some ideas in increasing order of complexity:
> >
> > 1) Ignore this register write.
> >
> > 2) Do a basic validity check.
> > Allow "write 0 to MCA_DESTAT" and #GP for any other value.
> > Don't need to save MCA_DESTAT values.
> >
> > 3) Replicate the full platform behavior akin to MCi_STATUS.
> > Would need to save MCA_DESTAT values and do a "HWCR" bit check.
> >
> > I really don't think we want #3. This would be useful for "register-based
> > error injection/simulation". But that use case wouldn't do much with the
> > MCA_DESTAT register without all the related Deferred error interrupt
> > infrastructure.
> >
> > So I say the choice is between #1 and #2.
>
>
> Thinking more about the problem introduced by your commit, I realized
> that only SMCA systems have MCA_DESTAT registers. So we should not
> allow access to this register from a non SMCA machine.
> And Qemu AMD VM is an example of a non SMCA machine !
>

So the SMCA CPUID bit is not advertised in this model?

> So according to me, modifying the hypervisor kvm to allow the access
> to MCA_DESTAT is clearly not the right move.
>
> We probably should implement an entire SMCA stack for Qemu, but this
> is another topic...
> For the moment, Borislav Petkov was right when he said that kvm works
> as advertised. The problem that your fix introduced is that it tries to
> access SMCA only registers on a non SMCA machine.
>
> Do you agree on this aspect ?
>

Yes, I agree.

AMD systems generally have a Read-as-Zero/Writes-Ignored behavior when
accessing unimplemented MCA registers. But this requires the system to
recognize the register space.

In this case, the register space is totally unknown to the system, so it
responds with a #GP.
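
To make the distinction concrete, here is a toy C model of the two
behaviors. It is only a sketch, not kernel or hardware code: every name
in it (toy_platform, toy_write_destat, and so on) is made up for
illustration, and it assumes a single MCA_DESTAT register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of a modeled MSR access; not a kernel type. */
enum toy_msr_result { TOY_MSR_OK, TOY_MSR_GP };

/* Hypothetical platform model with one MCA_DESTAT register. */
struct toy_platform {
	bool decodes_smca_space;  /* Is the SMCA register space recognized? */
	bool destat_implemented;  /* Is MCA_DESTAT backed by real storage? */
	uint64_t destat;
};

static enum toy_msr_result toy_write_destat(struct toy_platform *p, uint64_t val)
{
	assert(p != NULL);

	/* Register space unknown to the system: the access faults. */
	if (!p->decodes_smca_space)
		return TOY_MSR_GP;

	/* Recognized but unimplemented: the write is silently ignored. */
	if (p->destat_implemented)
		p->destat = val;

	return TOY_MSR_OK;
}

static enum toy_msr_result toy_read_destat(const struct toy_platform *p, uint64_t *val)
{
	assert(p != NULL && val != NULL);

	if (!p->decodes_smca_space)
		return TOY_MSR_GP;

	/* Recognized but unimplemented: read-as-zero. */
	*val = p->destat_implemented ? p->destat : 0;
	return TOY_MSR_OK;
}
```

In this model, the guest described above corresponds to
decodes_smca_space == false, which is why the write faults instead of
being dropped.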

> If yes, then the correct change is to test if we are on an SMCA machine
> before accessing this register, like:
>
> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
> index 3f1dda355307..8664ba048a62 100644
> --- a/arch/x86/kernel/cpu/mce/amd.c
> +++ b/arch/x86/kernel/cpu/mce/amd.c
> @@ -875,14 +875,18 @@ void amd_clear_bank(struct mce *m)
> {
> amd_reset_thr_limit(m->bank);
>
> - /* Clear MCA_DESTAT for all deferred errors even those logged in MCA_STATUS. */
> - if (m->status & MCI_STATUS_DEFERRED)
> - mce_wrmsrq(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0);
> -
> - /* Don't clear MCA_STATUS if MCA_DESTAT was used exclusively. */
> - if (m->kflags & MCE_CHECK_DFR_REGS)
> - return;
> + if (mce_flags.smca) {
> + /*
> + * Clear MCA_DESTAT for all deferred errors even those
> + * logged in MCA_STATUS.
> + */
> + if (m->status & MCI_STATUS_DEFERRED)
> + mce_wrmsrq(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0);
>
> + /* Don't clear MCA_STATUS if MCA_DESTAT was used exclusively. */
> + if (m->kflags & MCE_CHECK_DFR_REGS)
> + return;
> + }
> mce_wrmsrq(mca_msr_reg(m->bank, MCA_STATUS), 0);
> }
>
>
> I haven't noticed any obvious other non SMCA limitation in the other
> changes of this series, but if you think about any other case, we can
> probably fix all of them together.
>
> If you agree with this change I can submit it as a formal PATCH.
>

I think this change is fair. It could be minimized further by adding the
SMCA check to the status bit check for the WRMSR step.

	if (mce_flags.smca && (m->status & MCI_STATUS_DEFERRED))
		mce_wrmsrq(MSR_AMD64_SMCA_MCx_DESTAT(m->bank), 0);
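
For reference, the resulting control flow can be modeled in a small
user-space sketch. This is not the kernel function: the sim_* names,
the flag values, and the bank count are all stand-ins invented for
illustration, and the simulated CPU just counts a fault when
MCA_DESTAT is written without SMCA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag values for this sketch only; not taken from kernel headers. */
#define SIM_STATUS_DEFERRED  (1ULL << 44)
#define SIM_CHECK_DFR_REGS   (1U << 0)
#define SIM_NUM_BANKS        4

/* Hypothetical, trimmed-down stand-in for struct mce. */
struct sim_mce {
	uint64_t status;
	unsigned int kflags;
	unsigned int bank;
};

/* Hypothetical CPU model: MCA_DESTAT only exists when smca is true. */
struct sim_cpu {
	bool smca;
	uint64_t mca_status[SIM_NUM_BANKS];
	uint64_t mca_destat[SIM_NUM_BANKS];
	unsigned int gp_faults;
};

static void sim_write_destat(struct sim_cpu *cpu, unsigned int bank, uint64_t val)
{
	assert(bank < SIM_NUM_BANKS);

	/* A non-SMCA machine does not know this register: count a #GP. */
	if (!cpu->smca) {
		cpu->gp_faults++;
		return;
	}
	cpu->mca_destat[bank] = val;
}

static void sim_clear_bank(struct sim_cpu *cpu, const struct sim_mce *m)
{
	assert(m->bank < SIM_NUM_BANKS);

	/* Only touch MCA_DESTAT on SMCA systems, and only for deferred errors. */
	if (cpu->smca && (m->status & SIM_STATUS_DEFERRED))
		sim_write_destat(cpu, m->bank, 0);

	/* Don't clear MCA_STATUS if MCA_DESTAT was used exclusively. */
	if (m->kflags & SIM_CHECK_DFR_REGS)
		return;

	cpu->mca_status[m->bank] = 0;
}
```

With the SMCA test folded into the first condition, the
MCE_CHECK_DFR_REGS check stays where it is. That assumes the flag is
never set on a non-SMCA system, which I have not verified here.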

Thanks,
Yazen