Re: [PATCH v3 0/6] KVM: SVM: Fix DEBUGCTL bugs
From: Maxim Levitsky
Date: Wed Apr 09 2025 - 16:53:03 EST
On Tue, 2025-04-08 at 15:43 -0700, Sean Christopherson wrote:
> On Tue, Apr 01, 2025, Maxim Levitsky wrote:
> > On Thu, 2025-02-27 at 14:24 -0800, Sean Christopherson wrote:
> > > Fix a long-lurking bug in SVM where KVM runs the guest with the host's
> > > DEBUGCTL if LBR virtualization is disabled. AMD CPUs rather stupidly
> > > context switch DEBUGCTL if and only if LBR virtualization is enabled (not
> > > just supported, but fully enabled).
> > >
> > > The bug has gone unnoticed because until recently, the only bits that
> > > KVM would leave set were things like BTF, which are guest visible but
> > > won't cause functional problems unless guest software is being especially
> > > particular about #DBs.
> > >
> > > The bug was exposed by the addition of BusLockTrap ("Detect" in the kernel),
> > > as the resulting #DBs due to split-lock accesses in guest userspace (lol
> > > Steam) get reflected into the guest by KVM.
> > >
> > > Note, I don't love suppressing DEBUGCTL.BTF, but practically speaking that's
> > > likely the behavior that SVM guests have gotten the vast, vast majority of
> > > the time, and given that it's the behavior on Intel, it's (hopefully) a safe
> > > option for a fix, e.g. versus trying to add proper BTF virtualization on the
> > > fly.
> > >
> > > v3:
> > > - Suppress BTF, as KVM doesn't actually support it. [Ravi]
> > > - Actually load the guest's DEBUGCTL (though amusingly, with BTF squashed,
> > > it's guaranteed to be '0' in this scenario). [Ravi]
> > >
> > > v2:
> > > - Load the guest's DEBUGCTL instead of simply zeroing it on VMRUN.
> > > - Drop bits 5:3 from guest DEBUGCTL so that KVM doesn't let the guest
> > > unintentionally enable BusLockTrap (AMD repurposed bits). [Ravi]
> > > - Collect a review. [Xiaoyao]
> > > - Make bits 5:3 fully reserved, in a separate not-for-stable patch.
> > >
> > > v1: https://lore.kernel.org/all/20250224181315.2376869-1-seanjc@xxxxxxxxxx
> > >
> >
> > Hi,
> >
> > Amusingly there is another DEBUGCTL issue, which I just got to the bottom of.
> > (if I am not mistaken of course).
> >
> > We currently don't let the guest set DEBUGCTL.FREEZE_WHILE_SMM and neither
> > set it ourselves in GUEST_IA32_DEBUGCTL vmcs field, even when supported by the host
> > (If I read the code correctly, I didn't verify this in runtime)
>
> Ugh, SMM. Yeah, KVM doesn't propagate DEBUGCTLMSR_FREEZE_IN_SMM to the guest
> value. KVM intercepts reads and writes to DEBUGCTL, so it should be easy enough
> to shove the bit in on writes, and drop it on reads.
>
> > This means that the host #SMIs will interfere with the guest PMU. In
> > particular this causes the 'pmu' kvm-unit-test to fail, which is something
> > that our CI caught.
> >
> > I think that kvm should just set this bit, or even better, use the host value
> > of this bit, and hide it from the guest, because the guest shouldn't know
> > about host's smm, and we AFAIK don't really support freezing perfmon when the
> > guest enters its own emulated SMM.
>
> Agreed. Easy thing is to use the host's value, so that KVM doesn't need to check
> for its existence. I can't think of anything that would go sideways by freezing
> perfmon if the host happens to take an SMI.
>
> > What do you think? I'll post patches if you think that this is a good idea.
> > (A temp hack to set this bit always in GUEST_IA32_DEBUGCTL fixed the problem for me)
> >
> > I also need to check if AMD also has this feature, or if this is Intel specific.
>
> Intel only. I assume/think/hope AMD's Host/Guest Only field in the event selector
> effectively hides SMM from the guest.
>
Hi,
I will post a patch soon then. I just got my hands on the CI machine where the test failed
and yes, the machine receives about 8 #SMIs per second on each core. Oh well...
BTW pmu_counters_test selftest is also affected since it counts # of retired instructions.
With #SMI getting in the way, the number of course soars.
It doesn't fail often at this rate but it does when the test test is done for sufficient
number or times or you just get lucky.
Best regards,
Maxim Levitsky