Re: [PATCH] x86/bugs: Adjust SRSO mitigation to new features

From: Sean Christopherson
Date: Tue Nov 05 2024 - 13:10:31 EST


On Tue, Nov 05, 2024, Borislav Petkov wrote:
> On Mon, Nov 04, 2024 at 04:57:20PM -0800, Sean Christopherson wrote:
> > scripts/get_maintainer.pl :-)
>
> That's what I used but I pruned the list.
>
> Why, did I miss anyone?

All of the actual maintainers. AFAIK, Paolo doesn't subscribe to kvm@.

> > But why do this in KVM? E.g. why not set-and-forget in init_amd_zen4()?
>
> Because there's no need to impose an unnecessary - albeit small - perf impact
> on users who don't do virt.
>
> I'm currently gravitating towards the MSR toggling thing, i.e., only when the
> VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
> yes, the toggling thing sounds better. I.e., set it only when really needed.
>
> > Shouldn't these be two separate patches? AFAICT, while the two are related,
> > there are no strict dependencies between SRSO_USER_KERNEL_NO and
> > SRSO_MSR_FIX.
>
> Meh, I can split them if you really want me to.

I do.

> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index 9df3e1e5ae81..03f29912a638 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
> > > kvm_cpu_svm_disable();
> > >
> > > amd_pmu_disable_virt();
> > > +
> > > + if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> > > + msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
> >
> > I don't like assuming the state of hardware. E.g. if MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT
> > was already set, then KVM shouldn't clear it.
>
> Right, I don't see that happening tho. If we have to sync the toggling of this
> bit between different places, we'll have to do some dance but so far its only
> user is KVM.
>
> > KVM's usual method of restoring host MSRs is to snapshot the MSR into
> > "struct kvm_host_values" on module load, and then restore from there as
> > needed. But that assumes all CPUs have the same value, which might not be
> > the case here?
>
> Yes, the default value is 0 out of reset and it should be set on each logical
> CPU whenever we run VMs on it. I'd love to make it part of the VMRUN microcode
> but... :-)
>
> > All that said, I'd still prefer that MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT is set
> > during boot, unless there's a good reason not to do so.
>
> Yeah, unnecessary penalty on machines not running virt.

What does the bit actually do? I can't find any useful documentation, and the
changelog is equally useless.