Re: [PATCH v9 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
From: Jim Mattson
Date: Tue Apr 07 2026 - 14:43:35 EST
On Tue, Apr 7, 2026 at 10:12 AM Pawan Gupta
<pawan.kumar.gupta@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, Apr 07, 2026 at 09:46:07AM -0700, Jim Mattson wrote:
> > On Tue, Apr 7, 2026 at 9:40 AM Pawan Gupta
> > <pawan.kumar.gupta@xxxxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Apr 06, 2026 at 07:23:25AM -0700, Jim Mattson wrote:
> > > > Yes, but the guest needs a way to determine whether the hypervisor
> > > > will do what's necessary to make the short sequence effective. And, in
> > > > particular, no KVM hypervisor today is prepared to do that.
> > > >
> > > > When running under a hypervisor, without BHI_CTRL and without any
> > > > evidence to the contrary, the guest must assume that the longer
> > > > sequence is necessary. At the very least, we need a CPUID or MSR bit
> > > > that says, "the short BHB clearing sequence is adequate for this
> > > > vCPU."
> > >
> > > After discussing this internally, the consensus is that the best path
> > > forward is to add virtual SPEC_CTRL support to KVM, which also aligns with
> > > Intel's guidance. In the long term, virtual SPEC_CTRL can benefit future
> > > mitigations as well. As with many other mitigations (e.g. microcode), the
> > > guest would rely on the host to enforce the appropriate protections.
> >
> > I don't think it's reasonable for the guest to rely on a future
> > implementation to enforce the appropriate protections.
> >
> > This is already a problem today. If a guest sees that BHI_CTRL is
> > unavailable, it will deploy the short BHB clearing sequence and
> > declare that the vulnerability is mitigated. That isn't true if the
> > guest is running on Alder Lake or newer.
>
> In any case, there is a change required in the kernel either for the guest
> or the host, they both are future implementations. Why not implement the
> one that is more future proof.

There will always be old hypervisors. True future-proofing requires
that the guest be able to distinguish an old hypervisor from a new
one.

My proposal is as follows:

1. The (advanced) hypervisor can advertise to the guest (via CPUID bit
   or MSR bit) that the short BHB clearing sequence is adequate. This may
   mean either that the VM will only be hosted on pre-Alder Lake hardware
   or that the hypervisor will set BHI_DIS_S behind the back of the
   guest. Presumably, this bit would not be reported if BHI_CTRL is
   advertised to the guest.

2. If the guest sees this bit, then it can use the short sequence. If
   it doesn't see this bit, it must use the long sequence.
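
To make the two steps above concrete, here is a rough sketch of the
decision logic on each side. It is illustrative only: every identifier
below is hypothetical, no such CPUID/MSR bit exists today, and the
"use BHI_DIS_S directly" outcome for a guest that does see BHI_CTRL is
my assumption rather than part of the proposal.

```c
#include <stdbool.h>

/* Hypothetical outcomes for a guest; not existing kernel names. */
enum guest_bhi_choice {
	GUEST_BHI_USE_HW_CTRL,	/* assumption: guest sees BHI_CTRL, sets BHI_DIS_S itself */
	GUEST_BHI_SHORT_SEQ,	/* short BHB clearing sequence */
	GUEST_BHI_LONG_SEQ,	/* long BHB clearing sequence */
};

/*
 * Step 1 (hypervisor side): decide whether to report the hypothetical
 * "short sequence is adequate" bit. It is reported only when BHI_CTRL
 * is not advertised to the guest, and only when the VM is confined to
 * pre-Alder Lake hosts or the hypervisor sets BHI_DIS_S on the guest's
 * behalf.
 */
static bool hv_report_short_seq_ok(bool advertises_bhi_ctrl,
				   bool pre_alder_lake_hosts_only,
				   bool hv_sets_bhi_dis_s)
{
	if (advertises_bhi_ctrl)
		return false;
	return pre_alder_lake_hosts_only || hv_sets_bhi_dis_s;
}

/*
 * Step 2 (guest side): without the bit, an old hypervisor cannot be
 * told apart from a new one, so the long sequence is the safe default.
 */
static enum guest_bhi_choice guest_pick_bhi_mitigation(bool has_bhi_ctrl,
							bool short_seq_ok)
{
	if (has_bhi_ctrl)
		return GUEST_BHI_USE_HW_CTRL;
	return short_seq_ok ? GUEST_BHI_SHORT_SEQ : GUEST_BHI_LONG_SEQ;
}
```

The point of the sketch is the default: a guest on an old hypervisor
never sees the bit, so it falls through to the long sequence without
the hypervisor having to do anything.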