Re: [PATCH v8 00/10] VMSCAPE optimization for BHI variant

From: Jon Kohler

Date: Sun Mar 29 2026 - 23:18:36 EST




> On Mar 24, 2026, at 2:16 PM, Pawan Gupta <pawan.kumar.gupta@xxxxxxxxxxxxxxx> wrote:
>
> v8:
> - Use helper in KVM to convey the mitigation status. (PeterZ/Borislav)
> - Fix the documentation for default vmscape mitigation. (BPF bot)
> - Remove the stray lines in bug.c (BPF bot).
> - Updated commit messages and comments.
> - Rebased to v7.0-rc5.
>
> v7: https://lore.kernel.org/r/20260319-vmscape-bhb-v7-0-b76a777a98af@xxxxxxxxxxxxxxx
> - s/This allows/Allow/ and s/This does adds/This adds/ in patch 1/10 commit
> message (Borislav).
> - Minimize register usage in BHB clearing seq. (David Laight)
> - Instead of separate ecx/eax counters, use al/ah.
> - Adjust the alignment of RET due to register size change.
> - save/restore rax in the seq itself.
> - Remove the save/restore of rax/rcx for BPF callers.
> - Rename clear_bhb_loop() to clear_bhb_loop_nofence() to make it
> obvious that the LFENCE is not part of the sequence (Borislav).
> - Fix Kconfig: s/select/depends on/ HAVE_STATIC_CALL (PeterZ).
> - Rebased to v7.0-rc4.
>
> v6: https://lore.kernel.org/r/20251201-vmscape-bhb-v6-0-d610dd515714@xxxxxxxxxxxxxxx
> - Remove semicolon at the end of asm in ALTERNATIVE (Uros).
> - Fix build warning in vmscape_select_mitigation() (LKP).
> - Rebased to v6.18.
>
> v5: https://lore.kernel.org/r/20251126-vmscape-bhb-v5-2-02d66e423b00@xxxxxxxxxxxxxxx
> - For BHI seq, limit runtime-patching to loop counts only (Dave).
> Dropped 2 patches that moved the BHB seq to a macro.
> - Remove redundant switch cases in vmscape_select_mitigation() (Nikolay).
> - Improve commit message (Nikolay).
> - Collected tags.
>
> v4: https://lore.kernel.org/r/20251119-vmscape-bhb-v4-0-1adad4e69ddc@xxxxxxxxxxxxxxx
> - Move LFENCE to the callsite, out of clear_bhb_loop(). (Dave)
> - Make clear_bhb_loop() work for larger BHB. (Dave)
> This now uses hardware enumeration to determine the BHB size to clear.
> - Use write_ibpb() instead of indirect_branch_prediction_barrier() when
> IBPB is known to be available. (Dave)
> - Use static_call() to simplify mitigation at exit-to-userspace. (Dave)
> - Refactor vmscape_select_mitigation(). (Dave)
> - Fix vmscape=on which was wrongly behaving as AUTO. (Dave)
> - Split the patches. (Dave)
> - Patches 1-4 prepare for making the sequence flexible for VMSCAPE use.
> - Patch 5 is a trivial rename of a variable.
> - Patches 6-8 prepare for deploying the BHB mitigation for VMSCAPE.
> - Patch 9 deploys the mitigation.
> - Patches 10-11 fix ON vs AUTO mode.
>
> v3: https://lore.kernel.org/r/20251027-vmscape-bhb-v3-0-5793c2534e93@xxxxxxxxxxxxxxx
> - s/x86_pred_flush_pending/x86_predictor_flush_exit_to_user/ (Sean).
> - Removed IBPB & BHB-clear mutual exclusion at exit-to-userspace.
> - Collected tags.
>
> v2: https://lore.kernel.org/r/20251015-vmscape-bhb-v2-0-91cbdd9c3a96@xxxxxxxxxxxxxxx
> - Added check for IBPB feature in vmscape_select_mitigation(). (David)
> - s/vmscape=auto/vmscape=on/ (David)
> - Added patch to remove LFENCE from VMSCAPE BHB-clear sequence.
> - Rebased to v6.18-rc1.
>
> v1: https://lore.kernel.org/r/20250924-vmscape-bhb-v1-0-da51f0e1934d@xxxxxxxxxxxxxxx
>
> Hi All,
>
> These patches aim to improve the performance of a recent mitigation for
> the VMSCAPE[1] vulnerability. The improvement is relevant to the BHI variant
> of VMSCAPE, which affects Alder Lake and newer processors.
>
> The current mitigation approach uses IBPB on kvm-exit-to-userspace for the
> whole range of affected CPUs. This is overkill for CPUs that are only
> affected by the BHI variant. On such CPUs, clearing the branch history is
> sufficient to mitigate VMSCAPE, and is also more apt, as the underlying
> issue is due to poisoned branch history.
>
> Below is the iPerf data for transfer between guest and host, comparing IBPB
> and BHB-clear mitigation. BHB-clear shows performance improvement over IBPB
> in most cases.
>
> Platform: Emerald Rapids
> Baseline: vmscape=off
> Target: IBPB at VMexit-to-userspace vs the new BHB-clear at
> VMexit-to-userspace mitigation (both compared against the baseline).
>
> (pN = N parallel connections)
>
> | iPerf user-net | IBPB | BHB Clear |
> |----------------|---------|-----------|
> | UDP 1-vCPU_p1 | -12.5% | 1.3% |
> | TCP 1-vCPU_p1 | -10.4% | -1.5% |
> | TCP 1-vCPU_p1 | -7.5% | -3.0% |
> | UDP 4-vCPU_p16 | -3.7% | -3.7% |
> | TCP 4-vCPU_p4 | -2.9% | -1.4% |
> | UDP 4-vCPU_p4 | -0.6% | 0.0% |
> | TCP 4-vCPU_p4 | 3.5% | 0.0% |
>
> | iPerf bridge-net | IBPB | BHB Clear |
> |------------------|---------|-----------|
> | UDP 1-vCPU_p1 | -9.4% | -0.4% |
> | TCP 1-vCPU_p1 | -3.9% | -0.5% |
> | UDP 4-vCPU_p16 | -2.2% | -3.8% |
> | TCP 4-vCPU_p4 | -1.0% | -1.0% |
> | TCP 4-vCPU_p4 | 0.5% | 0.5% |
> | UDP 4-vCPU_p4 | 0.0% | 0.9% |
> | TCP 1-vCPU_p1 | 0.0% | 0.9% |
>
> | iPerf vhost-net | IBPB | BHB Clear |
> |-----------------|---------|-----------|
> | UDP 1-vCPU_p1 | -4.3% | 1.0% |
> | TCP 1-vCPU_p1 | -3.8% | -0.5% |
> | TCP 1-vCPU_p1 | -2.7% | -0.7% |
> | UDP 4-vCPU_p16 | -0.7% | -2.2% |
> | TCP 4-vCPU_p4 | -0.4% | 0.8% |
> | UDP 4-vCPU_p4 | 0.4% | -0.7% |
> | TCP 4-vCPU_p4 | 0.0% | 0.6% |
>
> [1] https://comsec.ethz.ch/research/microarch/vmscape-exposing-and-exploiting-incomplete-branch-predictor-isolation-in-cloud-environments/
> ---
> Pawan Gupta (10):
> x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()
> x86/bhi: Make clear_bhb_loop() effective on newer CPUs
> x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence()
> x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user
> x86/vmscape: Move mitigation selection to a switch()
> x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier()
> x86/vmscape: Use static_call() for predictor flush
> x86/vmscape: Deploy BHB clearing mitigation
> x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
> x86/vmscape: Add cmdline vmscape=on to override attack vector controls
>
> Documentation/admin-guide/hw-vuln/vmscape.rst | 15 ++++-
> Documentation/admin-guide/kernel-parameters.txt | 6 +-
> arch/x86/Kconfig | 1 +
> arch/x86/entry/entry_64.S | 34 +++++++----
> arch/x86/include/asm/cpufeatures.h | 2 +-
> arch/x86/include/asm/entry-common.h | 9 ++-
> arch/x86/include/asm/nospec-branch.h | 13 +++--
> arch/x86/include/asm/processor.h | 1 +
> arch/x86/kernel/cpu/bugs.c | 76 ++++++++++++++++++++-----
> arch/x86/kvm/x86.c | 4 +-
> arch/x86/net/bpf_jit_comp.c | 11 +---
> 11 files changed, 127 insertions(+), 45 deletions(-)
> ---
> base-commit: c369299895a591d96745d6492d4888259b004a9e
> change-id: 20250916-vmscape-bhb-d7d469977f2f
>
> Best regards,
> --
> Thanks,
> Pawan

Tested v7 of this series with 6.18.y and one of our performance
suites, where we had previously bisected a significant regression to
the enablement of the VMSCAPE mitigation. This particular suite looks
at synthetic performance using KVM-virtualized Windows guests.

Long story short, this suite tries to derive what the end-user
experience would be in these virtual machines while performing a
standardized set of synthetic tasks on real apps.

VMSCAPE hits especially hard when Windows HVCI is enabled, which drives
a much higher VMExit count, all else equal.

Tested on an Intel Xeon 6444Y (SPR)

TL;DR: we're really happy with the results. The following was measured
with Intel MBEC *enabled*, so even with that speedup (and the drastic
reduction in VMExits it brings), this optimization makes a significant
difference.

- CPU-ready time drops ~70% across all steady-state and log-on metrics
with this series, indicating more efficient context switching, even
though overall hypervisor CPU usage rises ~14% (steady) to ~12% (max).
Basically, we're getting more actual work done.
- Read/write IOPS increase by ~18-37% and ~14-20% respectively, while
average IO latency remains largely unchanged or slightly lower in the
steady-state metrics.
- Power consumption falls 5-11% in every category.
- Login times improve by 4-6% on average.
- Application start-up times are generally better (Word, Excel,
PowerPoint, Outlook); notably, Outlook's max start-up time drops 67%, a
clear win for end-user experience.

Tested-by: Jon Kohler <jon@xxxxxxxxxxx>