[RFC PATCH 00/24] KVM: SVM: Rework ASID management

From: Yosry Ahmed
Date: Wed Mar 26 2025 - 15:37:48 EST


This series reworks how SVM manages ASIDs by:
(a) Allocating a single static ASID for each L1 VM, instead of
dynamically allocating ASIDs. This simplifies the logic and allows
for more unification between SVM and SEV, as the latter already
uses per-VM ASIDs as required for other purposes.

This is done in patches 1 to 10.

(b) Using a separate ASID for L2 VMs. Instead of using the same ASID for
L1 and L2 guests, and doing a TLB flush and MMU sync on every nested
transition, a separate ASID is used and TLB flushes are done
conditionally as needed.

This is done in patches 11 to 24.
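
To make point (a) concrete, here is a minimal userspace model of a
bitmap-based allocator that hands out one static tag (ASID) per VM. It is
only a sketch: `struct tlb_tags`, `tlb_tags_init()`, `tlb_tags_alloc()` and
`tlb_tags_free()` are hypothetical names chosen for illustration, not the
kvm_tlb_tags interface from patch 1, and locking is omitted. Tag 0 is
treated as reserved and doubles as the "no tag available" return value.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TAGS 64u

/* Hypothetical allocator state; not the kvm_tlb_tags layout. */
struct tlb_tags {
	unsigned int min;	/* first usable tag, must be >= 1 */
	unsigned int max;	/* last usable tag, inclusive */
	uint64_t bitmap;	/* bit N set => tag N is in use */
};

static void tlb_tags_init(struct tlb_tags *t, unsigned int min,
			  unsigned int max)
{
	assert(min >= 1 && min <= max && max < MAX_TAGS);
	t->min = min;
	t->max = max;
	t->bitmap = 0;
}

/* Returns the lowest free tag, or 0 if the range is exhausted. */
static unsigned int tlb_tags_alloc(struct tlb_tags *t)
{
	for (unsigned int tag = t->min; tag <= t->max; tag++) {
		uint64_t bit = UINT64_C(1) << tag;

		if (!(t->bitmap & bit)) {
			t->bitmap |= bit;
			return tag;
		}
	}
	return 0;
}

/* Releases a tag so a later VM can reuse it; out-of-range is ignored. */
static void tlb_tags_free(struct tlb_tags *t, unsigned int tag)
{
	if (tag >= t->min && tag <= t->max)
		t->bitmap &= ~(UINT64_C(1) << tag);
}
```

In this model a VM would call `tlb_tags_alloc()` once at creation and
`tlb_tags_free()` once at destruction, so no allocation happens on the
vCPU run path.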

The advantages of this are:
- Simplifying the logic by dropping dynamic ASID allocations.
- Unifying some logic between SVM and SEV, as the latter already uses
per-VM ASIDs as required for other purposes.
- Enabling INVLPGB virtualization [1].
- Improving the performance of nested guests by avoiding some TLB
flushes.

The series was tested by running L2 and L3 Linux guests with some
simple workloads in them (mmap()/munmap() stress, netperf, etc). I also
ran the KVM selftests in both L0 and L1.

I believe some of the patches are in mergeable state, but this series is
still an RFC for a few reasons:
- I haven't done as much testing as I initially planned. Mainly I wanted
to test with a Windows guest running WSL to get Linux and Windows L2
VMs running side-by-side. I couldn't get it done due to some
testing infrastructure hiccups.

- The SEV changes are generally untested beyond build testing, and I
would like to get more feedback on them before moving forward. Namely,
I think there is room for further unification. SEV should probably use
the new kvm_tlb_tags infrastructure to allocate its ASIDs as well. One
way to do this is to optionally keep a bitmap of "pending" ASIDs in
kvm_tlb_tags, marking unused SEV ASIDs as "pending" until we run out
of space, then doing the necessary flushes to make them free.

- I want to get general feedback about the direction this is heading in,
and things like generalizing the ASID tracking in SEV to work for SVM,
thoughts on using an xarray for that, etc.

- Some things can/should be cleaned up, although they can be followups
too. For example, the current logic will allocate a "normal" ASID for
an SEV VM upon creation, then allocate an SEV-friendly ASID to it when
SEV is initialized. The "normal" ASID remains allocated though, and
kvm_svm->asid and kvm_svm->sev_info.asid remain different. It seems
like we should not allocate the "normal" ASID to begin with, or free
it if the VM uses SEV. However, I am not sure what the best way to do
any of this is, because I am not clear on the life cycle of an SEV VM.
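
The "pending" bitmap idea from the SEV item in the list above can be
sketched as follows. This is a userspace model under assumptions: the
names `struct tag_pool`, `tag_pool_alloc()` and `tag_pool_release()` are
hypothetical, and the `nr_flushes` counter merely stands in for whatever
flush is actually required before a released SEV ASID can be reused.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pool with a second bitmap for released-but-unflushed tags. */
struct tag_pool {
	uint64_t used;		/* tags currently owned by a VM */
	uint64_t pending;	/* tags released but not yet flushed */
	unsigned int min;	/* first usable tag, must be >= 1 */
	unsigned int max;	/* last usable tag, inclusive, < 64 */
	unsigned int nr_flushes;/* stand-in for the expensive flush */
};

/* Returns a tag that is neither used nor pending, or 0 if none exists. */
static unsigned int tag_pool_find_free(const struct tag_pool *p)
{
	for (unsigned int tag = p->min; tag <= p->max; tag++) {
		if (!((p->used | p->pending) & (UINT64_C(1) << tag)))
			return tag;
	}
	return 0;
}

/* Allocates a tag, flushing and reclaiming pending tags only if needed. */
static unsigned int tag_pool_alloc(struct tag_pool *p)
{
	unsigned int tag = tag_pool_find_free(p);

	if (!tag && p->pending) {
		/* Out of clean tags: do one flush, then reclaim them all. */
		p->nr_flushes++;
		p->pending = 0;
		tag = tag_pool_find_free(p);
	}
	if (tag)
		p->used |= UINT64_C(1) << tag;
	return tag;
}

/* Moves a tag from "used" to "pending"; it is not reusable until flushed. */
static void tag_pool_release(struct tag_pool *p, unsigned int tag)
{
	uint64_t bit;

	if (tag < p->min || tag > p->max)
		return;
	bit = UINT64_C(1) << tag;
	if (p->used & bit) {
		p->used &= ~bit;
		p->pending |= bit;
	}
}
```

The point of the design is that releasing a tag is cheap, and the cost of
the flush is paid at most once per exhaustion of the free range rather
than once per release.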

This series started as two separate series, one to optimize nested TLB
flushes by using a separate ASID for L2 VMs [2], and one to use a single
ASID per-VM [3]. However, there is enough dependency and interaction
between the two series that I think it's useful to combine them, at
least for now, so that the big picture is clear. The series can later
be split again into two or more series, or merged incrementally.

I am sending this out now to get feedback, and also to "checkpoint" my
work as I won't be picking this up again for a few months. I will remain
able to respond to discussion and reviews, although at a lower capacity.
If anyone wants to pick up this series in the meantime, partially or
fully, please feel free to do so. Just let me know so that we can
coordinate.

Rik and Tom, I CC'd you due to the previous discussion you had with Sean
about INVLPGB virtualization. I can drop you from following versions if
you'd like to avoid the noise.

Here is a brief walkthrough of the series:

Part 1: Use a single ASID per-VM
- Patch 1 generalizes the VPID allocation into a generic kvm_tlb_tags
factory to be used by SVM.
- Patches 2-3 are cleanups and/or refactoring.
- Patches 4-5 get rid of the cases where we currently allocate a new
ASID dynamically, by flushing the existing ASID instead, or by falling
back to a full flush if flushing a single ASID is not supported.
- Patches 6-9 generalize SEV's per-CPU ASID -> vCPU tracking to make it
work for SVM.
- Patch 10 finally drops the dynamic ASID allocation logic and uses a
single per-VM ASID.
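
As a rough illustration of why the per-CPU ASID -> vCPU tracking matters
once several vCPUs share one ASID, here is a small userspace model. It is
a sketch under assumptions: `struct vcpu`, `asid_to_vcpu` and
`vcpu_run_needs_flush()` are hypothetical names, and a fixed-size array
stands in for the xarray used by patch 9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS  4
#define NR_ASIDS 8

/* Hypothetical, heavily reduced vCPU state. */
struct vcpu {
	unsigned int asid;	/* the VM's single static ASID */
	int last_cpu;		/* -1 until the vCPU has run somewhere */
};

/* Per-CPU table: which vCPU last ran with each ASID on that CPU. */
static const struct vcpu *asid_to_vcpu[NR_CPUS][NR_ASIDS];

/*
 * Returns true if the ASID must be flushed before @vcpu enters the guest
 * on @cpu, and records @vcpu as the current user of the ASID there.
 */
static bool vcpu_run_needs_flush(struct vcpu *vcpu, int cpu)
{
	bool flush = false;

	assert(cpu >= 0 && cpu < NR_CPUS && vcpu->asid < NR_ASIDS);

	/* The vCPU migrated: this CPU may hold stale entries for it. */
	if (vcpu->last_cpu != cpu) {
		vcpu->last_cpu = cpu;
		flush = true;
	}

	/* Another vCPU used the same ASID here since we last ran. */
	if (asid_to_vcpu[cpu][vcpu->asid] != vcpu) {
		asid_to_vcpu[cpu][vcpu->asid] = vcpu;
		flush = true;
	}

	return flush;
}
```

In this model, two vCPUs of the same VM alternating on one physical CPU
flush on every switch, while a vCPU that keeps running on the same CPU
without interference does not flush at all.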

Part 2: Optimize nSVM TLB flushes
- Patch 11 starts by using a separate ASID for L2 guests, although
it is initially the same as the L1 ASID. It's essentially just laying
the groundwork.
- Patches 12-16 are refactoring groundwork.
- Patches 17-22 add the needed handling of the L2 ASID TLB flushing.
- Patch 23 starts allocating a new ASID for L2 as using the same ASID is
no longer needed.
- Patch 24 drops the unconditional TLB flushes on nested transitions,
which are no longer necessary now that L2 uses a separate,
properly maintained ASID.
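
The shift from unconditional to conditional flushing on nested entry can
be modeled roughly as below. This is a simplified sketch, not the logic
of nested_svm_entry_tlb_flush(): `struct nested_state` and
`nested_entry_needs_flush()` are hypothetical, and only two of the
conditions handled in Part 2 are modeled (L1 changing L2's ASID, and L1
requesting a flush through TLB_CONTROL).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vCPU nested state. */
struct nested_state {
	unsigned int last_vmcb12_asid;	/* ASID L1 gave L2 on the last entry */
	unsigned int nr_l2_flushes;	/* how often the L2 ASID was flushed */
};

/*
 * Decides whether the ASID that L0 uses for L2 must be flushed on this
 * nested entry. @vmcb12_asid is the ASID L1 programmed for L2, and
 * @l1_requested_flush models a flush request made by L1.
 */
static bool nested_entry_needs_flush(struct nested_state *n,
				     unsigned int vmcb12_asid,
				     bool l1_requested_flush)
{
	bool flush = l1_requested_flush;

	/* L1 switched L2 to a different ASID: old translations are stale. */
	if (vmcb12_asid != n->last_vmcb12_asid) {
		n->last_vmcb12_asid = vmcb12_asid;
		flush = true;
	}

	if (flush)
		n->nr_l2_flushes++;
	return flush;
}
```

With a shared L1/L2 ASID every entry and exit would count as a flush; in
this model, repeated entries with an unchanged ASID and no request from
L1 leave the L2 translations intact.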

Diff from the initial versions of series [2] and [3]:
- Generalized the SEV tracking of ASID->vCPU to use it for SVM, to make
sure the TLB is flushed when a new vCPU with the same ASID is run on
the same physical CPU.
- Made sure kvm_hv_vcpu_purge_flush_tlb() is handled correctly by
passing in is_guest_mode to purge the correct queue when doing L1 vs
L2 TLB flushes (Maxim).
- Improved the commentary in nested_svm_entry_tlb_flush() (Maxim).
- Handle INVLPGA from the guest even when nested NPT is used (Maxim).
- Improved some commit logs.

[1] https://lore.kernel.org/all/Z8HdBg3wj8M7a4ts@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20250205182402.2147495-1-yosry.ahmed@xxxxxxxxx/
[3] https://lore.kernel.org/lkml/20250313215540.4171762-1-yosry.ahmed@xxxxxxxxx/


Yosry Ahmed (24):
KVM: VMX: Generalize VPID allocation to be vendor-neutral
KVM: SVM: Use cached local variable in init_vmcb()
KVM: SVM: Add helpers to set/clear ASID flush in VMCB
KVM: SVM: Flush everything if FLUSHBYASID is not available
KVM: SVM: Flush the ASID when running on a new CPU
KVM: SEV: Track ASID->vCPU instead of ASID->VMCB
KVM: SEV: Track ASID->vCPU on vCPU load
KVM: SEV: Drop pre_sev_run()
KVM: SEV: Generalize tracking ASID->vCPU with xarrays
KVM: SVM: Use a single ASID per VM
KVM: nSVM: Use a separate ASID for nested guests
KVM: x86: hyper-v: Pass is_guest_mode to kvm_hv_vcpu_purge_flush_tlb()
KVM: nSVM: Parameterize svm_flush_tlb_asid() by is_guest_mode
KVM: nSVM: Split nested_svm_transition_tlb_flush() into entry/exit fns
KVM: x86/mmu: rename __kvm_mmu_invalidate_addr()
KVM: x86/mmu: Allow skipping the gva flush in
kvm_mmu_invalidate_addr()
KVM: nSVM: Flush both L1 and L2 ASIDs on KVM_REQ_TLB_FLUSH
KVM: nSVM: Handle nested TLB flush requests through TLB_CONTROL
KVM: nSVM: Flush the TLB if L1 changes L2's ASID
KVM: nSVM: Do not reset TLB_CONTROL in VMCB02 on nested entry
KVM: nSVM: Service local TLB flushes before nested transitions
KVM: nSVM: Handle INVLPGA interception correctly
KVM: nSVM: Allocate a new ASID for nested guests
KVM: nSVM: Stop bombing the TLB on nested transitions

arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/include/asm/svm.h | 5 -
arch/x86/kvm/hyperv.h | 8 +-
arch/x86/kvm/mmu/mmu.c | 22 ++-
arch/x86/kvm/svm/nested.c | 68 ++++++---
arch/x86/kvm/svm/sev.c | 60 +-------
arch/x86/kvm/svm/svm.c | 257 +++++++++++++++++++++++---------
arch/x86/kvm/svm/svm.h | 43 ++++--
arch/x86/kvm/vmx/nested.c | 4 +-
arch/x86/kvm/vmx/vmx.c | 38 +----
arch/x86/kvm/vmx/vmx.h | 4 +-
arch/x86/kvm/x86.c | 60 +++++++-
arch/x86/kvm/x86.h | 13 ++
13 files changed, 378 insertions(+), 206 deletions(-)

--
2.49.0.395.g12beb8f557-goog